NVIDIA CUDA and ATI Stream are the two major general-purpose GPU (GPGPU) computing technologies. We implemented RankBoost, a web relevance ranking algorithm, on both the NVIDIA CUDA and ATI Stream platforms to accelerate the algorithm and illustrate the differences between the two technologies. The results show that the performance of GPU programs depends heavily on the utilization of the GPU's hardware memory...
Loops are predominant in computer programs. Data dependencies in loops dictate their execution time. This paper presents an algorithm that executes loops in parallel based on the preordering of data. The algorithm can be applied to chip multiprocessors. It performs a fair-share allocation of the loops to the available processors. Data accessed for loops accessed in a processor are...
This paper introduces a shared-memory parallel k-NN algorithm for the M-tree index structure, called SMP A-NN. Processing the pending request (PR) queue is a core operation in the traditional k-NN query algorithm, and it is also time-consuming. Therefore, we split the long queue into multiple parts and assign them to different threads. This improvement takes full advantage of SMP...
Commonly represented as directed graphs, social networks depict relationships and behaviors among social entities such as people, groups, and organizations. Social network analysis denotes a class of mathematical and statistical methods designed to study and measure social networks. Beyond sociology, social network analysis methods are being applied to other types of data in other domains such as...
Several 64-processor XMT systems have now been shipped to customers, and 128-processor, 256-processor and 512-processor systems have been tested in Cray's development lab. We describe some techniques we have used for tuning performance in the hope that applications continue to scale on these larger systems. We discuss how the programmer must work with the XMT compiler to extract maximum parallelism...
Several optimization alternatives are presented for legacy Fortran 77 scientific programs, each with a quantitative characterization of its performance gain. Initially, sequential optimization focuses on analyzing the use of Level 3 BLAS (Basic Linear Algebra Subprograms), since several performance-optimized BLAS implementations exist. Also, the Fortran 90/95 array notation is used...
Modern GPUs are massively parallel microprocessors that can deliver very high performance for the parallel computations common in science and engineering.