Due to the fall in the price of multicore processors, today's non-dedicated clusters tend to include this kind of hardware in their configurations. How general-purpose Operating System (OS) schedulers will support requirements such as the coexistence of soft real-time, best-effort, and interactive applications is an open question that needs to be addressed carefully. For these reasons, new user interfaces,...
The shift to multicore processors demands efficient parallel programming on a diversity of architectures, including homogeneous and heterogeneous chip multiprocessors (CMPs). Task parallel programming is one approach that maps well to CMPs. In this model, the programmer focuses on identifying parallel tasks within an application, while a runtime system takes care of managing, scheduling, and balancing...
The possibility of connecting several nodes in a network of processors has popularized parallel programming in the scientific community, but its use has been limited by the difficulty of message-passing programming. With the arrival of multicore processors, parallel programming has regained popularity. The use of an OpenMP compiler optimized for the multicore system in question is a good option, but...
Putting performance-asymmetric cores inside the same processor can be a good alternative for obtaining high performance per area, throughput, and single-threaded performance. However, the impact of running parallel applications on this type of machine is not clear, since most previous work has focused on multi-programmed and server workloads, where there is little or no dependence between threads. In this work,...
In recent years, high-performance processor designs have evolved toward Chip-Multiprocessor (CMP) architectures that implement multiple processing cores on a single die. As the number of cores inside a CMP increases, the on-chip interconnection network will have a significant impact on both overall performance and power consumption, as previous studies have shown. On the other hand, CMP designs are...
Multicore processor architectures provide huge computation power by leveraging multiple levels of parallelism. However, it is non-trivial to orchestrate the allocation of computational and memory resources on a multicore platform. In this paper, we model resource allocation for multicores as an optimization space, including variant selection, grouping and PE assignment. Finding efficient parallelization...
This work presents a study undertaken to characterise the behaviour of some parallelisation techniques for irregular codes, previously developed for SMP architectures, on a multi-node SMP NUMA system. The main objective is to determine the performance effect of bus contention and cache coherency in such a complex architecture. Results show that: (1) cores which share a socket can be considered as...
Multicore nodes have become ubiquitous in just a few years. At the same time, writing portable parallel software for multicore nodes is extremely challenging. Widely available programming models such as OpenMP and Pthreads are not useful for devices such as graphics cards, and more flexible programming models such as RapidMind are only available commercially. OpenCL represents the first truly portable...
For a high-performance parallel implementation of many scientific algorithms, efficient realizations of combining communication patterns such as reduce or all-reduce are important. On the Cell Broadband Engine in particular, a low-latency realization of such operations is not obvious. In this paper, several algorithms for implementing reductions are discussed and efficient implementations on the Cell are...
This paper proposes a strategy to organize metric-space query processing in multi-core search nodes as understood in the context of search engines running on clusters of computers. The strategy is applied in each search node to process all active queries visiting the node as part of their solution which, in general, for each query is computed from the contribution of each search node. When query traffic...