Cyber-Physical Systems (CPS) are tight integrations of the computational and physical worlds for various kinds of applications. For example, a humanoid robot, which is a typical application of CPS, requires timing constraints, low-latency execution, and parallel processing to achieve fine-grained real-time execution. Therefore low-latency parallel real-time computing is an important factor for CPS...
Heterogeneous computing is a promising approach to tackle the thermal, power and energy constraints posed by modern desktop and embedded computing systems. Moreover, by also allowing the migration of application threads to the most appropriate cores, significant performance gains and energy efficiency levels can be attained. Nevertheless, the considerably large overheads usually imposed by software-based...
We use a functional framework designed for parallel programming with linear algebra applications to leverage the computing power of heterogeneous hardware. Our work is performed in the context of the pure functional programming language Haskell. The framework allows the manipulation of arbitrary representations for matrices and the definition of multiple implementations of BLAS operations based on...
To migrate complex sequential code to multicore, profiling is often used on sequential executions to find opportunities for parallelization. In non-scientific code, the potential parallelism often resides in while-loops rather than for-loops. The do-all model used in the past by many studies cannot detect this type of parallelism. A new, task-based model has been used by a number of recent studies...
The Blue Gene/Q machine is the next generation in the line of IBM massively parallel supercomputers, designed to scale to 262144 nodes and sixteen million threads. With each BG/Q node having 68 hardware threads, hybrid programming paradigms, which use message passing among nodes and multi-threading within nodes, are ideal and will enable applications to achieve high throughput on BG/Q. With such unprecedented...
Queues are commonly used in multithreaded programs for synchronization and communication. However, because software queues tend to be too expensive to support fine-grained parallelism, hardware queues have been proposed to reduce the overhead of communication between cores. Hardware queues require modifications to the processor core and need a custom interconnect. They also pose difficulties for the operating...
Summary form only given. The dynamic reconfiguration of hardware means changing the hardware while the system is operating. Its benefit is the adaptation to different computing requirements. For instance, an improved use of communication networks can be achieved: many networks have the characteristic that connections between specific communication partners show a smaller latency than others...
In conventional static implementations for correlated streaming applications, computing resources may be inefficiently utilized since multiple stream processors may supply their sub-results at asynchronous rates for result correlation or synchronization. To enhance the resource utilization efficiency, we analyze multi-streaming models and implement an adaptive architecture based on FPGA Partial Reconfiguration...
One benefit of coarse-grained dynamically reconfigurable processor arrays (DRPAs) is their low dynamic power consumption, achieved by operating a number of processing elements (PEs) in parallel with a low-frequency clock. However, in future advanced processes, leakage power will account for a considerable part of the total power consumption, and it may degrade the advantage of DRPAs. In order to reduce the...
The architecture of modern computing systems is becoming more and more parallel, in order to exploit more of the parallelism offered by applications and to increase the system's overall performance. This includes multiple cores per processor module, multi-threading techniques and the resurgence of interest in virtual machines. In spite of this amount of parallelism the network interface is typically...
A system for controlling smart sensor networks is described. The system, called the adaptive context information processing language (ACIPL), will allow explicit use of states of context inferred from sensor readings and algorithmic output for distributed control of data fusion in sensor networks. The detailed description of the language including its use for sensor information separation into...
Most recent MPP systems employ a fast microprocessor surrounded by a shell of communication and synchronization logic. The CRAY-T3D provides an elaborate shell to support global-memory access, prefetch, atomic operations, barriers, and block transfers. We provide a detailed empirical performance characterization of these primitives using micro-benchmarks and evaluate their utility in compiling for...