The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Broadcast penalty has become an important concern for the emerging many-core architectures. In large-scale Networks-on-Chip (NoC), broadcast taking up a small portion of total traffic will dramatically congest the network. Therefore the NoC architecture of many-core processors must be carefully designed to reduce broadcast penalty. A mainstream solution is to partition the NoC into several regions,...
Single-node computation speed is essential in large-scale parallel solutions of particle transport problems. The Intel Many Integrated Core (MIC) architecture supports more than 200 hardware threads as well as 512-bit double precision float-point vector operations. In this paper, we use the native model of MIC in the parallelization of the simulation of one energy group time-independent deterministic...
We present a hardware architecture for efficient implementation of a Gaussian random number generator (GRNG), using the Monty Python method. To maximize the performance/complexity efficiency, an efficient word-length optimization model is proposed to find out both the optimal integer and fractional word-lengths for signals. Experimental results show that our optimized Fixed-Point design achieves a...
In this paper, we introduce a scalable macro-pipelined architecture to perform floating point matrix multiplication, which aims to exploit temporal parallelism and architectural scalability. We demonstrate the functionality of the hardware design with 16 processing elements (PEs) on Xilinx ML507 development board containing Virtex-5 XC5VFX70T. A 32-PE design for matrix size ranging from 32*32 to 1024*1024...
The stream architecture is a novel microprocessor architecture with wide application potential. It is critical to study how to use the stream architecture to accelerate scientific computing programs. However, existing stream processors and stream programming languages are not designed for scientific computing. To address this issue, we design and implement a 64-bit stream processor, Fei Teng 64 (FT64),...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.