Many scientific and engineering applications perform reductions over sets of sequential data streams. If the core operator of the reduction is deeply pipelined, dependencies between the input data elements cause data hazards in the pipeline. To tackle this problem, in this paper we propose a multiple-set, variable-length reduction design with low latency and high pipeline utilization. We prove the buffer...
Many scientific applications involve reduction or accumulation operations on sequential data streams. Examples such as matrix-vector multiplication include multiple inner product operations on different data sets. If the core operator of the reduction is deeply pipelined, which is usually the case, dependencies between the input data cause data hazards in the pipeline and call for a careful design....
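The hazard described above can be illustrated with a small software model. This is only a sketch of one common workaround (interleaving independent partial sums), not the design proposed in either paper; the function name `interleaved_sum` and the `latency` parameter are hypothetical and introduced here for illustration. With an adder of `latency` pipeline stages, a single running sum would have to wait `latency` cycles between additions, whereas `latency` independent partial sums never depend on a result that is still in flight.

```python
def interleaved_sum(values, latency):
    """Model a reduction over one data set with a pipelined adder.

    `latency` is the assumed number of pipeline stages of the adder.
    Element k is accumulated into partial sum k % latency, so two
    consecutive additions never touch the same accumulator and no
    read-after-write hazard arises between them.
    """
    if latency < 1:
        raise ValueError("latency must be at least 1")
    partial = [0] * latency
    for k, v in enumerate(values):
        partial[k % latency] += v
    # The partial sums still have to be combined at the end; in hardware
    # this final step is where the extra latency and buffering come from.
    return sum(partial)
```

For example, `interleaved_sum([1, 2, 3, 4, 5, 6, 7], 3)` accumulates into three partial sums (12, 7 and 9) before combining them. The model ignores timing entirely; it only shows how the additions are regrouped.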
High-end parallel and multicore processors rely on compilers to perform the necessary optimizations and exploit concurrency in order to achieve higher performance. However, the source code for high-performance computers is extremely complex to analyze and optimize. In particular, program analysis techniques often do not take into account complex expressions during the data dependence analysis phase...
Accurate data dependence testing allows a compiler to perform safe automatic code optimization and parallelization. It has been shown that factors such as loop variants and nonlinear expressions limit the opportunities for data dependence testing and parallelization. Recently, the NLVI-Test has been introduced as a new technology to enable exact data dependence testing on nonlinear expressions. Apart from...
Matrix decomposition applications that involve large matrix operations can take advantage of the flexibility and adaptability of reconfigurable computing systems to improve performance. The benefits come from replication, which includes vertical replication and horizontal replication. If viewed on a space-time chart, vertical replication allows multiple computations to be executed in parallel, and horizontal...
QR decomposition, especially by means of the Householder transformation, is often used to solve least squares problems. A matrix to be decomposed with this method is usually very large, often so large that it does not fit into the main memory of a workstation, let alone the internal memory of a current FPGA. Efficient out-of-core algorithms have been developed to address the factorization...
Parallelizing compilers rely upon subscript analysis to detect data dependences between pairs of array references inside loop nests. The most widely used approximate subscript analysis tests are the GCD test and the Banerjee test. In an earlier work we proposed the I test, an improved subscript analysis test. The I test extends the accuracy of a combination of the GCD test and the Banerjee test. It...
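As a minimal sketch of the GCD test mentioned above (the textbook form only, not the I test or the Banerjee test): a dependence between two array references with linear subscripts requires an integer solution of a1*x1 + ... + an*xn = c, and such a solution exists only if gcd(a1, ..., an) divides c. The function name `gcd_test` is hypothetical, and the sketch ignores loop bounds, which is why a positive answer only means that a dependence cannot be ruled out.

```python
from functools import reduce
from math import gcd


def gcd_test(coeffs, constant):
    """Return False if the GCD test proves independence, True otherwise.

    `coeffs` are the integer coefficients a1..an of the loop index
    variables in the dependence equation a1*x1 + ... + an*xn = constant.
    Loop bounds are ignored, so True means "dependence possible",
    not "dependence proven".
    """
    g = reduce(gcd, (abs(a) for a in coeffs), 0)
    if g == 0:
        # All coefficients are zero: the equation reduces to 0 = constant.
        return constant == 0
    return constant % g == 0
```

For instance, a write to `A[2*i]` and a read of `A[2*j + 1]` give the equation 2i - 2j = 1; `gcd_test([2, -2], 1)` returns `False`, since 2 does not divide 1, so the two references can never touch the same element. For `A[2*i]` and `A[4*j + 2]` the equation is 2i - 4j = 2, and `gcd_test([2, -4], 2)` returns `True`, so the test alone cannot exclude a dependence.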