The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Employing an on-chip network in a manycore system (to improve scalability) makes the latencies of data accesses issued by a core non-uniform, which significant impact application performance. This paper presents a compiler strategy which involves exposing architecture information to the compiler to enable optimized computation-to-core mapping. Our scheme takes into account the relative positions of...
Observing that large multithreaded applications with irregular data access patterns exhibit very low memory bank-level parallelism (BLP) during their execution, we propose a novel loop iteration scheduling strategy built upon the inspector-executor paradigm. A unique characteristic of this strategy is that it considers both bank-level parallelism (from an inter-core perspective) and bank reuse (from...
The memory performance of data mining applications became crucial due to increasing dataset sizes and multi-level cache hierarchies. Decision tree learning is one of the most important algorithms in this field, and numerous researchers worked on improving the accuracy of model tree as well as enhancing the overall performance of the learning process. Most modern applications that employ decision tree...
The energy concerns of many-core processors are increasing with the number of cores. We provide a new method that reduces energy consumption of an application on many-core processors by identifying unique segments to apply dynamic voltage and frequency scaling (DVFS). Our method, phase-based voltage and frequency scaling (PVFS), hinges on the identification of phases, i.e., Segments of code with unique...
Targeting network-on-chip based manycores, we propose a novel compiler framework to optimize the network latencies experienced by off-chip data accesses in reaching the target memory controllers. Our framework consists of two main components: data access placement and computation placement. In the data access placement, we separate the data access nodes from the computation nodes, with the goal of...
The challenges of the Power Wall manifest in mobile and embedded processors due to their inherent thermal and formfactor constraints. The power dissipated over a fixed area, namely, the power density, directly affects acceptable core temperatures even for low-power devices. In this paper, we examine techniques to counter this power density increase with device and microarchitecture-level heterogeneity...
Both on-chip resource contention and off-chip latencies have a significant impact on memory requests in large-scale chip multiprocessors. We propose a memory-side prefetcher, which brings data on-chip from DRAM, but does not proactively further push this data to the cores/caches. Sitting close to memory, it avails close knowledge of DRAM state and memory channels to leverage DRAM row buffer locality...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.