Methods based on deep convolutional neural networks (CNNs) have shown outstanding performance in a wide range of applications. As neural networks grow deeper, they demand substantial computation and memory resources. Customized hardware is one option that maintains high performance at lower energy consumption than general-purpose CPUs or GPUs. When designing hardware, we need to address the problem...
Advances in semiconductor technology have led to large chip multiprocessors (CMPs) that employ a network-on-chip (NoC) to provide scalable on-chip communication. This higher integration capacity, on the other hand, increases the possibility of faults. To tackle this challenge, fault-tolerant routing in the NoC becomes essential; it allows packets to be routed around faulty network components and maintains...
In a chip multiprocessor (CMP) with shared caches, the last-level cache is distributed across all the cores. This increases the on-chip communication delay and thus affects the processor's performance. Replication can be provided in shared caches to reduce the on-chip communication delay. However, current proposals do not take into account the access characteristics of the blocks being replicated and how to make...
We propose Proximity-Aware cache Replication (PAR), an LLC replication technique that elegantly integrates an intelligent cache replication placement mechanism and a hierarchical directory-based coherence protocol into one cost-effective and scalable design. PAR dynamically allocates replicas of either shared or private data to a few predefined and fixed locations that are calculated at chip design...
In a chip multiprocessor (CMP) with shared caches, the last-level cache (LLC) is distributed across all the cores. This increases the on-chip communication delay and thus affects the processor's performance. The LLC is also quite inefficient because it holds many dead blocks. Replication can be provided in shared caches by replicating cache blocks evicted from cores to the local LLC slices to minimize...
Previous research shows that the LRU replacement policy is not efficient when applications exhibit a distant re-reference interval. The recently proposed RRIP policy improves performance for such workloads. However, RRIP lacks access recency information, which can prevent the replacement policy from making accurate predictions. Consequently, RRIP is not robust for recency-friendly workloads. This paper proposes...