Serwis Infona wykorzystuje pliki cookies (ciasteczka). Są to wartości tekstowe, zapamiętywane przez przeglądarkę na urządzeniu użytkownika. Nasz serwis ma dostęp do tych wartości oraz wykorzystuje je do zapamiętania danych dotyczących użytkownika, takich jak np. ustawienia (typu widok ekranu, wybór języka interfejsu), zapamiętanie zalogowania. Korzystanie z serwisu Infona oznacza zgodę na zapis informacji i ich wykorzystanie dla celów korzytania z serwisu. Więcej informacji można znaleźć w Polityce prywatności oraz Regulaminie serwisu. Zamknięcie tego okienka potwierdza zapoznanie się z informacją o plikach cookies, akceptację polityki prywatności i regulaminu oraz sposobu wykorzystywania plików cookies w serwisie. Możesz zmienić ustawienia obsługi cookies w swojej przeglądarce.
Due to limits in technology scaling, energy efficiency of logic devices is decreasing in successive generations. To provide continued performance improvements without increasing power, regardless of the sequential or parallel nature of the application, microarchitectural energy efficiency must improve. We propose Dynamically Specialized Datapaths to improve the energy efficiency of general purpose...
In recent years, we have witnessed a growing interest in optimizing the parallel and distributed computing solutions using scaled-out hardware designs and scalable parallel programming paradigms. This interest is driven by the fact that the microchip technology is gradually reaching its physical limitations in terms of heat dissipation and power consumption. Therefore and as an extension to Moore's...
In this paper, we propose an energy-efficient 3D-stacked CMP design by both temporally and spatially finegrained tuning of processor cores and caches. In particular, temporally fine-grained DVFS is employed by each core and L2 cache to reduce the dynamic energy consumption, while spatially fine-grained DVS is applied to the cache hierarchy for the leakage energy reduction. Our tuning technique is...
This paper presents an adaptable softcore chip multiprocessor (CMP). The processor instruction set architecture (ISA) is based on the VEX ISA. The issue-width of the processor can be adjusted at run-time (before an application starts). The processor has eight 2-issue cores that can run independently from each other. If not in use, each core can be taken to a lower power mode by gating off its source...
This paper presents a strategy to speed-up the simulation of processors having SIMD extensions using dynamic binary translation. The idea is simple: benefit from the SIMD instructions of the host processor that is running the simulation. The realization is unfortunately not easy, as the nature of all but the simplest SIMD instructions is very different from a manufacturer to an other. To solve this...
In recent years, chip multiprocessors (CMP) have emerged as a solution for high-speed computing demands. However, power dissipation in CMPs can be high if numerous cores are simultaneously active. Dynamic voltage and frequency scaling (DVFS) is widely used to reduce the active power, but its effectiveness and cost depends on the granularity at which it is applied. Per-core DVFS allows the greatest...
In this paper, we propose a technique for custom instruction (CI) extension considering process variations. It bridges the gap between the high level custom instruction extension and chip fabrication in nanotechnologies. In the proposed method, instead of using the conventional static timing analysis (STA), statistical static timing analysis (SSTA) which in turn results in a probabilistic approach...
Supply voltage fluctuation caused by inductive noises has become a critical problem in microprocessor design. A voltage emergency occurs when supply voltage variation exceeds the acceptable voltage margin, jeopardizing the microprocessor reliability. Existing techniques assume all voltage emergencies would definitely lead to incorrect program execution and prudently activate rollbacks or flushes to...
The SHA-3 competition organized by NIST has triggered significant efforts in performance evaluation of cryptographic hardware and software. These benchmarks are used to compare the implementation efficiency of competing hash candidates. However, such benchmarks test the algorithm in an ideal setting, and they ignore the effects of system integration. In this contribution, we analyze the performance...
Continuous shrinking in feature size, increasing power density etc. increase the vulnerability of microprocessors against soft errors even in terrestrial applications. The register file is one of the essential architectural components where soft errors can be very mischievous because errors may rapidly spread from there throughout the whole system. Thus, register files are recognized as one of the...
The growing popularity of mobile internet services, characterized by heavy network transmission, intensive computation and an always-on display, poses a great challenge to the battery lifetime of mobile devices. To manage the power consumption in an efficient way, it is essential to understand how the power is consumed at the system level and to be able to estimate the power consumption during runtime...
Intermittent hardware faults are bursts of errors that last from a few CPU cycles to a few seconds. Recent studies have shown that intermittent fault rates are increasing due to technology scaling and are likely to be a significant concern in future systems. We study the impact of intermittent hardware faults in programs. A simulation-based fault-injection campaign shows that the majority of the intermittent...
Reducing power consumption has become a priority in microprocessor design as more devices become mobile and as the density and speed of components lead to power dissipation issues. Power allocation strategies for individual components within a chip are being researched to determine optimal configurations to balance power and performance. Modelling and estimation tools are necessary in order to understand...
Many new designs for Decimal Floating Point (DFP) hardware units have been proposed in the last few years. To date, only the IBM POWER6 and POWER7 processors include internal units for decimal floating point processing. We have designed and tested several DFP units including an adder, multiplier, divider, square root, and fused-multiply-add compliant with the IEEE 754-2008 standard. This paper presents...
Multi-core and many-core Systems-on-Chip (SoC) are growing more complex than ever. Consequently, developing system models for such SoCs to guide and validate architectural and implementation decisions is becoming a daunting task. It consumes a huge amount of time and effort just to get the model up and running. Although these system models can be fairly abstracted, they still require the setup of...
In recent years, we have witnessed a growing interest in optimizing the high performance computing (HPC) solutions using advanced CPU and Interconnect technologies. These advances are driven by the fact that CPU manufacturers are facing extreme challenges in doubling the processors' speeds due to various reasons, such as the high temperatures and power consumptions. Therefore and as a continuation...
This paper evaluates the scalability with respect to processor cores of a three-dimensional sonar beamforming kernel implemented on a multi-core workstation. Beamforming is an example of an extremely parallelizable problem. This implementation is instrumented with OpenMP to exploit multi-core computer systems. However, when executed on a 16-core machine, this kernel scales much less than expected...
Exploiting the locality of blocks in the same set, LRU replacement strategy has deficiencies to manage L2 cache resources as the temporal locality has filtered by L1 caches. Instead, reuse replacement strategy develops the reuse characteristics of blocks in entire cache scope being more potential to improve cache resources utilization. We use reuse replacement to manage L2 cache resources in chip...
3D integration increases chip integration density, reduces wire length, wire delay and power consumption on wires. However, increased power density causes increase of temperature, which results in the increase of leakage power consumption and performance degradation. In this paper, we propose a solution to compromise these issues on stacking cache memory on a processor core. Experimental results have...
This paper proposes a compiler-based solution to the problem of inserting power gating instructions into code to control activation/deactivation (i.e., ON/OFF) of functional units in microprocessor during the code execution, so that the leakage power is maximally saved. Precisely, based on an execution profile of code containing conditional branches and/or loops, we propose a polynomial time optimal...
Podaj zakres dat dla filtrowania wyświetlonych wyników. Możesz podać datę początkową, końcową lub obie daty. Daty możesz wpisać ręcznie lub wybrać za pomocą kalendarza.