The testing, verification and evaluation of wireless systems is an important but challenging endeavor. The most realistic method to test a wireless system is a field deployment. Unfortunately, this is not only expensive but also time-consuming. In this paper, we present the design and implementation of a digital wireless channel emulator, which connects directly to a number of radios, and mimics the...
This paper presents a low-power MMSE MIMO detector using a dynamic voltage wordlength scaling (DVWS) technique for 4×4 MIMO-OFDM systems. A MIMO MMSE detector requires high-performance computing for real-time processing. A pipelined MMSE detector using Strassen's algorithm has been developed in our previous work. However, it consumes significant power and thus, we have to consider a low-power solution...
Circuit degradation due to bias temperature instability (BTI) can lead to timing failures in digital circuits. We develop variable latency unit (VLU) based BTI-aware designs, with a novel scheme for multioutput hold logic implementation for VLUs. A key observation is the identification and exploitation of specific supersetting patterns in the two-dimensional space of frequency and aging of the circuit...
Operating in the near-threshold regime can result in significant energy savings. Unfortunately, the increased timing variation prevents conventional error-detection techniques from properly functioning. This paper introduces two circuit-level timing error detection techniques that aim to increase throughput while operating in the near-threshold voltage regime: current-sensing completion detection...
Communication wire delay between multiple blocks is becoming a critical issue in System on Chip (SoC) design. Scheduling-based Latency-Insensitive Design (LID) is a method to alleviate wire delays by utilizing a central scheduling scheme for periodic clock gating of the blocks. The scheduling scheme resides in shift registers as sequences of ‘1’ and ‘0’ bits. In many systems, these sequences are too...
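As a rough illustration of the idea of a schedule stored in a shift register, the sketch below rotates a bit pattern once per clock cycle and uses the bit at the head as the clock-enable for a block. The function name `run_schedule` and the example pattern are assumptions made for illustration; the abstract does not give the paper's actual scheduler.

```python
from collections import deque


def run_schedule(pattern, cycles):
    """Return the clock-enable bit seen on each cycle.

    `pattern` is a string of '1'/'0' bits held in a circular shift
    register (hypothetical example, not the paper's design).
    A '1' means the block is clocked this cycle, a '0' means it is gated.
    """
    reg = deque(int(bit) for bit in pattern)
    enables = []
    for _ in range(cycles):
        enables.append(reg[0])  # head bit drives the clock gate
        reg.rotate(-1)          # shift by one position, wrapping around
    return enables


if __name__ == "__main__":
    # The schedule "110" repeats with a period of three cycles.
    print(run_schedule("110", 6))
```

The register length equals the schedule period, which is why long sequences become costly to store.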
Hardware implementations of Internet Protocol (IP) classification algorithms have been proposed by the research community over the years to realize high-speed routers and the Internet backbone. Decomposition-based IP classification algorithms are desirable for hardware implementation due to their parallel search on multiple fields. These algorithms consist of two phases: independent searches on each packet...
Demands for high performance are growing rapidly, and multiple processor cores and huge caches are required to meet them. 3D integration offers a promising way to meet these demands by integrating numerous cores and cache layers in a single chip. Temperature, however, becomes a problem in 3D integration due to increased power density. A methodology to exploit maximum performance while...
Voltage and frequency scaling (VFS) for NoC can potentially reduce energy consumption, but the associated increase in latency and degradation in throughput limit its deployment. We propose flexible-pipeline routers that reconfigure pipeline stages upon VFS, so that latency through such routers remains constant. With minimal hardware overhead, the deployment of such routers allows us to reduce network...
High-performance implementations of 2D digital filters are highly desired in many applications for real-time processing. In this paper, a multiprocessor realization of a 2D denominator separable digital filter is implemented on an Altera Stratix III FPGA. The implementation achieves a data throughput equivalent to one multiplication and two additions, plus one clock cycle. It has been found that the maximum...
As integrated circuits exhibit variable delays under different data inputs, an average-case design methodology is expected to enhance circuit performance. This paper presents a novel approach using time-domain multistage speculation to realize a variable-latency circuit, in which speculation points with double-sampling and check-recovery units are inserted into...
We examine the energy consumption of a digital circuit with voltage scaling and observe its impact on the energy efficiency of the battery. We study the system with a power source under throughput constraints and propose a method to find the right battery size to satisfy given system requirements. For systems with a limit on battery weight or volume, we suggest a right circuit voltage operating...
This paper describes an approach to pipelining in high-level synthesis that modifies the control/data flow graph before and after scheduling. This enables the direct re-use of a pre-existing, timing- and area-aware non-pipelined simultaneous scheduler and binder. Such an approach ensures that the RTL output can be synthesized within the given timing and area constraints. Results from real industrial...
In this tutorial, a 45nm resilient microprocessor core with error-detection and recovery circuits demonstrates the opportunity for improving performance and energy efficiency by mitigating the impact of dynamic parameter variations. The design methodology describes the additional steps beyond a standard design flow for integrating error-detection and recovery circuits into a microprocessor core. Silicon...
Unstable or unpredictable LSI operation caused by PVT (Process, Voltage, Temperature) variations, along with aging effects such as NBTI/PBTI, is one of the serious issues in current and future scaled LSIs. In these situations, where operation environments in the field are hard to predict at the stages of circuit design and test, the conventional approach of the margin-based design and test in...
A common approach to protect confidential information is to use a stream cipher which combines plain text bits with a pseudo-random bit sequence. Among the existing stream ciphers, Non-Linear Feedback Shift Register (NLFSR)-based ones provide the best trade-off between cryptographic security and hardware efficiency. In this paper, we show how to further improve the hardware efficiency of the Grain...
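The first sentence above, combining plaintext bits with a pseudo-random bit sequence, can be sketched as a bitwise XOR with a keystream. The generator below is a toy linear feedback shift register chosen only for brevity; it is an assumption for illustration, it is not Grain, it is not an NLFSR, and it is not secure.

```python
def lfsr_keystream(state, taps, nbits, width=16):
    """Produce `nbits` keystream bits from a toy Fibonacci LFSR.

    `state`, `taps` and `width` are illustrative parameters, not taken
    from the paper. The state must be non-zero.
    """
    bits = []
    for _ in range(nbits):
        bits.append(state & 1)                  # output the low bit
        feedback = 0
        for tap in taps:
            feedback ^= (state >> tap) & 1      # XOR of the tapped bits
        state = (state >> 1) | (feedback << (width - 1))
    return bits


def xor_bits(data_bits, key_bits):
    """Combine two bit lists position by position with XOR."""
    return [d ^ k for d, k in zip(data_bits, key_bits)]


if __name__ == "__main__":
    plaintext = [1, 0, 1, 1, 0, 0, 1, 0]
    keystream = lfsr_keystream(0xACE1, (0, 2, 3, 5), len(plaintext))
    ciphertext = xor_bits(plaintext, keystream)
    # XOR with the same keystream recovers the plaintext.
    print(xor_bits(ciphertext, keystream) == plaintext)
```

Because XOR is its own inverse, encryption and decryption are the same operation, so the hardware cost is dominated by the keystream generator.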
This paper describes a high-speed low-power subranging Flash ADC designed in a 90nm mixed-mode CMOS process. The maximum speed of a subranging ADC is limited by the time taken for the fine-ADC reference to settle. The proposed method optimally splits the total time taken for the coarse-ADC and fine-ADC comparisons to achieve the maximum possible clock speed. An auxiliary track-and-hold has been used in...
The mapping of high-level applications onto coarse-grained reconfigurable architectures (CGRAs) is usually performed manually using graphical tools; when automatic compilation is used, some restrictions are imposed on the high-level code. Since high-level applications do not contain parallelism explicitly, mapping the application directly to a CGRA is very difficult. In this paper, we present...
This paper presents a new methodology of multiplierless implementation of inner-product computation. The inner-product computation is decomposed to form an architecture that facilitates an efficient serial accumulation of the 1's in the partial product matrix of each multiplication of a pair of elements from the input vectors. The 1's that appear at each partial product position are accumulated by...
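The arithmetic behind accumulating the 1's of the partial product matrix can be checked with a small sketch. For unsigned operands, bit p of one element and bit q of the other contribute a 1 at position p+q; counting the 1's at each position over all element pairs and weighting each count by its power of two gives the inner product without any multiplication. The function name and the fixed `width` are assumptions for illustration, and the truncated abstract does not show how the paper's hardware performs the accumulation.

```python
def inner_product_by_bit_counts(a, b, width=8):
    """Inner product of unsigned integer vectors using only bit tests,
    counting and shifts (illustrative sketch, operands < 2**width)."""
    counts = [0] * (2 * width - 1)      # one counter per partial product position
    for x, y in zip(a, b):
        for p in range(width):
            if not (x >> p) & 1:
                continue
            for q in range(width):
                if (y >> q) & 1:
                    counts[p + q] += 1  # a 1 in the partial product matrix
    # Weight each count by 2**position and sum.
    return sum(count << pos for pos, count in enumerate(counts))


if __name__ == "__main__":
    a, b = [3, 5, 7], [2, 4, 6]
    print(inner_product_by_bit_counts(a, b) == sum(x * y for x, y in zip(a, b)))
```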
Synchronous elastic circuits help synchronous designs tolerate computation or communication latencies, in a way similar to the asynchronous design style. The datapath is made elastic by turning registers into elastic buffers and adding a control layer that uses handshake signals and join/fork controllers. Join elements are the target of two improvements discussed in this paper. The first one is...
A 45 nm 1.3 GHz microprocessor core employs error-detection circuits, tunable replica circuits, and error-recovery circuits, to mitigate dynamic variation guardbands for maximum throughput. An adaptive clock controller adjusts the frequency based on error statistics to optimize efficiency. Silicon measurements show resilient operation as well as throughput gains of 12 to 16% at 1.0 V and 22 to 23%...