Despite being employed in burgeoning efforts to improve power delivery efficiency, integrated voltage regulators (IVRs) have yet to be evaluated in a rigorous, systematic, or quantitative manner. To fulfill this need, we present Ivory, a high-level design space exploration tool capable of providing accurate conversion efficiency, static performance characteristics, and dynamic transient responses...
Recent research has demonstrated promising results in solving constraint satisfaction problems (CSPs) using the D-Wave quantum annealer. However, embedding a CSP suffers from drawbacks such as long embedding time and poor quality due to long chains that reduce the ground state probability. To address these issues, we propose an effective embedding technique that reduces the embedding time...
The Pauli frame mechanism allows Pauli gates to be tracked in classical electronics and can relax the timing constraints for error syndrome measurement and error decoding. When building a quantum computer, such a mechanism may be beneficial, and the goal of this paper is not only to study the working principles of a Pauli frame but also to quantify its potential effect on the logical error rate. To...
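As a rough illustration of the idea of tracking Pauli gates classically, the following minimal sketch keeps one X bit and one Z bit per qubit and updates them in software instead of applying Pauli gates to the hardware. The class name `PauliFrame` and its methods are hypothetical and are not taken from the paper; global phases are ignored, and only H and CNOT are handled as Clifford gates.

```python
class PauliFrame:
    """Hypothetical classical record of pending Pauli corrections per qubit."""

    def __init__(self, num_qubits):
        self.x = [0] * num_qubits  # 1 means a pending X on that qubit
        self.z = [0] * num_qubits  # 1 means a pending Z on that qubit

    def apply_pauli(self, gate, q):
        # Pauli gates are only recorded, never sent to the quantum hardware.
        if gate in ("X", "Y"):
            self.x[q] ^= 1
        if gate in ("Z", "Y"):
            self.z[q] ^= 1

    def apply_h(self, q):
        # A Hadamard exchanges the roles of X and Z in the record.
        self.x[q], self.z[q] = self.z[q], self.x[q]

    def apply_cnot(self, control, target):
        # X propagates from control to target, Z from target to control.
        self.x[target] ^= self.x[control]
        self.z[control] ^= self.z[target]

    def correct_measurement(self, q, raw_bit):
        # A pending X flips the outcome of a Z-basis measurement.
        return raw_bit ^ self.x[q]


frame = PauliFrame(2)
frame.apply_pauli("X", 0)   # tracked in software only
frame.apply_cnot(0, 1)      # the CNOT itself still runs on hardware
corrected = [frame.correct_measurement(q, 0) for q in range(2)]
```

In this example the hardware starts in |00> and only executes the CNOT, so both raw measurement bits are 0; the frame then maps them to the outcome the intended circuit (X on qubit 0 followed by CNOT) would have produced.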
A wide variety of error-tolerant applications support the use of approximate circuits that achieve power savings by introducing small errors. This paper proposes a fast and novel algorithm for the design of such circuits with the goal of maximizing power savings, constrained by a fixed error budget, through an analytical expression to optimally select the number of bits to be approximated. This algorithm...
Timing resilient design has shown significant promise in mitigating the excess margins associated with rare worst-case data and increased process, voltage, and temperature (PVT) variations. However, resilient circuits need error detecting sequential logic (EDL) to detect timing errors, which adds area and power overhead. This article proposes a new network-simplex-based retiming method for two-phase...
Asynchronous methodologies are gaining ground in modern integrated circuit design. Cycle-time analysis of asynchronous designs is nontrivial and crucial to circuit optimization. Among prior methods, linear programming-based analysis (LPA) and static performance analysis (SPA) are two representatives, offering high accuracy (but low efficiency) and high efficiency (but low accuracy), respectively. However,...
Long Short-Term Memory (LSTM) based Recurrent Neural Networks (RNNs) are promising for cognitive intelligence applications such as speech recognition, image captioning, and natural language processing. However, the cascade-dependent structure of RNNs, with a huge number of power-inefficient operations such as multiplication, memory access, and nonlinear transformation, cannot guarantee high computing speed...
Ever-increasing design complexity is driving the need for fast and accurate macro-modeling algorithms to accelerate the hierarchical timing. We introduce LibAbs, an effective macro-modeling algorithm that efficiently supports high accuracy, high compression rate, and multi-threading. LibAbs applies tree-based graph reduction techniques to reduce the model size with comparable accuracy values to...
Training convolutional neural networks (CNNs) places high demands on computation and memory. This paper presents the design of a frequency-domain accelerator for energy-efficient CNN training. With Fourier representations of parameters, we replace convolutions with simpler pointwise multiplications. To eliminate the Fourier transforms at every layer, we train the network entirely in the frequency...
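The replacement of convolutions by pointwise multiplications rests on the convolution theorem. The sketch below checks it for a small one-dimensional circular convolution using a naive discrete Fourier transform from the Python standard library. It is an illustration only: the function names and the sample data are assumptions, and it does not model the accelerator or the training scheme described in the paper.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse is scaled by 1/n)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
            for t in range(n)
        )
        out.append(total / n if inverse else total)
    return out


def circular_conv_direct(signal, kernel):
    """Circular convolution computed directly in the signal domain."""
    n = len(signal)
    return [
        sum(signal[(i - j) % n] * kernel[j] for j in range(n))
        for i in range(n)
    ]


def circular_conv_fourier(signal, kernel):
    """Same result via pointwise multiplication of the two spectra."""
    product = [a * b for a, b in zip(dft(signal), dft(kernel))]
    return [v.real for v in dft(product, inverse=True)]


signal = [1.0, 2.0, 3.0, 4.0]
kernel = [0.5, 0.25, 0.0, 0.0]  # zero-padded to the signal length
direct = circular_conv_direct(signal, kernel)
fourier = circular_conv_fourier(signal, kernel)
```

Both routes agree up to floating-point rounding. The direct form costs a nested loop over signal and kernel positions, while the Fourier form needs only one multiplication per frequency bin once the spectra are available.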
There are an increasing number of neuromorphic hardware platforms designed to efficiently support neural network inference tasks. However, many applications contain structured processing in addition to classification. Being able to map both neural network classification and structured computation onto the same platform is appealing from a system design perspective. In this paper, we perform a case...
The next-generation video coding standard High Efficiency Video Coding (HEVC) provides a better compression rate for high-resolution videos, at the cost of substantially higher computational complexity. While some of the latest off-the-shelf consumer electronics support HEVC via ASIC solutions, software implementation of real-time HEVC remains an open challenge for resource-constrained embedded systems. In...
Current and future applications impose high demands on software-defined radio (SDR) platforms in terms of latency, reliability, and flexibility. This paper presents a heterogeneous SDR MPSoC with a hexagonal network-on-chip to address these issues. It features four data processing modules and a baseband processing engine for iterative multiple-input multiple-output (MIMO) receiving. Integrated memory...
Non-volatile random-access memory (NVRAM) is becoming a mainstream storage device in embedded systems due to its favorable features, such as small size, low power consumption, and short read/write latency. On NVRAM, a write operation consumes more energy and time than a read operation. However, current mobile/embedded file systems (e.g., EXT2/3 and EXT4) are very unfriendly to NVRAM devices. The reason...
Metal-oxide resistive random access memories with a single memristor device at the crosspoint (1R RRAM) are a promising alternative for next-generation storage technology due to their high density, scalability, non-volatility, and low power consumption. However, the imperfect fabrication process introduces high defect rates in the nanoscale memristor devices and leads to yield degradation. In addition,...
An ideal solution for soft error tolerance should hide the effect of soft errors from the user and provide correct results at the expected time. Software solutions are attractive because they can provide flexible reliability without imposing any hardware modifications. Our investigation of state-of-the-art error recovery techniques reveals that they suffer from poor coverage (ability to detect and correctly...
Embedded systems often use digital filtering after analog-to-digital conversion when signal conditioning is required. However, digital filters are computationally demanding, making them unsuitable for low-power microcontrollers. In this paper, we propose a fast and energy-efficient digital filtering technique based on look-up tables. The novelty in our technique is the use of multiple small LUTs,...
For more than two decades, the key objective for synthesis of linear decompressors has been maximizing encoding efficiency. For combinational decompressors, encoding satisfiability is dynamically checked for each specified care bit. By contrast, for sequential linear decompressors (e.g. PRPGs), encoding is performed for each test cube; the resultant static encoding considers that a test cube is encodable...
With FPGAs emerging as a promising accelerator for general-purpose computing, there is a strong demand to make them accessible to software developers. Recent advances in OpenCL compilers for FPGAs pave the way for synthesizing FPGA hardware from OpenCL kernel code. To enable broader adoption of this paradigm, significant challenges remain. This paper presents our efforts in developing dynamic profiling...
Quality control plays a key role in approximate computing, saving energy while guaranteeing that the quality of the computation outcome satisfies the user's requirements. Previous works proposed a hybrid architecture, composed of a classifier for error prediction and an approximate accelerator for approximate computing using well-trained neural networks. Only inputs predicted to meet the quality are executed...
On-chip communication is the bottleneck of system performance for NoC-based MPSoCs. SMART, a recently proposed NoC architecture, enables single-cycle multi-hop communications. In SMART NoCs, unconflicted messages can go through an express bypass and the communication efficiency is significantly improved, while conflicted messages have to be buffered for guaranteed delivery with extra delays. Therefore,...