The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Error detection and correction based on double-sampling is used as common technique to handle timing errors while scaling Vdd for energy efficiency. An additional sampling element is inserted in the critical paths of the design, to double sample the outputs of those logic paths at different time instances that may fail while scaling the supply voltage or the clock frequency of the design. However,...
We have developed Representation and Implementation of Temporal Event Sequences (RITES), a domain specific programming language which enables users to precisely specify how software and firmware execute with respect to real time on embedded systems hardware. Precise timing of software and firmware operations is critical to many functions within a space system, including coordination of hardware components,...
FPGAs are increasingly being used to implement many new applications, including pipelined processor designs. Designers often employ memories to communicate and pass data between these pipeline stages. However, one-cycle communication between sender and receiver is often required. To implement this read-immediately-after-write functionality, bypass registers are needed by most FPGA memory blocks. Read...
With the over-provisioned routing resource on FPGA, the topology choice for NoC implementation on FPGA is more flexible than on ASIC. However, it is well understood that the global wire routing impacts the performance of NoC on FPGA because the topology is routed by using fixed routing fabric. An important question that arises is: will the benefit of diameter reduction by using a highly connective...
This paper presents a reliable processor pipeline architecture resilient to multiple soft- and timing errors. It also presents a probabilistic quantification of its performance overheads. This reliable processor pipeline architecture has been implemented in the Leon3 VHDL open source processor. An FPGA prototype running under random fault injection has also been developed. This reliable processor...
Modern processor architectures sacrifice timing predictability to improve average performance. Branch prediction, out-of-order execution, and multi-level cache hierarchies complicate accurate execution time estimates. The timing demands of Cyber Physical Systems (CPS) have led some to propose new processor architectures, including Precision Timed (PRET) processors, which simplify analysis of execution...
This paper introduces a novel synchronous to asynchronous logic conversion tool targeted specifically for a synchronous field programmable gate array (FPGA). This tool augments the synchronous FPGA design flow and removes the clock network to implement an asynchronous control network in its place. We evaluate the timing performance benefits of the methods used to implement the asynchronous control...
Flexible Application Specific Instruction set Processors (ASIP) are starting to replace monolithic ASICs in a wide variety of fields. However the construction of an ASIP is today associated with a substantial design effort. NoGap (Novel Generator of Micro Architecture and Processor) is a tool for ASIP designs, utilizing hardware multiplexed data paths. One of the main advantages of NoGap compared...
We present RAMP Gold, an economical FPGA-based architecture simulator that allows rapid early design-space exploration of manycore systems. The RAMP Gold prototype is a high-throughput, cycle-accurate full-system simulator that runs on a single Xilinx Virtex-5 FPGA board, and which simulates a 64-core shared-memory target machine capable of booting real operating systems. To improve FPGA implementation...
Detailed architectural simulators suffer from a long development cycle and extremely long evaluation times. This longstanding problem is further exacerbated in the multi-core processor era. Existing solutions address the simulation problem by either sampling the simulated instruction stream or by mapping the simulation models on FPGAs; these approaches achieve substantial simulation speedups while...
Static and dynamic variations, which have negative impact on the reliability of microelectronic systems, increase with smaller CMOS technology. Thus, further downscaling is only profitable if the costs in terms of area, energy and delay for reliability keep within limits. Therefore, the traditional worst case design methodology will become infeasible. Future architectures have to be error resilient,...
The accumulation operation Anew = Aold + X is required for many numerical methods. However, when using a floating-point adder with pipeline latency ??, the data hazard that exists between Anew and Aold creates design challenges for situations where inputs must be delivered to the accumulator at a rate exceeding 1/??. Each of the techniques proposed to address this problem requires either static data...
This paper presents a new method to implement the butterfly unit of FFT processor, based on the coordinate transformation and CORDIC algorithm. The butterfly unit uses less resource and reaches higher speed, also can be configured by parameter and satisfied by the real time operations. Finally, it can work on the frequency of 50 MHz after the design is downloaded to a chip EP2C20F484C7.
The hardware implementation of a high speed floating point multiplier with pipeline architecture based on FPGA is presented in the paper. In the design of the floating point multiplier, the utilization of a new radix-4 booth's encoding algorithm, the improved 4:2 compression structure and summation circuit is made to implement the compression of the partial products, and the sum and carry vectors...
Modern processors are becoming more complex and as features and application size increase, their evaluation is becoming more time-consuming. To date, design space exploration relies on extensive use of software simulation that when highly accurate is slow. In this paper we propose ReSim, a parameterizable ILP processor simulation acceleration engine based on reconfigurable hardware. We describe ReSim's...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.