As the capacity of integrated circuits increases, it is becoming increasingly difficult to ensure that a chip is free of design errors. Designers are increasingly turning to FPGA prototyping platforms to validate their designs much more extensively than is possible using simulation. A key challenge is one of visibility; signals can only be observed if they can be driven to pins of a chip. To enhance...
Conference proceedings front matter may contain various advertisements, welcome messages, committee or program information, and other miscellaneous conference information. This may in some cases also include the cover art, table of contents, copyright statements, title-page or half title-pages, blank pages, venue maps or other general information relating to the conference that was part of the original...
Cluster-based logic blocks from most commercial FPGA products do not have an input bandwidth constraint, i.e., a limit on the number of signals going from routing channels into the block. We show that high-quality packing for such logic blocks can be easily achieved based on k-way partitioning. We implemented two such packing tools: PPack (routability-only) and its timing-driven version TPPack. Experimental...
We propose a neural-network-based approach for estimating the total wirelength of a digital circuit mapped onto an FPGA, before circuit placement and routing. A 3-layer MLP neural network is trained to learn the behavior of a placement tool and then quickly predicts the wirelength of a circuit design with accuracy similar to that obtained after placement. A priori knowledge about the wirelength...
In this paper, we present HCS - Heterogeneous CRAM Scrubbing - for FPGAs. By utilizing stochastic fault modeling for SEUs in CRAM, we present a quantitative estimate of system MTTF improvement through CRAM scrubbing. HCS then leverages the fact that different SEUs have unequal effects on the circuit system operation, and thus the CRAM bits can be scrubbed at different rates based on the sensitivity...
This paper presents an FPGA-based stereo vision system for future video tolling, which can achieve real-time processing for high-resolution video streams. The key component of the system is SAD (Sum of Absolute Differences) based stereo matching. Although simple and effective, this method usually requires substantial computational power to satisfy real-time requirements. We propose a Hybrid-D Box-Filtering algorithm...
This paper proposes a framework targeting the problem of task-level out-of-order (OoO) execution for heterogeneous systems. The framework consists of three layers: 1) Programming model; 2) OoO task scheduler; 3) Processing Elements. In order to uncover task-level parallelism automatically, a renaming scheme is carried over from instruction-level parallelism (ILP) to task-level parallelism (TLP). With the...
Real-time optical mapping is a technique used in cardiac disease research and treatment development to capture accurate and comprehensive electrical activity over the entire heart. It provides dense spatial electrophysiology: each pixel essentially plays the role of a probe at that location on the heart. However, the high throughput nature of the computation causes...
Modern FPGAs have the ability to place many processing elements on a single die that can access shared memory. In a multiprocessing system, mutex variables are often used to provide proper synchronization and access to memory locations shared by the processing elements. This paper introduces a novel technique to manage mutex variables in caches for FPGAs, and compares it to an off-the-shelf system...
Due to their different cost structures, the architecture of switches for an FPGA packet-switched Network-on-a-Chip (NoC) should differ from that of their ASIC counterparts. The CONNECT network recently demonstrated several ways in which packet-switched FPGA NoCs should differ from ASIC NoCs. However, its authors also concluded that pipelining was not appropriate for the FPGA switches. We show that the Split-Merge...
Reconfigurable hybrid multi-processor systems-on-chips (MPSoCs) are very powerful computing platforms. However, it has been quite challenging to schedule and map tasks to different function units of the MPSoCs, especially for tasks with inter-task dependencies. This paper introduces a parallel dataflow execution support, called ReArc, for the FPGA based reconfigurable hybrid MPSoCs. It constructs...
The popularity of GPU programming languages that explicitly express thread-level parallelism leads to the question of whether they can also be used for programming reconfigurable accelerators. This paper describes Guppy, a GPU-like softcore processor based on the in-order LEON3 core. Our long-term vision is to have a unified programming paradigm for accelerators - regardless of whether they are FPGA...
Partial Reconfiguration (PR) is an advanced technique, which improves the flexibility of FPGAs by allowing portions of a design to be reconfigured at runtime by overwriting parts of the configuration memory. PR is an important enabler for implementing adaptive systems. However, the design of such systems can be challenging, and this is especially true of the configuration controller. The generally...
Coarse Grained Reconfigurable Architectures (CGRAs) have played a key role in the area of domain-specific processors due to their programmability and runtime reconfigurability. The Coarse Grained Array (CGA) structure enables target designs to achieve high performance, but it is easy to fall into over-design in terms of area. Moreover, the network overhead between the function units (FUs) seriously...
As larger System-on-Chip (SoC) designs are attempted on Field Programmable Gate Arrays (FPGAs), the need for a low cost and high performance Network-on-Chip (NoC) grows. Virtual Channel (VC) routers provide desirable traits for an NoC such as higher throughput and deadlock prevention but at significant resource cost when implemented on an FPGA. This paper presents an FPGA specific optimization to...
With rising demands for high-performance computing and post-fabrication design flexibility, reconfigurable architectures have been drawing increasing attention. However, reconfigurability, an advantage of current Field-Programmable Gate Arrays (FPGAs), is severely limited by the small capacity of on-chip Static Random Access Memory (SRAM) for storing configuration bits. With emerging high-density...
Static power consumption is an important component of the total power consumption in FPGAs built using 90nm and smaller technology nodes. A previous study proposed powering down regions of logic blocks in an FPGA when idle to reduce the static power dissipation. This previous work did not consider powering down the switch blocks (SBs). However, the static power of SBs constitutes more than 50% of an...
Incorporating Networks-on-Chip (NoC) within FPGAs has the potential not only to improve the efficiency of the interconnect, but also to increase designer productivity and reduce compile time by raising the abstraction level of communication. By comparing NoC components on FPGAs and ASICs we quantify the efficiency gap between the two platforms and use the results to understand the design tradeoffs...
Multicore architectures, especially hardware accelerator systems with heterogeneous processing elements, are increasingly used due to the growing processing demand of modern digital systems. However, data communication in multicore architectures is one of the main performance bottlenecks. Therefore, reducing data communication overhead is an important method to improve the speed-up of such...