Real-time vision applications place stringent performance requirements on embedded systems. To meet performance requirements, embedded systems often require hardware implementations. This approach is unfavorable as hardware development can be difficult to debug and time-consuming, and can require extensive skill. This paper presents a case study of accelerating face detection, often part of a complex image...
Binary Content Addressable Memories (BCAMs), also known as associative memories, are hardware-based search engines. BCAMs employ a massively parallel exhaustive search of the entire memory space, and are capable of matching specific data within a single cycle. Networking, memory management, pattern matching, data compression, DSP, and other applications utilize CAMs as single-cycle associative search...
Overlays are emerging as useful design patterns for solving reconfigurable computing problems. Overlays consist of compiler-like tools and an architecture written in RTL, making it easier for users to quickly compile high-level languages into FPGAs. Despite a high degree of regularity and repetition present in most overlays, it takes a long time for FPGA tools to generate the configuration bit stream...
Embedded systems frequently use FPGAs to perform highly parallel data processing tasks. However, building such a system usually requires specialized hardware design skills with VHDL or Verilog. Instead, this paper presents the VectorBlox MXP Matrix Processor, an FPGA-based soft processor capable of highly parallel execution. Programmed entirely in C, the MXP is capable of executing data-parallel software...
Throughput processing involves using many different contexts or threads to solve multiple problems or subproblems in parallel, where the size of the problem is large enough that latency can be tolerated. Bandwidth is required to support multiple concurrent executions, however, and utilizing multiple external memory channels is costly. For small working sets, FPGA designers can use on-chip BRAMs to achieve...
Overclocking a CPU is a common practice among home-built PC enthusiasts where the CPU is operated at a higher frequency than its speed rating. This practice is unsafe because timing errors cannot be detected by modern CPUs and they can be practically undetectable by the end user. Using a timing speculation technique such as Razor, it is possible to detect timing errors in CPUs. To date, Razor has...
FPGAs are increasingly being used to implement many new applications, including pipelined processor designs. Designers often employ memories to communicate and pass data between these pipeline stages. However, one-cycle communication between sender and receiver is often required. To implement this read-immediately-after-write functionality, bypass registers are needed by most FPGA memory blocks. Read...
This paper presents the ZUMA open FPGA overlay architecture. It is an open-source, cross-compatible embedded FPGA architecture that is intended to overlay on top of an existing FPGA, in essence an "FPGA-on-an-FPGA." This approach has a number of benefits, including bit stream compatibility between different vendors and parts, compatibility with open FPGA tool flows, and the ability to embed...
SRAM-based Field-Programmable Gate Arrays (FPGAs) are configured from off-chip memory through a serial link. Hence, a large configuration bit stream adversely increases off-chip memory size as well as bit stream loading time. The following work proposes a novel method to reduce the number of programming bits required for look-up tables (LUT), thereby reducing overall configuration bit stream size...
As each generation of FPGAs grows in size, the run time of the associated CAD tools is rapidly increasing. Many past efforts have aimed at improving the CAD run time through parallelization of the placement algorithm. Wang and Lemieux presented an algorithm that is scalable, deterministic, timing-driven and achieves speedup over VPR [Wang and Lemieux FPGA'11]. This paper provides two significant alterations...