In this work, we present a modular software subsystem that exposes a set of APIs for supporting the automation of a set of design choices in the synthesis of a hardware accelerator by a proprietary FPGA toolchain. We model the subsystem around Vivado, Xilinx's proprietary FPGA toolchain, in order to provide a finer grained control on the toolchain's features with respect to the standard .tcl interface...
FPGA devices allow designers to implement complex digital architectures that involve hardware and software components. Because of the different features of hardware and software design, diverse mechanisms and tools have been proposed for debugging and verification of architectures implemented on FPGA devices. Bus level transactions and data processing algorithms are usually difficult to manage together...
Template matching is an important technique used in pattern recognition. It aims at finding a given pattern within a frame sequence. Pearson's Correlation Coefficient (PCC) is widely used to evaluate the similarity of two images. This coefficient is computed for each image pixel, which entails a computationally expensive process. This paper proposes an implementation of the template matching...
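As a rough illustration of the similarity measure named in the abstract above, the following minimal sketch computes Pearson's Correlation Coefficient between two equally sized pixel sequences. The function name `pcc` and the flat-list representation of image patches are assumptions made for this example and are not taken from the paper.

```python
from math import sqrt


def pcc(a, b):
    """Pearson's Correlation Coefficient of two equal-length pixel sequences.

    Hypothetical helper for illustration only: `a` and `b` are flat lists of
    pixel intensities (for example, a template and an image patch).
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Covariance and variance terms (the common 1/n factor cancels out).
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("PCC is undefined for a constant patch")
    return cov / sqrt(var_a * var_b)
```

Evaluating such a function for a patch centered on every pixel of a frame repeats these sums at each position, which gives a sense of why the abstract calls the process computationally expensive.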
Local triple modular redundancy (LTMR) is often the first choice to harden a flash-based FPGA application against soft errors in space. In this work, we compare parity-based error detection with software-based retry, and LTMR on a reference architecture regarding maximum frequency, area overhead and processing time. Our results show that our solution based on parity-based error-detection saves from...
This work proposes a reconfigurable system that performs regular expression matching through a parallel and pipelined core called ReCPU. The system can configure a set of ReCPUs on a programmable device, such as an FPGA, each one running a single instance of the regular expression matching task on the given input string. These cores work in parallel on the same string analyzing different...
The Fast Inverse Square Root algorithm has been used in 3D games of the past for lighting and reflection calculations, because it offers performance gains of up to four times. This paper presents a hardware implementation of the algorithm on an FPGA board by designing the complete architecture and successfully mapping it on Xilinx Spartan 3E after thorough functional verification. The results show that this...
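For readers unfamiliar with the algorithm named in the abstract above, here is a minimal software sketch of the widely known 32-bit variant, with the commonly cited constant `0x5F3759DF` and one Newton-Raphson refinement step. It says nothing about the paper's hardware architecture; the function name `fast_inv_sqrt` is an assumption for this example, and the input is assumed to be a positive, finite value.

```python
import struct


def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive, finite x (illustrative sketch)."""
    # Reinterpret the float32 bit pattern of x as an unsigned 32-bit integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Initial guess: subtract the halved bit pattern from the magic constant.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer back as a float32 value.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration to refine the estimate.
    return y * (1.5 - 0.5 * x * y * y)
```

The result is an approximation, typically within a fraction of a percent of the exact value after the single refinement step.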
Generation of device-unique digital signatures using Physically Unclonable Functions (PUFs) has been an active area of research for the last decade. However, most PUFs are conceived and designed as stand-alone hardware modules. In contrast, this paper proposes a PUF architecture that is tightly integrated into the core of a system-on-chip (SoC), with the purpose of creating a physical SoC authentication...
The paper presents an optimized architecture of a cascaded integrator-comb (CIC) digital filter structure. The structure is suitable for implementation in application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs). The main advantages of the architecture are higher working frequency, smaller area size and lower power consumption. Software in C++ language was written...
Real-time embedded systems like smartphones tend to comprise an ever increasing number of processing cores. For scalability and the need for guaranteed performance, the use of a connection-oriented network-on-chip (NoC) is advocated. Furthermore, a distributed shared memory architecture is preferred as it simplifies software development for a multicore system. In this paper, experimental evidence...
An academic processor to be used in the “Computer Structure” subject has been developed in this work. During the lab sessions students will apply their knowledge about digital systems to design and implement this processor so they will interact with a real implementation of the system in several ways: modifying it to increase its functionality, programming it and watching its internal state while...
Connect6 is a new generation k-in-a-row game, which has drawn great interest not only from game enthusiasts, but also from researchers, due to its characteristics such as fairness and high state-space complexity. In this paper we describe the design and implementation of an FPGA-based Connect6 player that can compete against other computer-based opponents, communicating through a serial interface...
Partial reconfiguration (PR) enhances traditional FPGA-based system-on-chips (SoCs) by providing additional benefits such as reduced area and increased functionality as compared to non-PR SoCs. However, since leveraging these additional benefits requires specific designer expertise and increased development time, PR has not yet gained widespread usage. In this paper, we present an integrated development...
This work presents an architecture to compute matrix inversions in a hardware reconfigurable FPGA with single-precision floating-point representation, whose main unit is the processing component for Gauss-Jordan elimination. This component consists of other smaller arithmetic units, organized to maintain the accuracy of the results without the need to internally normalize and de-normalize the floating-point...
At the FPT 2010 International Conference, an Othello competition was announced, based on the popular game and with requirements for implementation of full designs on standardized FPGA platforms. This paper presents in detail the CarlOthello architecture and design, which is heavily pipelined in order to increase the expansion rate of the overall system, reaching a peak of 4×10⁸ expansions per...
Modular multiplication of long integers is an important building block for cryptographic algorithms. Although several FPGA accelerators have been proposed for large modular multiplication, previous systems have been based on O(N²) algorithms. In this paper, we present a Montgomery multiplier that incorporates the more efficient Karatsuba algorithm, which is O(N^(log 3/log 2)). This system is parameterizable...
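To make the complexity claim in the abstract above concrete, the sketch below shows plain Karatsuba multiplication of non-negative integers, which replaces four half-size products with three. It covers only the Karatsuba step, not Montgomery reduction or the paper's hardware design; the function name `karatsuba` and the base-case threshold of 16 are assumptions chosen for this example.

```python
def karatsuba(a: int, b: int) -> int:
    """Multiply two non-negative integers with Karatsuba's method (sketch)."""
    # Base case: small operands are multiplied directly.
    if a < 16 or b < 16:
        return a * b
    # Split both operands at roughly half the bit length of the larger one.
    half = max(a.bit_length(), b.bit_length()) // 2
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask
    # Three recursive products instead of four.
    hi = karatsuba(a_hi, b_hi)
    lo = karatsuba(a_lo, b_lo)
    mid = karatsuba(a_hi + a_lo, b_hi + b_lo) - hi - lo
    # Recombine: a*b = hi * 2^(2*half) + mid * 2^half + lo.
    return (hi << (2 * half)) + (mid << half) + lo
```

Because each level performs three half-size multiplications, the operation count grows as N^(log 3/log 2), roughly N^1.585, rather than N².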
Solving systems of linear and nonlinear equations is of fundamental importance at the basic level of a vast array of science and engineering applications. As these applications become more computationally complex, the need for low cost, high performance computing methods increases. This paper discusses different approaches to reduce computation time to solve linear equations by using a hardware/software...
This paper presents a novel approach to exploit FPGA dynamic partial reconfiguration to improve the fault tolerance of complex microprocessor-based systems, with no need to statically reserve area to host redundant components. The proposed method not only improves the survivability of the system by allowing the online replacement of defective key parts of the processor, but also provides performance...
Sorting is an important operation for many embedded computing systems. Since sorting large datasets may slow down the overall execution, schemes to speed up sorting operations are needed. Bearing in mind the hardware acceleration of sorting, we show in this paper an analysis and comparison among three hardware sorting units: sorting network, insertion sorting, and FIFO-based merge sorting. We focus...
Branch prediction is an important topic in modern computer architecture research. Predictors attempt to improve the performance of a processor with a reasonable hardware cost. In the last decade, many prediction schemes have been developed in order to achieve this objective, each of them with different cost/performance tradeoffs. Identifying the optimal predictor for a given architecture and set of...
Today's systems are more complex and need higher performance. To accomplish this, systems include more hardware compared to software. This increases the use of FPGAs in modern systems because of their reconfiguration capabilities. An FPGA contains many hardware components, which are utilized to perform operations directly in hardware. There are two problems related to this issue: the first is high performance...