As the density of field-programmable gate arrays continues to increase, the size of configuration bitstreams grows accordingly. Compression techniques can reduce memory size and save external memory bandwidth. To accelerate the configuration process and reduce the software startup time, four open-source lossless compression decoders developed using high-level synthesis techniques are presented. Moreover,...
A high-throughput architecture for the CCSDS 122.0-B-1 image compression standard is proposed. The architecture uses a novel memory organization to reduce the total number of memory operations and the number of individual memories, allowing operation without external memories. The architecture has been implemented on space-grade and commercial FPGA devices. It achieves 136 MSamples/sec on space grade...
Hash functions represent a fundamental building block of many network security protocols. The SHA-3 hashing algorithm is the most recently developed hash function, and the most secure. Implementation of the SHA-3 hashing algorithm in Hardware Description Language (HDL) is time demanding and tedious to debug. On the other hand, High-Level Synthesis (HLS) tools offer potential solutions to the hardware...
This work focuses on FPGA-based implementations of the SHA-3 hash functions. The literature classifies existing implementations according to the structural optimization techniques they adopt, namely folding, pipelining and unrolling. Several structures have been proposed in the state-of-the-art, which vary mainly in the level of folding and the number of pipeline stages. While unfolded...
Streaming processing is an important technology that finds applications in networking, multimedia, signal processing, etc. However, it is very challenging to design and implement streaming applications as they impose complex constraints. First, the tasks involved in the streaming applications must complete the computation under a latency constraint. Second, streaming systems are built under more and...
Erasure coding, Reed-Solomon coding in particular, is a key technique to deal with failures in scale-out storage systems. However, due to the algorithmic complexity, the performance overhead of erasure coding can become a significant bottleneck in storage systems attempting to meet service level agreements (SLAs). Previous work has mainly leveraged SIMD (single-instruction multiple-data) instruction...
In this paper, we propose a novel design for large-scale graph processing on FPGA. Our design uses large external memory for storing massive graph data and FPGA for acceleration, and leverages edge-centric computing principles. We propose a data layout which optimizes the external memory performance and leads to an efficient memory activation schedule to reduce on-chip memory power consumption. Further,...
In recent years, high-level languages and compilers, such as OpenCL, have improved both productivity and FPGA adoption on a wider scale. One of the challenges in the design of high-performance stream FPGA applications is iterative manual optimization of the numerous application buffers (e.g., arrays, FIFOs and scratch-pads). First, to achieve the desired throughput, the programmer faces the burden...
Image processing algorithms that work only on a local neighbourhood are used in nearly every image processing application. Very often several iterations are performed on a fixed neighbourhood, which leads to the description of stencil codes. A promising approach in embedded systems is to use the massively parallel computation power of an FPGA for this kind of algorithm. This not only speeds up processing...
Due to the rapid growth of the Internet, there is an increasing need for efficiently classifying packets with many header fields in large rule sets. For example, in Software Defined Networking (SDN), the OpenFlow table lookup can require 15 packet header fields to be examined. In this paper, we present several decomposition-based packet classification implementations with efficient optimization techniques...
Energy efficiency is a key design metric when implementing signal processing applications on FPGAs. In this paper, high-level energy optimizations are proposed to facilitate the development of an energy-efficient, throughput-oriented FFT design. At the algorithm mapping level, we develop a data remapping technique and a memory activation scheduling method to reduce memory energy consumption. At the...
Morphological operations are powerful and versatile image and video processing tools applied across a wide range of domains, from object recognition to feature extraction to moving-object detection in computer vision, where real-time, high-performance processing is required. However, the throughput of morphological operations is constrained by their convolutional character. In this paper,...
Hash functions are widely used to check whether data are correctly transferred. Keccak is an important hash function that was selected as SHA-3 in 2012. In this paper, we propose and evaluate optimized FPGA implementations of Keccak for multi-message hashing. Our optimizations include a variety of pipeline organizations, retiming of a part of the calculation, and the use of DSP units. According...
Testing and verifying wireless systems in real-world environments is a challenging but important problem. This is particularly true for the Joint Tactical Radio System (JTRS), where the modulation techniques are optimized towards environments that are difficult to reproduce (e.g., ship to plane, plane to satellite communications). Such cases necessitate a wireless channel emulator to facilitate...
Applications such as image and video processing frequently rely on implementations of the Linear Projection algorithm with high throughput and low latency requirements. This work presents a framework to optimise Linear Projection designs that outperform typical design implementations via a pre-characterisation of over-clocked arithmetic units. It is well known that the delay models used by synthesis tools...
We propose a novel approach to the computation of the CRC functions, commonly used for bit error checking purposes when handling binary data. This approach is designed for general hashing purposes in FPGA, for which the CRCs are usable as well. The method is suitable for applications which use parallel inputs of fixed size and require high throughput, such as hash tables. We employ the DSP blocks...
Field Programmable Gate Arrays (FPGAs), which satisfy the demands for abundant parallelism and high operating frequency, are the most promising platform to realize SRAM-based pipelined architectures for high-speed packet classification. Due to the restrictions of the state-of-the-art FPGAs on the number of I/O pins and on-chip memory, larger filter databases can hardly be accommodated by the current approaches...
Reconfigurable devices are often employed in heterogeneous systems due to their low power and parallel processing advantages. An important usability requirement is the support of a homogeneous programming interface. Nevertheless, homogeneous programming interfaces do not eliminate the need for code tweaking to enable efficient mapping of the computation across heterogeneous architectures. In this...
FPGA-based accelerators have repeatedly demonstrated superior speed-ups on an ever-widening spectrum of applications. However, their use remains beyond the reach of traditionally trained applications code developers because of the complexity of their programming tool-chain. Compilers for high-level languages targeting FPGAs have to bridge a huge abstraction gap between two divergent computational...
The testing, verification and evaluation of wireless systems is an important but challenging endeavor. The most realistic method to test a wireless system is a field deployment. Unfortunately, this is not only expensive but also time consuming. In this paper, we present the design and implementation of a digital wireless channel emulator, which connects directly to a number of radios, and mimics the...