Fractional interpolation is one of the most computationally intensive parts of High Efficiency Video Coding (HEVC). Therefore, in this paper, two pixel correlation based computation and energy reduction techniques for HEVC fractional interpolation are proposed. The proposed pixel equality based computation reduction (PECR) technique does not affect the PSNR and bit-rate. The proposed pixel similarity...
Polar codes are the first class of forward error correction (FEC) codes with a provably capacity-achieving capability. Using list successive cancellation decoding (LSCD) with a large list size, the error correction performance of polar codes exceeds that of other well-known FEC codes. However, the hardware complexity of LSCD rapidly increases with the list size, which incurs high usage of the resources on...
FPGAs are being incorporated into contemporary datacenters in order to improve computational capacity, power consumption, and processing latency. Efficiently integrating FPGAs in datacenters is, however, quite challenging. Ideally, smaller tasks could share a device and the cloud management layer would be able to partially reconfigure the device to allocate its free resources to incoming tasks. Moreover,...
Research tools targeting commercial FPGAs have most commonly been based on the Xilinx Design Language (XDL). Vivado, however, does not support XDL, preventing similar tools from being created for next-generation devices. Instead, Vivado includes a Tcl interface that exposes Xilinx's internal design and device data structures. Considerable challenges still remain for users attempting to leverage this...
Implementing elliptic curve point multiplication (ECPM) based on a residue number system (RNS) can efficiently use FPGA resources. In this paper, we propose a modular reduction method, where a kind of RNS pair is selected to achieve fast reduction. Our reduction method mainly needs several parallel additions, while the reduction unit of previous designs requires two multiplications which are computed...
In this paper, we describe an FPGA system for the real-time processing of Poisson Image Editing. Poisson Image Editing is a powerful method to overlay an image on another image seamlessly. In this method, however, a simple equation is repeatedly applied to each pixel, and this repetition makes its computational complexity very high. In our system, a very deep pipeline is used to apply the equation...
Important design considerations for the cost-effective employment of hardware accelerators in next-generation data centers involve a) the type of candidate applications that a proposed solution can accelerate (generality), and b) the required development effort to successfully deploy the available accelerators for a given application (adoption overhead). To address the problem of generality, we present...
High-level synthesis (HLS) is well capable of generating control and computation circuits for FPGA accelerators, but still requires significant human effort to tackle the challenge of memory and communication bottlenecks. One important approach for improving data locality is to apply loop tiling on memory-intensive loops. Loop tiling is a well-known compiler technique that partitions the iteration...
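As a rough illustration of the loop tiling mentioned in the abstract above, the following Python sketch compares an untiled matrix multiplication with a tiled one. It is not the paper's method or an HLS flow; the function names and the `tile` parameter are hypothetical, chosen only to show how the iteration space is split into blocks so that each block reuses a small working set.

```python
def matmul_naive(a, b):
    """Reference triple loop over the full iteration space."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, tile=2):
    """Same computation, with each loop split into tile-sized blocks.

    The three outer loops walk over blocks; the three inner loops stay
    inside one block, so only tile x tile pieces of a, b and c are
    touched at a time (the data-locality benefit of tiling).
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, p, tile):
            for kk in range(0, m, tile):
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, p)):
                        acc = c[i][j]
                        for k in range(kk, min(kk + tile, m)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c
```

Both functions compute the same product; only the order in which iterations are visited differs. The `min(...)` bounds handle matrix sizes that are not multiples of the tile size.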
FPGAs are becoming an attractive choice as a heterogeneous computing unit for scientific computing because FPGA vendors are adding floating-point-optimized architectures to their product lines. Additionally, high-level synthesis (HLS) tools such as Altera OpenCL SDK are emerging, which could potentially break the FPGA programming wall and provide a streamlined flow for domain experts in scientific...
The Kiwi project revolves around a compiler that converts C# .NET bytecode into Verilog RTL and/or SystemC. An alpha version of the Kiwi toolchain is now open source and a user community is growing. We will demonstrate an incremental approach to large system assembly of HLS and blackbox components, based on an extended IP-XACT intermediate representation. We show how to address multi-FPGA designs...
As an alternative to adding more and more instructions to CPU cores in order to address a wide range of applications, this paper examines the use of a mixed-grained CPU interlay fabric to provide reconfigurable instruction set extensions. In detail, we examine replacing the hardened NEON SIMD unit of an ARM Cortex-A9 with an identically sized FPGA fabric. We show that by applying a set of optimizations,...
Security features of modern (SoC) FPGAs make it possible to protect the confidentiality of hard- and software IP when the devices are powered off, as well as to validate the authenticity of IP when it is loaded at startup. However, these approaches are insufficient since attackers with physical access can also perform attacks during runtime, demanding additional security measures. In particular, RAM used...
Bufferless, deflection-routed, Butterfly Fat Trees (BFTs) can outperform state-of-the-art FPGA overlay NoCs such as Hoplite by as much as 2–5× on throughput and ≈5× on worst-case latency at identical PE counts, and by ≈1.5× on throughput at identical resource costs >16K LUTs for statistical traffic patterns. In this paper, we show how to modify the tree connectivity and routing function to support...
Networks-on-chip (NoCs) have become a new chip design paradigm as the size of transistors continues to shrink. Globally-asynchronous locally-synchronous (GALS) on-chip networks are proposed for solving issues such as large clock tree distribution and signal delay variations. More interestingly, for the GALS networks using m-of-n delay-insensitive interconnect, the asynchronous interconnect not only...
FPGAs are promising candidates for computational tasks in space. However, they are susceptible to radiation-induced errors in their configuration memory. The recovery of configuration errors, either by device scrubbing or by module-based recovery, involves a series of reads and writes to the FPGA's configuration port, and is efficiently performed on-chip by a fast, flexible and reliable reconfiguration...
This paper proposes a new synthesizable oscillator-based temperature sensor with minimal footprint for use in contemporary Xilinx FPGA devices. In contrast to previously published ring-oscillator architectures, based on inverters mapped onto single LUTs, the proposed oscillator uses an asynchronous Gray-coded 4-bit counter requiring only two 6-input LUTs. Due to its reduced hardware requirements,...
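To make the "Gray-coded 4-bit counter" in the abstract above concrete, here is a minimal Python sketch of the standard reflected binary Gray code. It models only the count sequence, not the asynchronous oscillator circuit or its LUT mapping, and the function names are hypothetical. The relevant property is that consecutive codes differ in exactly one bit.

```python
def gray_encode(n):
    """Convert a binary count to its reflected Gray code."""
    return n ^ (n >> 1)


def gray_sequence(bits=4):
    """All Gray codes of the given width, in counting order."""
    return [gray_encode(i) for i in range(1 << bits)]


def hamming_distance(x, y):
    """Number of bit positions in which x and y differ."""
    return bin(x ^ y).count("1")


if __name__ == "__main__":
    for code in gray_sequence(4):
        print(format(code, "04b"))
```

Because only one bit changes per step, including the wrap-around from the last code back to the first, a counter following this sequence never passes through intermediate multi-bit transitions.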
This paper proposes a high-throughput architecture for AES encryption/decryption targeting recent FPGAs with 6-input LUTs. Unlike previous works, which share multiplicative inverse logic to realize SubBytes and InvSubBytes, the proposed architecture directly employs the look-up-table based Sbox for both SubBytes and InvSubBytes. Efficient reordering and merging techniques are applied to achieve...
The parallelism of hardware and the dynamic reconfigurability of FPGAs enable multiple hardware tasks to run concurrently, and also time-share resources by being swapped in and out of the device during runtime. More than ever before, these capabilities are being employed in systems with high-reliability requirements. To improve reliability, a method often used is circuit relocation. However, the static...
Iterative stencils are kernels in various application domains, such as numerical simulations and medical imaging, that merit FPGA acceleration. The best architecture depends on many factors such as the target platform, off-chip memory bandwidth, problem size, and performance requirements. We generate a family of FPGA stencil accelerators targeting emerging System on Chip platforms (e.g., Xilinx Zynq...
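For readers unfamiliar with the term, an iterative stencil repeatedly updates every grid point from a fixed neighbourhood of the previous iteration's values. The Python sketch below shows a hypothetical 1D three-point averaging stencil with fixed boundary values; it is a software illustration only and is not taken from the paper's accelerator designs.

```python
def stencil_step(u):
    """One sweep: each interior point becomes the mean of itself and its
    two neighbours, read from the previous iteration (Jacobi style)."""
    new = list(u)  # boundary points are copied unchanged
    for i in range(1, len(u) - 1):
        new[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return new


def run_stencil(u, iterations):
    """Apply the stencil sweep a fixed number of times."""
    for _ in range(iterations):
        u = stencil_step(u)
    return u
```

Each sweep reads only the previous iteration's array, so all interior points of one sweep are independent of each other, which is the property that makes such kernels attractive for pipelined or parallel hardware.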
In Systems Biology, Boolean models are gaining popularity in modeling and analysis of qualitative dynamics of gene regulatory mechanisms. With the development of advanced high-throughput technologies, the availability of experimental data on gene-gene and gene-protein interactions is ever increasing. Consequently, modern Boolean models are increasing in size and complexity. Software simulation of Boolean...
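As a small illustration of what software simulation of a Boolean model involves, the Python sketch below runs a synchronous update on a toy three-gene network. The genes `a`, `b`, `c` and their update rules are invented for this example and do not come from the paper; the attractor search simply iterates until a state repeats.

```python
def step(state):
    """Synchronous update of a hypothetical 3-gene network.

    Assumed rules (illustrative only):
      a' = NOT c,  b' = a,  c' = b
    """
    a, b, c = state
    return (not c, a, b)


def find_attractor(state):
    """Iterate from `state` until a state repeats; return the cycle."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state)
    return trajectory[seen[state]:]
```

The state space has 2^n entries for n genes, so exhaustive analysis of this kind grows quickly with model size, which is consistent with the abstract's remark that models are increasing in size and complexity.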