The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
This paper proposes cost efficient very high throughput layered decoding architecture for array quasi-cyclic Low-Density Parity-Check (QC-LDPC) codes, targeting tens of Gbps data rates. The targeted throughput is achieved by employing layer unrolling, with pipeline stages inserted in between layers. In order to obtain improved hardware efficiency for the decoder, multiple codewords are processed simultaneously...
This paper proposes a layered decoder architecture for array QC-LDPC codes which targets tens of Gbps data rates. It relies on layer unrolling with pipeline stages in between layers, allowing simultaneous decoding of multiple layers. The most important features of the proposed decoder are: (i) fully parallel processing units within each layer (ii) hardwired layer interconnect that allows the removal...
In this paper, we perform a simulated fault injection reliability assessment of memory centric flooded LDPC decoders affected by probabilistic storage errors. We investigate the error correction capability in terms of Frame Error Rate (FER) of faulty flooded Min-Sum decoder, under Binary Additive White Gaussian Noise (BiAWGN) channel model. We have injected all the memories, as well as only the memories...
This paper proposes a QC-LDPC partial parallel architecture that implements a hard decision message passing algorithm based on Gallager-B decoding. The proposed architecture uses an optimized variable node unit, with adaptive threshold, suitable for irregular LDPC codes. We present implementation results for WiMAX rate 1/2 code for FPGA technology. These indicate a cost reduction of 2.5x in logic,...
This paper proposes multi-domain quantization for a partially-parallel QC-LDPC decoder, which allows efficient memory bandwidth utilization. The change of quantization during decoding is dynamic. In addition to this, the proposed decoder targets speedup and energy efficient processing, with negligible error correction capability degradation. From the message storage perspective, it uses 2 quantization...
This paper proposes a dual-clock based solution for QC-LDPC partially parallel flooded decoders. It aims at reducing memory and routing overhead, while maintaining a high degree of parallelism at processing node level. We take advantage of the high memory working frequencies with respect to the processing units, and use two clock domains: a high frequency for memory and the barrel shifter based routing...
In this paper, we present an LDPC decoder design equipped with an adaptive throughput mechanism achievable using a multiple quantization scheme. Three representations are supported by the proposed architecture: 1-bit (hard decision), 2-bit, and 4-bit messages. A throughput increase by of factor of 4, 2 and 1 can be achieved with respect to the 4-bit message representation version, by simultaneously...
Sorting represents one of the most important operations in data center applications. In this paper, we propose a hardware-software FPGA accelerated based solution for very large data set merge sorting. The accelerator is using a FIFO based approach for sorting. The main contributions of the proposed solution are: (i) configurable FIFO buffers in order to address the variable size of the pre-sorted...
Implementation of Quasi-Cyclic (QC) Low Density Parity-Check (LDPC) decoder on FPGA devices has shown great interest in both wireless communication, as well as error correction for Flash memories. This paper presents an FPGA flooded LDPC decoder which uses multiple codeword processing for efficient memory utilization. It is based on a partially parallel implementation, which relies on memory blocks...
This paper proposes two variants which aim at reducing the memory requirements of the self-corrected min-sum (SCMS) with respect to min-sum (MS). The first improvement—SCMS-V1—eliminates the need for check node messages’ signs storage. The second improvement—SCMS-V2—is based on a novel imprecise self-correction rule, which allows the reduction of the erasure bits. We analyze the decoding performance...
This paper presents an automated template-based approach for layered QC-LDPC architectures. The underlying idea is to be able to generate any number of fairly optimized hardware designs with only a minimum effort required by tuning high level parameters (parity check matrix, number of AP-LLR messages being processed at a time, etc.). Having chosen a partially parallel architecture, template design...
In this paper we perform a fault tolerance assessment of flooded Low Density Parity Code (LDPC) decoders affected by probabilistic timing errors, characteristic to sub-powered CMOS circuits. We investigate the error correction capability - in terms of Frame Error Rate (FER) - of faulty flooded Min-Sum (MS) and Self-Corrected Min-Sum (SCMS) LDPC architectures for both Binary Input Additive White Gaussian...
This paper proposes a methodology for timing error analysis of RTL circuit descriptions. The evaluation has three components: (i) statistical static timing analysis (SSTA) for standard cell components (ii) estimation based on probability density function (PDF) propagation for characterization of combinational blocks, and (iii) simulated fault injection (SFI) performed at RTL. Reliability characterization...
This paper proposes an FPGA based flooded architecture for quasi-cyclic (QC) LDPC decoder. The message computation for both check and variable node update is done using a parallel scheme of a number of processing units equal to the expansion factor of the QC matrix. The proposed architecture performs serial processing of the messages by dedicated check node and variable node processing units. This...
This paper presents an analysis of existing stopping criteria for layered architecture used for quasi-cyclic (QC) LDPC decoders. Furthermore, it proposes a novel imprecise method for early termination in layered decoders. The analysis is performed under the same framework in order to provide a fair and accurate comparison between existing methods, and our new solution. The developed hardware modules...
This paper addresses the problem of providing support for energy consumption accounting and performance evaluation by means of performance counters in an open source processor core — OpenRISC OR1200. The OpenRISC processing core is flexible in that it allows different hardware configurations, and provides full support on the tool-chain side. In addition to this, it gives full hardware design access,...
Serial based FPGA fault emulation schemes for probabilistic errors rely on a random number generator -- which is used for generation of fault bits - and a shift register - used for placing the fault bits to their corresponding fault location. It has two advantages with respect to parallel solutions: lower cost and better accuracy. The main disadvantage is represented by the high emulation overhead:...
This paper proposes memory efficient FPGA implementations for layered quasi-cyclic (QC) LDPC decoders, based on the Self-Correcting Min-Sum (SCMS) algorithm. We address the problem of high memory overhead required by layered SCMS based decoders compared to conventional Min-Sum (MS), by proposing two improvements. These require changes in the flow/rule of the conventional SCMS algorithm, in order to...
This paper proposes a FPGA implementation based on sliding processing window for Harris corner algorithm. It represents one of the most frequently used pre-processing method, for a wide variety of image processing algorithms, such as feature detection, motion tracking, image registration, etc‥ It relies on a series of sequential steps, each processing an image outputted by the previous step. The purpose...
This paper proposes an FPGA based layered architecture for quasi-cyclic (QC) irregular LDPC decoder. Our approach is based on merging variable and check node processing into one single variable-check node (VCN) unit. Layer message computation is done using a parallel scheme of a number of VCNs equal to the expansion factor of the QC matrix. The proposed architecture is characterized by the serial...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.