The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
In this paper, compact memory strategies for partially parallel Quasi-cyclic LDPC (QC-LDPC) decoder architecture are proposed. By compacting several adjacent rows hard decisions and extrinsic messages into one memory entry, which not only reduces the number of memory banks for hard decisions, but also facilitates multiple data accesses per clock cycle, the throughput of the decoder is increased. We...
In this paper, an FPGA-based implementation of Frequent Items Counting is proposed. The architecture deploys the equality comparator matrix for comparing the input items with themselves to count them instantly within a single operating clock. The proposed architecture is applied to the case of the 8-bit item. That means 256 different types of items in total. The system is built and verified on the...
Hierarchical temporal memory (HTM) is the model of the neocortex functionality, developed by Numenta, Inc. The level of implementation does cover only the subset of actual neocortex layers functionality, but, however, is sufficient to be useful in different domain areas e.g. for a novelty or anomaly detection. Numenta provides their implementation of the HTM for commercial or research purposes as...
In this paper we propose an efficient hardware architecture for computation of matrix inversion of positive definite matrices. The algorithm chosen is LDL decomposition followed directly by equation system solving using back substitution. The architecture combines a high throughput with an efficient utilization of its hardware units. We also report FPGA implementation results that show that the architecture...
The image processing applications require low power and high speed, the convolution based 1D-DWT is not desirable. In this proposed architecture the modified 5/3 lifting algorithm is realized on FPGA platform with optimizations. The latency and throughput is optimized with the modified algorithm. The architecture is modelled using HDL and implemented on FPGA. The proposal operates at 178MHz and realised...
We explore the possibility of using shift register lookup tables (SRLs) for the implementation of Keccak on Xilinx FPGAs. The approach originates from the observation that the ρ step in combination with the state storage can be implemented as a collection of shift registers. This way, we achieve a slice-wise implementation using 25 shift registers of various lengths, resulting in 75 32-bit and 6 16-bit...
This paper describes the architecture and implementation of a high performance QR decomposition IEEE754 single precision floating point core, using a modified Gram-Schmidt algorithm. Using Intel's new floating point Arria 10 FPGAs, synthesis is used to generate column high functional units, giving O(n2) processing times. The modified Gram-Schmidt algorithm is expressed in a different order to combine...
This paper describes the implementation of a high throughput FFTs implemented on FPGAs, using a modified version of the Radix 2N architecture. The implementation uses a synthesis method which supports “super-sampling” to provide very high throughput. Special vector structures in the tools and hardware architecture are supported where complex vectors form the input on each clock cycle, and multiple...
The design of pipelined Fast Fourier transform (PFFT) in modern communication systems provides an efficient way for computation of FFT with better area utilizing hardware architecture. Previously, the radix-22 had been used only for single path delay feedback architectures. Later with many types of research works the radix 22 was extended to multi-path delay commutator (MDC) architectures. This paper...
In this paper, we propose a high-throughput pipeline architecture of the stream cipher ZUC which has been included in the security portfolio of 3GPP LTE-Advanced. In the literature, the schema with the highest throughput only implements the working stage of ZUC. The schemas which implement ZUC completely can only achieve a much lower throughput, since a self-feedback loop in the critical path significantly...
RF antenna array beamforming based on electronically steerable wideband phased-array apertures find applications in communications, radar, imaging and microwave sensing. High-bandwidth requirements for wideband RF applications necessitate hundreds of MHz or GHz frame-rates for the digital array processor. A systolic architecture is proposed for the real-time implementation of the 2-D IIR beam filter...
In the modern world of digitization, processing of data in real time requires an increase in the operating speed of a system. The processing more often than not utilizes multiplication which is time consuming and introduces considerable amount of delay. As such, there is a need to reduce this delay and achieve faster real time processing of data. This paper proposes a novel architecture for implementation...
Cuckoo hashing has proven to be an efficient option to implement exact matching in networking applications. It provides good memory utilization and deterministic worst case access time. The continuous increase in speed and complexity of networking devices creates a need for higher throughput exact matching in many applications. In this paper, a new Cuckoo hashing implementation named parallel d-pipeline...
The Advanced Encryption Standard (AES) together with the Galois Counter Mode (GCM) of operation has been approved for use in several high throughput network protocols to provide authenticated encryption. However, the demand for continued increase in network bandwidth has not abated and we anticipate the need for continual performance improvement of AES-GCM in hardware. Additionally, as data interfaces...
CRC (Cyclic Redundancy Check) is a simple and an elegant method for error detection. It finds application in most of the high-speed data communication protocol. In High Energy Physics experiment often CRC is used for control and data frame communication with detectors placed at radiation zone. Reliability of CRC error detection capability alters with generator polynomial chosen. The most popular choice...
Securing data in machine to machine scenario is of paramount importance. With the evolving of Internet of Things (IoT) market, data should be protected from eavesdropper. Of the available cryptographic algorithms, International Data Encryption Algorithm (IDEA) is one of the most secured and highly accepted algorithm to a broad range of applications. This paper presents implementation of IDEA cryptographic...
An efficient compact implementation of the 128-bit SEED block cipher is presented in this paper. The proposed architecture achieves low level in hardware resources, so it is efficient for area constraints applications such as smart cards. The proposed implementation reaches a data throughput of 29.7 Mbps at 111 MHz clock frequency. The design was coded using VHDL language and for the hardware implementation,...
This paper presents radix-4 and radix-8 Booth encoded modular multipliers over general Fp based on inter-leaved multiplication algorithm. An existing bit serial interleaved multiplication algorithm is modified using radix-4, radix-8 and Booth recoding techniques. The modified radix-4 and radix-8 versions of interleaved multiplication result in 50% and 75% reduction in required number of clock cycles...
Frequency table computation is a key step in decision tree learning algorithms. In this paper we present a novel implementation targeted for dataflow architecture implemented on field programmable gate array (FPGA). Consistent with dataflow model of computation, the kernel views input dataset as synchronous streams of attributes and class values. The kernel was benchmarked using key functions from...
This paper presents a high speed configurable FPGA architecture for k-means clustering. The proposed architecture is highly pipelined, parallel and fully configurable. It can achieve an operating frequency of 400 MHz, which is at least three times faster than prior works. The proposed architecture addresses the high speed and throughput requirements of machine vision, multi-media and data mining applications.
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.