Speedup in high-performance computing is usually bounded by two main laws, Amdahl's law and Gustafson's law. However, the speedup can sometimes reach far beyond this linear limit; this is known as superlinear speedup, meaning that the speedup is greater than the number of processors used. Although the superlinear speedup is not a new concept and many authors have already...
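The two laws named in this abstract can be written out as a short sketch. This is only an illustration of the textbook formulas, not code from the paper; the function names and the `parallel_fraction` parameter are assumptions introduced here.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Amdahl's law: fixed problem size, serial part limits the speedup."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def gustafson_speedup(parallel_fraction: float, processors: int) -> float:
    """Gustafson's law: problem size grows with the processor count."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * processors


def is_superlinear(measured_speedup: float, processors: int) -> bool:
    """A measured speedup is superlinear when it exceeds the processor count."""
    return measured_speedup > processors
```

Both formulas stay at or below the processor count for any parallel fraction between 0 and 1, which is why a measured speedup above that count is treated as a separate, superlinear case.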
In this paper we propose an efficient hardware architecture for computation of matrix inversion of positive definite matrices. The algorithm chosen is LDL decomposition followed directly by equation system solving using back substitution. The architecture combines a high throughput with an efficient utilization of its hardware units. We also report FPGA implementation results that show that the architecture...
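As a software illustration of the approach this abstract names (LDL decomposition followed by substitution), a minimal pure-Python sketch might look as follows. It is not the paper's hardware architecture; the function names are hypothetical, and the input is assumed to be a symmetric positive definite matrix given as a list of rows.

```python
def ldl_decompose(a):
    """Factor a symmetric positive definite matrix as A = L * D * L^T.

    Returns (l, d): l is unit lower triangular, d is the diagonal of D.
    """
    n = len(a)
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = a[j][j] - sum(l[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            acc = sum(l[i][k] * l[j][k] * d[k] for k in range(j))
            l[i][j] = (a[i][j] - acc) / d[j]
    return l, d


def solve_ldl(l, d, b):
    """Solve (L D L^T) x = b by forward, diagonal and back substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = b[i] - sum(l[i][k] * y[k] for k in range(i))
    z = [y[i] / d[i] for i in range(n)]  # diagonal scaling: D z = y
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution: L^T x = z
        x[i] = z[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))
    return x


def invert_spd(a):
    """Invert A by solving A x = e_j for each unit vector e_j."""
    n = len(a)
    l, d = ldl_decompose(a)
    cols = [solve_ldl(l, d, [1.0 if i == j else 0.0 for i in range(n)])
            for j in range(n)]
    # cols[j] holds column j of the inverse; transpose into a list of rows.
    return [[cols[j][i] for j in range(n)] for i in range(n)]
```

For example, `invert_spd([[4.0, 2.0], [2.0, 3.0]])` factors the matrix once and then reuses the factors for both unit-vector right-hand sides.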
We propose a new timing error correction scheme for the area-efficient design of flip-flop-based pipelines. Key features of the proposed scheme are 1) one-cycle error correction using a new local stalling scheme and 2) selective replacement of the error detection and correction flip-flops in critical paths only. A 32-bit MIPS testchip in a 65 nm CMOS technology has been implemented as a testbed. By employing...
Energy consumption's increasing importance in scientific computing has driven interest in developing energy-efficient high-performance systems. The energy constraints of mobile computing have motivated the design and evolution of low-power computing systems capable of supporting a variety of compute-intensive user interfaces and applications. Others have observed the evolution of mobile devices to also...
In modern communication networks, the security aspect is very important. Encryption algorithms are used to protect user communication from eavesdropping. Symmetric key algorithms must be used to achieve high speed secured communication. In this paper, we propose and evaluate the pipelined implementation of the Camellia encryption algorithm which has been approved for use by the ISO/IEC. Camellia algorithm...
The article presents 12.5 Gbit/s Physical Media Attachment (PMA) units, TX and RX, fabricated in a 90 nm bulk CMOS process. The PMA units are designed for use in SpaceFibre/GigaSpaceWire (SpaceWire-RUS) systems for space radars. The units comprise a SERDES and clock and data recovery (CDR). The supported data rates include 1.25, 2.5, 6.25 and 12.5 Gbit/s, and intermediate rates are also available.
Image processing applications require low power and high speed, so the convolution-based 1D-DWT is not desirable. In the proposed architecture, the modified 5/3 lifting algorithm is realized on an FPGA platform with optimizations. The latency and throughput are optimized with the modified algorithm. The architecture is modelled using HDL and implemented on FPGA. The proposal operates at 178 MHz and realised...
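For orientation, the standard reversible 5/3 lifting scheme (one predict step, one update step) can be sketched in software as below. This is the textbook integer 5/3 transform, not the modified algorithm or the FPGA design the abstract refers to; the function names, the even-length restriction and the mirrored boundary handling are assumptions made for this sketch.

```python
def lifting_53_forward(x):
    """One level of the integer 5/3 lifting DWT on an even-length signal.

    Returns (s, d): low-pass (approximation) and high-pass (detail) halves.
    """
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch handles even-length input only"
    half = n // 2
    even, odd = x[0::2], x[1::2]
    d = []
    for i in range(half):
        # Mirror at the right edge: x[n] is taken as x[n - 2].
        right = even[i + 1] if i + 1 < half else even[i]
        d.append(odd[i] - ((even[i] + right) >> 1))  # predict step
    s = []
    for i in range(half):
        # Mirror at the left edge: d[-1] is taken as d[0].
        left = d[i - 1] if i > 0 else d[0]
        s.append(even[i] + ((left + d[i] + 2) >> 2))  # update step
    return s, d


def lifting_53_inverse(s, d):
    """Undo lifting_53_forward by running the two steps in reverse order."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - ((left + d[i] + 2) >> 2))  # undo update
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + ((even[i] + right) >> 1))  # undo predict
    return x
```

Each step uses only additions and shifts, which is the usual reason lifting is preferred over a convolution-based filter bank in hardware.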
Polar codes are a new channel coding scheme that is expected to be widely applied in next-generation wireless MIMO communication systems. In this work, we propose a LEGO-based VLSI hardware design and implementation of the Polar encoder using radix-2 processing engines, which features low area cost, low power dissipation, high speed, and high throughput via serial computation. Under TSMC 90nm CMOS technology,...
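A radix-2 polar encoder applies a butterfly of XOR operations over log2(N) stages, which computes x = u · F^{⊗n} over GF(2) with F = [[1, 0], [1, 1]]. The sketch below shows that transform in software; it is not the paper's VLSI design, it omits the optional bit-reversal permutation and frozen-bit selection, and the function name is hypothetical.

```python
def polar_encode(u):
    """Apply the polar transform x = u * F^(kron n) over GF(2).

    u is a list of 0/1 values whose length is a power of two.
    """
    n = len(u)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    x = list(u)
    step = 1
    while step < n:  # one pass per butterfly stage, log2(n) stages in total
        for start in range(0, n, 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]  # radix-2 butterfly: upper ^= lower
        step *= 2
    return x
```

Because F^{⊗n} is its own inverse over GF(2), applying `polar_encode` twice returns the original input, which gives a convenient self-check.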
As flash memory continues its capacity scaling and its reliability correspondingly decreases, a technology upgrade of the error-correction engine in state-of-the-art solid-state drives (SSDs) is keenly anticipated. Due to their limit-approaching decoding ability, low-density parity-check (LDPC) codes are seen as one of the most promising substitutes for the traditional BCH codes, though implementation...
This paper proposes a pipelined time stretching technique for high-throughput counter-based time-to-digital converters (TDCs). The time stretching technique is used to increase the resolution of counter-based TDCs, yet it has the inherent weakness of a long conversion time due to the stretching phase. Without a significant increase in chip area, the proposed pipelined time stretching method is...
Applying Polar codes to next-generation MIMO systems is an emerging research topic. In this work, we propose an efficient VLSI hardware architecture of the Polar encoder using radix-k processing engines. Under TSMC 90nm CMOS technology, the 16384-point radix-2 based Polar encoder design is synthesized with an area of 0.244 mm² at a maximum clock frequency of 2.0 GHz. In a similar manner, the VLSI hardware can...
Polar codes have recently become increasingly popular due to their simple structure and low decoding complexity. However, polar codes are still not suitable for real-time applications because of their long decoding latency. In this paper, by analyzing the conventional architecture of the SC decoder, a low-latency SC decoder architecture is proposed. Using the proposed architecture, the decoding latency...
High-speed serial data communication is now very popular for connecting various resources in high-performance computing systems. In such high-speed serial links, line coding is important to control the run length (RL) and the running disparity (RD), because a large run length causes insufficient transitions on data links, which makes it difficult to perform reliable clock and data recovery (CDR), and...
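The two quantities a line code controls can be measured with a few lines of code. The sketch below assumes the common definitions: run length as the longest stretch of identical consecutive bits, and running disparity as the cumulative count of ones minus zeros. The function names are hypothetical and nothing here is taken from the paper's coding scheme.

```python
def max_run_length(bits):
    """Length of the longest run of identical consecutive bits."""
    longest = current = 0
    previous = None
    for bit in bits:
        current = current + 1 if bit == previous else 1
        longest = max(longest, current)
        previous = bit
    return longest


def running_disparity(bits):
    """Cumulative (ones - zeros) after each bit of the stream."""
    disparity = 0
    trace = []
    for bit in bits:
        disparity += 1 if bit else -1
        trace.append(disparity)
    return trace
```

A long run means few transitions for the receiver's CDR to lock onto, and a disparity trace that drifts away from zero indicates a DC imbalance on the link.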
The Network Function Virtualization (NFV) paradigm promises to make networks more scalable and flexible by decoupling the network functions (NFs) from dedicated and vendor-specific hardware. However, network and compute intensive NFs may be difficult to virtualize without performance degradation. In this context, Field-Programmable Gate Arrays (FPGAs) have been shown to be a good option for hardware...
We consider the design of a shared global on-chip communication medium using repeated equalized transmission lines (RETLs). Our design overcomes a number of limitations with previously proposed shared global mediums based on transmission lines. Prior solutions require wide-pitch transmission lines that occupy considerable area, do not support multicast or broadcast operations, and employ centralized...
Key-value stores (KVS) have become critical in many applications because of the recent data explosion. There is a strong demand to improve the throughput and reduce the latency of KVS. FPGA-based parallel architectures can bring excellent performance and power efficiency. Cuckoo hashing has proven to be an efficient approach to implementing KVS with good memory utilization and constant worst case access...
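The constant worst-case lookup of cuckoo hashing comes from giving every key exactly two candidate slots, one in each of two tables. The following is a minimal software sketch of that idea, not the FPGA architecture from the paper; the class name, table size, kick limit and the two slot functions are all assumptions chosen for illustration.

```python
class CuckooTable:
    """Two-table cuckoo hash map; a lookup probes at most two slots."""

    def __init__(self, size=11, max_kicks=32):
        self.size = size
        self.max_kicks = max_kicks
        self.tables = [[None] * size, [None] * size]

    def _slot(self, which, key):
        # Two illustrative slot functions derived from Python's built-in hash.
        h = hash(key)
        return h % self.size if which == 0 else (h // self.size) % self.size

    def get(self, key):
        for which in (0, 1):
            entry = self.tables[which][self._slot(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def put(self, key, value):
        # Overwrite in place if the key is already stored.
        for which in (0, 1):
            idx = self._slot(which, key)
            entry = self.tables[which][idx]
            if entry is not None and entry[0] == key:
                self.tables[which][idx] = (key, value)
                return True
        # Otherwise insert, evicting occupants to their alternate table.
        item, which = (key, value), 0
        for _ in range(self.max_kicks):
            idx = self._slot(which, item[0])
            item, self.tables[which][idx] = self.tables[which][idx], item
            if item is None:
                return True
            which = 1 - which
        # Kick limit reached: the entry still in hand is dropped in this
        # sketch; a real design would rehash or park it in a stash.
        return False
```

Insertion may move several entries, but `get` never inspects more than two slots, which is the property that makes the scheme attractive for fixed-latency pipelines.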
Sharing multi-cycle hardware blocks like the DSP48E1 primitive in Xilinx FPGAs can result in significant resource savings, but complicates scheduling. For high throughput, DSP blocks must be pipelined, which results in a high initiation interval (II) for resource-shared implementations. In this paper, we propose a resource reduction technique that minimises DSP block usage while also offering improved...
We explore the possibility of using shift register lookup tables (SRLs) for the implementation of Keccak on Xilinx FPGAs. The approach originates from the observation that the ρ step in combination with the state storage can be implemented as a collection of shift registers. This way, we achieve a slice-wise implementation using 25 shift registers of various lengths, resulting in 75 32-bit and 6 16-bit...
The potential of exploiting large propagation delays in underwater acoustic (UWA) networks to maximize network throughput has been established in the recent past. Transmission scheduling strategies have been proposed to take advantage of the large propagation delay. Super-TDMA is one such Medium Access Control (MAC) strategy. It is a form of Time Division Multiple Access (TDMA) protocol...