The combination of multiple-input multiple-output (MIMO) and orthogonal frequency-division multiplexing (OFDM) is a promising solution for high-data-rate transmission. An architecture of a K-best based list sphere detector (LSD) algorithm for a MIMO-OFDM system is introduced in this paper. The architecture was designed for a 2 × 2 antenna system with quadrature phase shift keying (QPSK) and 16-quadrature...
Memory-based architectures have received great attention for single-chip implementation of the fast Fourier transform (FFT). Basically, they can be roughly categorized as single-memory design, dual-memory design, and buffer-memory design. Among them, the buffer-memory design can balance the trade-off between memory size and control circuit complexity. In this paper, we present a design methodology...
This paper presents hardware implementations for the Improved Wired Equivalent Privacy (IWEP) and RC4 ("Ron's Cipher #4") encryption algorithms. IWEP is a block algorithm providing light-strength encryption. The algorithm has been designed for a new Wireless Local Area Network (WLAN), called TUTWLAN (Tampere University of Technology Wireless Local Area Network). In contrast, RC4, developed...
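For readers unfamiliar with RC4, the following is a minimal software sketch of the standard algorithm (key scheduling followed by keystream generation). It is not the hardware design described in the abstract above; the function names `rc4_keystream` and `rc4` are hypothetical and chosen only for illustration.

```python
from typing import Iterator


def rc4_keystream(key: bytes) -> Iterator[int]:
    """Yield RC4 keystream bytes for the given key (standard RC4, software sketch)."""
    if not key:
        raise ValueError("key must not be empty")

    # Key-scheduling algorithm (KSA): permute the 256-entry state with the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Pseudo-random generation algorithm (PRGA): one keystream byte per step.
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        yield state[(state[i] + state[j]) % 256]


def rc4(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    keystream = rc4_keystream(key)
    return bytes(byte ^ next(keystream) for byte in data)


if __name__ == "__main__":
    ciphertext = rc4(b"Key", b"Plaintext")
    print(ciphertext.hex())
    print(rc4(b"Key", ciphertext))
```

Because encryption is a plain XOR with the keystream, applying `rc4` twice with the same key returns the original data.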
This paper proposes a new dynamic-key-based stream cipher for hardware cryptographic applications such as data transmission and information security. Unlike existing static-key stream ciphers, the proposed cipher relies on a dynamic key, generated by a Toeplitz hash function, which serves as the key for the RC4 stream cipher. Further, this key is used...
Scale-out workloads are applications that are typically executed in a cloud environment and exhibit a high level of request-level parallelism. Such workloads benefit from processor organizations with very high core counts, since multiple requests can be serviced simultaneously by threads running on these cores. The characteristics of these workloads indicate that they have high instruction footprints...
Modern GPUs provide massive processing power (arithmetic throughput) as well as memory throughput. Presently, while it appears to be well understood how performance can be improved by increasing throughput, it is less clear what the effects of micro-architectural latencies are on the performance of throughput-oriented GPU architectures. In fact, little is publicly known about the values, behavior,...
This work proposes a MIL-STD-1553B remote terminal controller: RT-MIL-STD-1553+, which processes data rates of 100-Mb/s over 1553 buses. This redesigned controller has three major architectural enhancements over the current 1-Mb/s controllers. Firstly, it incorporates a synchronous back-end and host processor interface to a true dual port memory to enable faster memory accesses. Secondly, the controller...
Accelerated advances in automotive technology, such as sophisticated real-time engine controls for higher fuel efficiency and advanced driver-assistance systems (ADAS), are expanding the application range of Flash MCUs, microcontrollers with embedded Flash memory (eFlash). In addition to consistent demands for faster random access, shorter rewrite time and larger memory capacity in eFlash, there are...
As technology scales down toward deep submicron, large numbers of IP blocks are being integrated on the same silicon die, thereby enabling a large number of parallel computations, such as those required for multimedia workloads. The network-on-chip (NoC) serves as an important agent for eliminating the communication bottleneck of future multicore systems. The arbiter, a prime component, has a great impact on the...
Lightweight cryptography provides cryptographic algorithms for resource-constrained devices and typically aims for low-cost ASIC applications like RFID tags. In addition, it also provides attractive performance-security trade-offs for FPGAs in scenarios with strict area constraints. This work presents FPGA implementations of the popular lightweight hash functions KECCAK-200 and KECCAK-400, PHOTON...
An efficient VLSI implementation of encryption using the Advanced Encryption Standard (AES) algorithm is introduced. The architecture uses ROM-based key expansion modules rather than the commonly used registers. A further advantage is that the separate ShiftRows step is eliminated by merging two steps of the algorithm, which reduces area and power. Xilinx ISE 14.5 is the software...
Packet classification is a network kernel function that has been widely researched over the past decade. However, most previous work has focused only on achieving high throughput without considering its energy-efficiency implications. With the rapid growth of the Internet, energy efficiency has become an important metric for networks. We present the design of an energy-efficient packet classifier on Field-Programmable...
Multipliers are an important component in DSP applications such as filters. The design of high-speed, low-power multipliers is of substantial research interest. The Modified Booth Multiprecision Multiplier (MBMP) reduces power consumption by selecting smaller-precision multipliers according to the input operand selector. The large area overhead can be reduced by...
Dynamic Circuit Specialization (DCS) is an optimization technique used for implementing a parameterized application on an FPGA. The application is said to be parameterized when some of its inputs, called parameters, are infrequently changing compared to the other inputs. Instead of implementing these parameter inputs as regular inputs, in the DCS approach these inputs are implemented as constants...
This paper presents the design and implementation of a high-throughput interpolator for fractional motion estimation in HEVC systems. A novel data-reuse scheme and a highly parallel architecture are proposed to improve timing efficiency and thus the processing throughput of the system. The detailed circuit architecture and timing analysis for the proposed interpolator are given. Moreover,...
This paper presents an arbitration mechanism to balance bandwidth consumption, or data throughput, between packets in a network-on-chip (NoC) with an ID-based wormhole cut-through switching method. When data traffic flowing through a network communication link is high, the bandwidth of the link that is consumed by a message or a data stream could be affected by the distance between the source...
With technology scaling, process, voltage, and temperature (PVT) variations pose great challenges to integrated circuit design. Conventionally, LSI circuits are designed by adding pessimistic timing margins to guarantee "always correct" operation even under worst-case conditions. However, due to increasing PVT variations, an unacceptably large design guard band must be reserved to avoid...
We propose an Asynchronous-to-Synchronous Interface Controller (A2S-IC) with low delay variation under Process, Voltage and Temperature (PVT) variations for sub-threshold/near-threshold operation in low-power applications. This A2S-IC is targeted at a full-range Dynamic Voltage Scaling (DVS) Global-Asynchronous-Local-Synchronous (GALS) Network-on-Chip (NoC). There are three key attributes in this...
This paper demonstrates a clockless stochastic low-density parity-check (LDPC) decoder implemented on a Field-Programmable Gate Array (FPGA). Stochastic computing reduces the wiring complexity necessary for decoding by replacing operations such as multiplication and division with simple logic gates and serial processing. Clockless decoding increases the throughput of the decoder by eliminating the...
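The idea that stochastic computing replaces multiplication with a simple logic gate can be illustrated with a small software sketch. It assumes the common unipolar encoding, in which a value in [0, 1] is represented by a random bitstream whose fraction of ones equals that value; a bitwise AND of two independent streams then approximates the product. This is only an illustration of the general principle, not the decoder from the abstract above, and the names `to_stream` and `stochastic_multiply` are hypothetical.

```python
import random
from typing import List


def to_stream(p: float, n: int, rng: random.Random) -> List[int]:
    """Encode probability p as n random bits with P(bit == 1) = p (unipolar encoding)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return [1 if rng.random() < p else 0 for _ in range(n)]


def stochastic_multiply(a: float, b: float, n: int = 100_000, seed: int = 0) -> float:
    """Estimate a * b by AND-ing two independent bitstreams, one bit per step."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    stream_a = to_stream(a, n, rng)
    stream_b = to_stream(b, n, rng)
    # A single AND gate per bit: the output is 1 only when both inputs are 1,
    # which happens with probability a * b for independent streams.
    product_bits = [x & y for x, y in zip(stream_a, stream_b)]
    return sum(product_bits) / n


if __name__ == "__main__":
    print(stochastic_multiply(0.5, 0.4))  # an estimate close to 0.2
```

The estimate is statistical: longer streams give a more accurate product at the cost of more serial processing steps, which is the basic trade-off of the approach.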
The integration of a variety of IP cores into a single chip to meet the high demands of new applications leads to many timing challenges, especially at the interfaces between different clock domains. The Globally Asynchronous, Locally Synchronous (GALS) approach addresses these challenges by dividing a chip into several independent subsystems working with different clock signals. In multi-synchronous...