In the near future, embedded systems containing hundreds of processing elements running multiple concurrent applications will become a reality. The heterogeneous multicluster architecture makes it possible to cope with the challenging hardware/software requirements presented by such systems. This paper shows principles and optimization of multicluster dimensioning aiming at an appropriate distribution of applications...
The Protein Processor Associative Memory (PPAM) is a novel hardware architecture for a distributed, decentralised bidirectional, hetero-associative memory, that can adapt online to changes in the training data. The PPAM is fundamentally different from traditional processing methods that tend to use arithmetic operations to perform computation. In this paper, we evaluate the fault tolerant properties...
In this paper we present the use of a SystemC-based design environment called DESYRE for the simulation of a modern elevator system designed by Otis Elevator Company for large-scale buildings. We describe the construction of the virtual prototype of a scalable elevator system based on the CAN communication protocol. We show the tuning and validation of the simulated model against a test system composed...
As the size and complexity of embedded systems are growing, the area cost and performance of the LSI circuits are becoming more crucial. A critical bottleneck for them is interconnections such as multiplexers (MUXs). Thus, a hardware synthesis technique for reducing MUXs, especially during the earlier design phase, has been demanded. This paper presents a novel MUX reduction technique in high-level...
Minimizing circuit AC delay variations while maintaining power/performance is key for achieving high yielding products. The present work discusses an analytic model based approach for aligning the fundamental-FET electrical control and circuit-speed variability applied towards product screening. Such a model is proven to be effective in a manufacturing environment for predicting delay variation, and...
We propose a 2K/4K/8K-point FFT (Fast Fourier Transform) for OFDM (Orthogonal Frequency Division Multiplexing) of DVB-H (Digital Video Broadcast Terrestrial) Receiver. The proposed FFT architecture utilizes a cascaded radix-4 single path feedback (SDF) structure based on the Radix-2/Radix-4 FFT algorithm [11]. We use a block floating point scaling technique in order to increase SQNR. The 2K/8K FFT consists...
This article presents the design, implementation and performance evaluation of a hardware accelerator for matrix multiplication. The accelerator is loosely coupled with the host computer via a common system bus. The accelerator is composed of a linear processor array (LPA), distributed memory and a dedicated address generator unit. A mathematical procedure for LPA synthesis is given. The speedup of the proposed...
Manufacturing process variations (PV) of transistors in the deep-submicron regime present the single biggest design challenge for large die size VLSI circuits such as processor arrays, GPUs, and FPGAs. However, there are a few applications in signal processing, such as image processing and speech processing, where errors in computation by the underlying hardware could be tolerated or corrected off-line...
Implementing integer division in hardware is expensive when compared to multiplication. In the case where the divisor is a constant, expensive integer division algorithms can be replaced by cheaper integer multiplications and additions. This paper presents the conditions for multiply-add schemes to perform correctly rounded unsigned invariant integer division under one of three rounding modes. We...
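The idea of replacing division by a constant with a multiplication can be illustrated with a minimal sketch. It assumes the classic round-toward-zero construction (multiply by a precomputed constant, then shift right) with no additive term; it is not the set of conditions derived in the paper, and the names `magic_for_divisor` and `div_by_const` are hypothetical.

```python
def magic_for_divisor(d: int, n_bits: int) -> tuple[int, int]:
    """Return (m, k) such that (x * m) >> k == x // d for all 0 <= x < 2**n_bits.

    Assumed textbook construction: k = n_bits + ceil(log2(d)), m = ceil(2**k / d).
    """
    if d < 1:
        raise ValueError("divisor must be a positive integer")
    l = (d - 1).bit_length()      # ceil(log2(d)) for d >= 1
    k = n_bits + l
    m = -(-(1 << k) // d)         # ceiling division: ceil(2**k / d)
    return m, k


def div_by_const(x: int, m: int, k: int) -> int:
    """Unsigned division by the constant encoded in (m, k): one multiply, one shift."""
    return (x * m) >> k


if __name__ == "__main__":
    m, k = magic_for_divisor(3, 8)
    print(m, k, div_by_const(255, m, k))
```

In hardware the shift is free wiring, so the cost is dominated by the single constant multiplication; the multiplier constant needs up to `n_bits + 1` bits in this construction.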
In this paper, we propose an enhanced eight-parallel 128/256-point mixed-radix multi-path delay commutator (MRMDC) FFT/IFFT processor for high-speed orthogonal frequency-division multiplexing (OFDM) systems to reduce the number of complex multipliers. The proposed processor can achieve a high throughput rate by using an eight-parallel data-path scheme and an efficient scheduling scheme of complex...
SIFT (Scale Invariant Feature Transform) generates image features widely used to match objects in different images. Previous work on hardware-based SIFT implementation requires excessive internal memory and hardware logic [1]. In this paper, a new hardware organization is proposed to implement the keypoint detection in SIFT with lower memory and hardware cost than the previous work. To this end,...
Spectrum sensing, i.e. the identification of occupied frequencies within a large bandwidth, requires complex sampling hardware. Measurements suggest that only a small fraction of the available spectrum is actually used at any time and place, which allows a sparse characterization of the frequency domain signal. Compressed sensing (CS) can exploit this sparsity and simplify measurements. We investigate...
Montgomery modular multiplication is widely applied to public key cryptosystems like Rivest-Shamir-Adleman (RSA) and elliptic curve cryptography (ECC). This work presents a word-based Booth encoded radix-4 Montgomery modular multiplication algorithm for low-latency scalable architecture. The data dependency resulting from the inherent right shifting of the intermediate results in the conventional...
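For readers unfamiliar with the underlying operation, the following is a minimal sketch of textbook Montgomery multiplication on whole integers. It is not the word-based, Booth-encoded radix-4 algorithm of the paper; the function names and the choice R = 2^r_bits are assumptions made for illustration.

```python
def montgomery_setup(n: int, r_bits: int) -> int:
    """Return n_prime with n * n_prime == -1 (mod R), where R = 2**r_bits.

    Assumes n is odd and n < R. Uses pow(n, -1, r), available in Python 3.8+.
    """
    if n % 2 == 0 or n >= (1 << r_bits):
        raise ValueError("n must be odd and smaller than R = 2**r_bits")
    r = 1 << r_bits
    return (-pow(n, -1, r)) % r


def redc(t: int, n: int, n_prime: int, r_bits: int) -> int:
    """Montgomery reduction: return t * R^-1 mod n for 0 <= t < R * n."""
    mask = (1 << r_bits) - 1
    m = ((t & mask) * n_prime) & mask   # m = (t mod R) * n_prime mod R
    u = (t + m * n) >> r_bits           # exact division by R (right shift)
    return u - n if u >= n else u


def to_mont(a: int, n: int, r_bits: int) -> int:
    """Map a into the Montgomery domain: a * R mod n."""
    return (a << r_bits) % n


def mont_mul(a_bar: int, b_bar: int, n: int, n_prime: int, r_bits: int) -> int:
    """Multiply two Montgomery-domain values; the result stays in the domain."""
    return redc(a_bar * b_bar, n, n_prime, r_bits)


if __name__ == "__main__":
    n, r_bits = 97, 8
    n_prime = montgomery_setup(n, r_bits)
    a_bar, b_bar = to_mont(45, n, r_bits), to_mont(76, n, r_bits)
    product = redc(mont_mul(a_bar, b_bar, n, n_prime, r_bits), n, n_prime, r_bits)
    print(product == (45 * 76) % n)
```

The right shift in `redc` is the step that creates the data dependency between intermediate results mentioned in the abstract.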
Decimal arithmetic has gained high impact on the overall performance of today's financial and commercial applications. Decimal addition and multiplication are the main decimal operations used in any decimal arithmetic algorithm. Decimal digit adders and decimal digit multipliers are usually the building blocks for higher order decimal adders and multipliers. FPGAs provide an efficient hardware platform...
A typical high-speed decoder implementation for an LDPC code may require hundreds or even thousands of variable and check node processors. Since the check node processing unit (CNPU) is far more complex than the variable node processing unit, the hardware requirements of the CNPU have a big impact on the final decoder complexity. Here, an FPGA implementation of the soft parity check node for min-sum LDPC decoders is analyzed...
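As a software illustration of what a check node computes, here is a minimal sketch of the standard min-sum update rule: each outgoing message takes the product of the signs and the minimum of the magnitudes of all other incoming messages. It assumes unscaled min-sum on floating-point log-likelihood ratios and says nothing about the FPGA design analyzed in the paper; `min_sum_check_node` is a hypothetical name.

```python
def min_sum_check_node(llrs: list[float]) -> list[float]:
    """Return the outgoing check-to-variable messages for one check node.

    Uses the two-minimum trick: only the smallest and second-smallest
    magnitudes are needed, plus the overall sign product.
    Zero inputs are treated as positive.
    """
    if len(llrs) < 2:
        raise ValueError("a check node needs at least two inputs")

    mags = [abs(v) for v in llrs]
    idx_min = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[idx_min]
    min2 = min(m for i, m in enumerate(mags) if i != idx_min)

    # Product of all input signs, as +1 or -1.
    sign_product = 1
    for v in llrs:
        if v < 0:
            sign_product = -sign_product

    out = []
    for i, v in enumerate(llrs):
        # Exclude the edge's own input: use min2 at the minimum's position,
        # and cancel the edge's own sign from the overall product.
        mag = min2 if i == idx_min else min1
        own_sign = -1 if v < 0 else 1
        out.append(sign_product * own_sign * mag)
    return out


if __name__ == "__main__":
    print(min_sum_check_node([2.0, -3.0, 0.5, -1.0]))
```

Tracking only two minima and one sign product is what keeps the per-node cost linear in the number of inputs.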
Mobile communication needs battery energy. There is always a trade-off between power consumption and bit-error performance. To investigate this, power modeling of the hardware components is necessary. In this paper, a baseband energy model for VLSI design optimization considering both dynamic and static energy consumption for short-distance wireless applications is introduced. Due to the trade-off...
Rate adaptation is a family of technologies driven by the expectation that large energy savings can be achieved in packet networks by dynamically adjusting the capacity of network components to the load that they are required to sustain. In this paper we focus on packet-timescale rate adaptation (PTRA) techniques, which apply to individual traffic processing chips in the circuit packs of network systems...
In this paper, we have designed a Multi-Frequency Scaling scheme for energy conservation of network devices, especially routers and switches. The frequency of components in a network device is scaled dynamically according to the real time workload. A Markov model is developed for performance analysis of this mechanism. We implement a prototype of this scheme in the data path of a general IPv4 router...
The wormhole attack is a severe attack in Wireless Mesh Networks (WMNs). It involves two or more wormhole endpoints colluding to capture traffic from one place in the network and replay it to another faraway place through a secret tunnel, so as to distort network routing. It may lead to even more serious threats such as packet dropping and denial of service (DoS). Although a lot of works have been...
Recent proposals for determinism-enforcement architectures are able to honor the dependences between threads through a commit step that often becomes a performance bottleneck. As they commit code blocks (or chunks) in a round-robin order, if one chunk gets squashed due to a conflict, its successors also observe a stall. We call this effect transitive squash delay. This paper proposes a novel, high-performance...