Search results

Items from 1 to 20 out of 206 results

article

Architecture and Synthesis for Area-Efficient Pipelining of Irregular Loop Nests

Gai Liu, Mingxing Tan, Steve Dai, Ritchie Zhao, et al.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and... > 2017 > 36 > 11 > 1817 - 1830

Modern high-level synthesis (HLS) tools commonly employ pipelining to achieve efficient loop acceleration by overlapping the execution of successive loop iterations. While existing HLS pipelining techniques obtain good performance with low complexity for regular loop nests, they provide inadequate support for effectively synthesizing irregular loop nests. For loop nests with dynamic-bound inner loops,...

chapter

Memory compact high-speed QC-LDPC decoder

Tianjiao Xie, Bo Li, Mao Yang, Zhongjiang Yan

2017 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC) > 1 - 5

In this paper, compact memory strategies for a partially parallel Quasi-cyclic LDPC (QC-LDPC) decoder architecture are proposed. Compacting the hard decisions and extrinsic messages of several adjacent rows into one memory entry not only reduces the number of memory banks for hard decisions but also facilitates multiple data accesses per clock cycle, increasing the throughput of the decoder. We...

chapter

A generic high throughput architecture for stream processing

Christes Rousopoulos, Ektoras Karandeinos, Grigorios Chrysos, Apostolos Dollas, et al.

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 5

Stream join is a fundamental and computationally expensive data mining operation for relating information from different data streams. This paper presents two FPGA-based architectures that accelerate stream join processing. The proposed hardware-based systems were implemented on a multi-FPGA hybrid system with high memory bandwidth. The experimental evaluation shows that our proposed systems can outperform...

chapter

Hardware-oriented turbo-product codes decoder architecture

Yaroslav Krainyk, Vladyslav Perov, Maksym Musiyenko, Yevhen Davydenko

2017 9th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS) > 1 > 151 - 154

A model of a Turbo-Product Codes decoder architecture and a method for constructing such a decoder are proposed in the paper. The model describes decoder functioning, taking into account the limitations of the hardware platform, and proposes re-use of components in the decoding process. The method provides a set of steps for decoder implementation. Field-Programmable Gate Arrays circuits are selected...

chapter

Proposition and evaluation of a real-time generic architecture for a laser stripe detection system on FPGA

Seher Colak, Emmanuel Dumas, Virginie Fresse, Olivier Alata

2017 Conference on Design and Architectures for Signal and Image Processing (DASIP) > 1 - 6

Laser triangulation applications are commonly used for industrial quality control. Such algorithms require real-time systems, often made of a computing unit placed close to the image sensor and connected through a short and fast link. Choosing a camera with an integrated Field Programmable Gate Array (FPGA) as the computing unit can provide highly pipelined and parallel computing suited to processing images in real time. Moreover,...

chapter

FPGA-based frequent items counting using matrix of equality comparators

Trong-Thuc Hoang, Xuan-Thuan Nguyen, Hong-Thu Nguyen, Nhu-Quynh Truong, et al.

2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS) > 285 - 288

In this paper, an FPGA-based implementation of Frequent Items Counting is proposed. The architecture deploys an equality comparator matrix that compares the input items with one another to count them within a single operating clock cycle. The proposed architecture is applied to the case of 8-bit items, that is, 256 different types of items in total. The system is built and verified on the...

chapter

A reconfigurable high-speed spiral FIR filter architecture

Shalina Percy Delicia Figuli, Peter Figuli, Jurgen Becker

2017 40th International Conference on Telecommunications and Signal Processing (TSP) > 532 - 537

The need for efficient Finite Impulse Response (FIR) filters in high-speed applications targets Field Programmable Gate Arrays (FPGAs) as an effective and flexible platform for digital implementation. Although FIR filters offer advantages like a linear phase characteristic, no feedback loops and good system stability, their convolution nature poses a challenge in parallelization due to data dependency...

chapter

Maximizing the throughput of threshold-protected AES-GCM implementations on FPGA

Jo Vliegen, Oscar Reparaz, Nele Mentens

2017 IEEE 2nd International Verification and Security Workshop (IVSW) > 140 - 145

In this paper, we push the limits in maximizing the throughput of side-channel-protected AES-GCM implementations on an FPGA. We present a fully unrolled and pipelined architecture that uses a Boolean masking countermeasure (specifically, threshold implementation) for first-order DPA resistance. Using a high-end Virtex-7 device, we obtain a throughput of 15.24 Gbit/s. Since masked implementations require...

chapter

Hardware design and analysis of efficient loop coarsening and border handling for image processing

M. Akif Ozkan, Oliver Reiche, Frank Hannig, Jurgen Teich

2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP) > 155 - 163

Field Programmable Gate Arrays (FPGAs) excel at the implementation of local operators in terms of throughput per energy since the off-chip communication can be reduced with an application-specific on-chip memory configuration. Furthermore, data-level parallelism can efficiently be exploited through so-called loop coarsening, which processes multiple horizontal pixels simultaneously. Moreover, existing...

chapter

FPGA systolic array GZIP compressor

Ovidiu Plugariu, Alexandru Dumitru Gegiu, Lucian Petrica

2017 9th International Conference on Electronics, Computers and Artificial Intelligence (ECAI) > 1 - 6

In this paper we present a complete, open-source GZIP compressor implementation for FPGA based on a systolic array architecture. GZIP is one of the most utilized compression algorithms. Besides the usual use-case of compression for data storage, distributed computing systems such as Hadoop utilize compression to reduce the amount of data which is transferred between computing nodes in a cluster. However,...

chapter

Highly parallel bitmap-based regular expression matching for text analytics

Xuan-Thuan Nguyen, Hong-Thu Nguyen, Katsumi Inoue, Osamu Shimojo, et al.

2017 IEEE International Symposium on Circuits and Systems (ISCAS) > 1 - 4

Text analytics has become increasingly important in the past few years because of the substantial growth in research, business, and government needs. An efficient text analytics system is likely to require high-powered regular expression matching (REGEX), as REGEX operations dominate the whole execution time. Some approaches have exploited the parallelism of graphics processing units...

chapter

Low-latency hardware architecture for cipher-based message authentication code

Imed Ben Dhaou, Tuan Nguyen Gia, Pasi Liljeberg, Hannu Tenhunen

2017 IEEE International Symposium on Circuits and Systems (ISCAS) > 1 - 4

Cipher-based message authentication code (CMAC) is a NIST-approved standard for checking message integrity and authentication. This work presents a low-latency AES architecture for CMAC. The architecture uses intensive parallel processing per round and takes advantage of the BRAM present in modern FPGAs. Experimental results show that for a typical IoT application, the proposed architecture has a latency...

chapter

A Scalable FPGA-Based Accelerator for High-Throughput MCMC Algorithms

Morteza Hosseini, Rashidul Islam, Amey Kulkarni, Tinoosh Mohsenin

2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) > 201

Markov Chain Monte Carlo (MCMC) algorithms are used to obtain samples from any target probability distribution and are widely used in stochastic processing techniques. Stochastic processing techniques such as machine learning and image processing need to compute large amounts of data in real-time, thus high throughput MCMC samplers are of utmost importance. Parallel Tempering (PT) MCMC has proven...

chapter

Hierarchical temporal memory implementation on FPGA using LFSR based spatial pooler address space generator

Madis Kerner, Kalle Tammemae

2017 IEEE 20th International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS) > 92 - 95

Hierarchical temporal memory (HTM) is a model of neocortex functionality developed by Numenta, Inc. The implementation covers only a subset of the actual neocortex layers' functionality but is nonetheless sufficient to be useful in different domains, e.g. for novelty or anomaly detection. Numenta provides their implementation of the HTM for commercial or research purposes as...

chapter

Multiple parallel branch with folding architecture for multichannel filtered-x least mean square algorithm

Dongyuan Shi, Jianjun He, Chuang Shi, Tatsuya Murao, et al.

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 1188 - 1192

Multichannel active noise control (MCANC) systems are commonly used for acoustic noise or vibration control in settings such as large-dimension ventilation ducts, open windows and mechanical structures. However, their computational load far exceeds the capabilities of digital signal processors (DSPs) and microcontrollers. Even the field programmable gate array (FPGA) cannot straightforwardly cope with the exponential...

chapter

An advanced embedded architecture for connected component analysis in industrial applications

Menbere Tekleyohannes, Mohammadsadegh Sadri, Christian Weis, Norbert Wehn, et al.

Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017 > 734 - 735

In recent years, connected component analysis (CCA) has become one of the vital image/video processing algorithms due to its wide-range applicability in the field of computer vision. Numerous applications such as pattern recognition, object detection and image segmentation involve connected component analysis. In the context of camera-based inspection systems, CCA plays an important role for quality...

chapter

Comparison of parallel image scanning methods for achieving better throughput

Mohammad Rafi, Najeeb-ud-Din

2017 4th International Conference on Signal Processing and Integrated Networks (SPIN) > 100 - 103

Higher throughput is always desired in real time image processing applications. There are many ways to achieve higher throughput. However, if we have additional resources and memory bandwidth available, parallelism can be applied to achieve it. In this work, we have presented two image scanning methods that carry out parallelism to double the throughput of any architecture. Partitioned image scanning...

chapter

A heterogeneous SDR MPSoC in 28nm CMOS for low-latency wireless applications

Sebastian Haas, Tobias Seifert, Benedikt Nothen, Stefan Scholze, et al.

2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC) > 1 - 6

Current and future applications impose high demands on software-defined radio (SDR) platforms in terms of latency, reliability, and flexibility. This paper presents a heterogeneous SDR MPSoC with a hexagonal network-on-chip to address these issues. It features four data processing modules and a baseband processing engine for iterative multiple-input multiple-output (MIMO) receiving. Integrated memory...

chapter

A novel multiprocessor architecture for k-means clustering algorithm based on network-on-chip

Sajid Gul Khawaja, Muhammad Usman Akram, Shoab A. Khan, Ammar Ajmal

2016 19th International Multi-Topic Conference (INMIC) > 1 - 5

k-means clustering is one of the most widely used algorithms in the data mining and machine learning domains due to its simplicity, efficiency and scalability. The algorithm allocates N data points or samples to k clusters based on the minimum distances from the respective cluster centroids. Distance calculation is intrinsically a computationally intensive task which is usually accelerated by using...

chapter

LOS Throughput Measurements in Real-Time with a 128-Antenna Massive MIMO Testbed

Paul Harris, Siming Zhang, Mark Beach, Evangelos Mellios, et al.

2016 IEEE Global Communications Conference (GLOBECOM) > 1 - 7

This paper presents initial results for a novel 128-antenna massive Multiple-Input, Multiple-Output (MIMO) testbed developed through Bristol Is Open in collaboration with National Instruments and Lund University. We believe that the results presented here validate the adoption of massive MIMO as a key enabling technology for 5G and pave the way for further pragmatic research by the massive MIMO community...

Data set: ieee
Keywords: THROUGHPUT, FIELD PROGRAMMABLE GATE ARRAYS, COMPUTER ARCHITECTURE

Content availability

Available (204)
None (2)

Publication type

book (186)
article (20)

Keywords

HARDWARE (84)
FPGA (67)
CLOCKS (47)
ALGORITHM DESIGN AND ANALYSIS (30)
PIPELINES (25)
CRYPTOGRAPHY (24)
RANDOM ACCESS MEMORY (22)
REGISTERS (21)
ENCRYPTION (19)
DECODING (17)
PIPELINE PROCESSING (17)
DELAY (14)
DIGITAL SIGNAL PROCESSING (13)
PARALLEL PROCESSING (13)
PROGRAM PROCESSORS (13)
MIMO (12)
PARALLEL ARCHITECTURES (12)
ENCODING (11)
FIELD PROGRAMMABLE GATE ARRAY (11)
RECONFIGURABLE ARCHITECTURES (10)
APPLICATION SPECIFIC INTEGRATED CIRCUITS (9)
FPGA IMPLEMENTATION (9)
LOGIC GATES (9)
OPTIMIZATION (9)
STANDARDS (9)
VLSI (9)
ADVANCED ENCRYPTION STANDARD (8)
GENERATORS (8)
KERNEL (8)
LOGIC DESIGN (8)
POLYNOMIALS (8)
SECURITY (8)
TRANSFORMS (8)
ASIC (7)
DISCRETE FOURIER TRANSFORMS (7)
FAST FOURIER TRANSFORMS (7)
FIELD PROGRAMMABLE GATE ARRAY (FPGA) (7)
PARITY CHECK CODES (7)
POWER DEMAND (7)
PROTOCOLS (7)
SIGNAL PROCESSING ALGORITHMS (7)
ADDERS (6)
CIPHERS (6)
ENGINES (6)
FFT (6)
IMAGE CODING (6)
MATRIX DECOMPOSITION (6)
MICROPROCESSOR CHIPS (6)
RADIATION DETECTORS (6)
SHA-3 (6)
SIGNAL PROCESSING (6)
SOFTWARE (6)
ACCELERATION (5)
ADVANCED ENCRYPTION STANDARD (AES) (5)
AES (5)
BENCHMARK TESTING (5)
BLOCK CODES (5)
COMPLEXITY THEORY (5)
HARDWARE DESCRIPTION LANGUAGES (5)
IP NETWORKS (5)
MICROPROCESSORS (5)
NIST (5)
PAYLOADS (5)
PIPELINED ARCHITECTURE (5)
SIZE 65 NM (5)
TRANSFORM CODING (5)
VIDEO CODING (5)
BANDWIDTH (4)
COMPUTATIONAL COMPLEXITY (4)
COMPUTERS (4)
DISCRETE WAVELET TRANSFORMS (4)
FAST FOURIER TRANSFORM (4)
FPGA DEVICES (4)
HARDWARE DESIGN LANGUAGES (4)
IMAGE PROCESSING (4)
INDEXES (4)
JPEG2000 (4)
LDPC (4)
MAGNETIC CORES (4)
MIMO COMMUNICATION (4)
MULTIPLEXING (4)
NETWORK-ON-CHIP (4)
PACKET CLASSIFICATION (4)
PIPELINING (4)
QUANTIZATION (4)
REAL TIME SYSTEMS (4)
SHIFT REGISTERS (4)
SOFTWARE ALGORITHMS (4)
SWITCHES (4)
VHDL (4)
VLSI ARCHITECTURE (4)
ARITHMETIC CODES (3)
ARRAYS (3)
AWGN (3)
BLAKE (3)
CLASSIFICATION ALGORITHMS (3)
CMOS INTEGRATED CIRCUITS (3)

INFONA - science communication portal
