Search results

Items from 1 to 20 out of 691 results

chapter

Maximizing CNN accelerator efficiency through resource partitioning

Yongming Shen, Michael Ferdman, Peter Milder

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) > 535 - 547

Convolutional neural networks (CNNs) are revolutionizing machine learning, but they present significant computational challenges. Recently, many FPGA-based accelerators have been proposed to improve the performance and efficiency of CNNs. Current approaches construct a single processor that computes the CNN layers one at a time; the processor is optimized to maximize the throughput at which the collection...

chapter

Architecting a multi-GHz real-time RF streaming system

Neil Feiereisel, Shivansh Chaudhary

2017 47th European Microwave Conference (EuMC) > 751 - 754

Advanced cellular and wireless standards are rapidly expanding their instantaneous RF bandwidth requirements, and with operating frequencies moving into the mmWave spectrum, channels of 1 GHz and wider become increasingly likely. Furthermore, carrier aggregation and MIMO systems require multiple wideband channels, placing even higher demands on the system. No longer satisfied with short bursts of...

chapter

Memory compact high-speed QC-LDPC decoder

Tianjiao Xie, Bo Li, Mao Yang, Zhongjiang Yan

2017 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC) > 1 - 5

In this paper, compact memory strategies for a partially parallel quasi-cyclic LDPC (QC-LDPC) decoder architecture are proposed. Compacting the hard decisions and extrinsic messages of several adjacent rows into one memory entry not only reduces the number of memory banks for hard decisions but also facilitates multiple data accesses per clock cycle, which increases the throughput of the decoder. We...

chapter

High-performance implementation of an HMAC processor based on SHA-3 Hash function

Junhui Li, Liji Wu, Xiangmin Zhang

2017 International Conference on Electron Devices and Solid-State Circuits (EDSSC) > 1 - 2

The Keyed-Hash Message Authentication Code (HMAC) is a useful mechanism for message authentication. In this paper, a high-performance HMAC/SHA-3 processor that can generate both HMAC message digests and hash message digests is presented. It can generate not only the standard message digest lengths (224, 256, 384, 512) but also a 64-bit message digest. Due to the application of new generation...

chapter

High throughput AES encryption/decryption with efficient reordering and merging techniques

Lijuan Li, Shuguo Li

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

This paper proposes a high-throughput architecture for AES encryption/decryption targeting recent FPGAs with 6-input LUTs. Unlike previous works, which share multiplicative inverse logic to realize SubBytes and InvSubBytes, the proposed architecture directly employs a look-up-table-based S-box for both SubBytes and InvSubBytes. Efficient reordering and merging techniques are applied to achieve...

chapter

Mapping of P4 match action tables to FPGA

Michal Kekely, Jan Korenek

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 2

Current networks are changing very fast. Network administrators need more flexible and powerful tools to support new protocols or services quickly. The P4 language provides a new level of abstraction for flexible packet processing. Therefore, we have designed a new architecture for memory-efficient mapping of P4 match/action tables to FPGA. The architecture is based on the DCFL algorithm and...

chapter

Learning-based interconnect-aware dataflow accelerator optimization

Shuangnan Liu, Benjamin Carrion Schafer

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 7

The interconnect is the Achilles heel of FPGAs. It currently dominates the delay and leads to high power consumption. It is thus imperative to take it into account when designing complex FPGA systems. In this work, we propose a learning-based method for data-flow systems built out of multiple individual components directly connected and find a set of optimal configurations with unique area vs. throughput...

chapter

A security library for FPGA interlays

Anuj Vaishnav, Jose Raul Garcia Ordaz, Dirk Koch

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

Many CPU design houses, including Intel, IBM, and ARM, have added dedicated support for cryptography in recent processor generations. While adding accelerators and/or dedicated instructions boosts performance on cryptography, we are investigating a different approach that does not add extra silicon area: we study replacing the hardened NEON SIMD unit of an ARM Cortex-A9 with an identically sized FPGA...

chapter

A generic high throughput architecture for stream processing

Christes Rousopoulos, Ektoras Karandeinos, Grigorios Chrysos, Apostolos Dollas, more

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 5

Stream join is a fundamental and computationally expensive data mining operation for relating information from different data streams. This paper presents two FPGA-based architectures that accelerate stream join processing. The proposed hardware-based systems were implemented on a multi-FPGA hybrid system with high memory bandwidth. The experimental evaluation shows that our proposed systems can outperform...

chapter

Scalable high-performance architecture for convolutional ternary neural networks on FPGA

Adrien Prost-Boucle, Alban Bourge, Frederic Petrot, Hande Alemdar, more

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 7

Thanks to their excellent performance on typical artificial intelligence problems, deep neural networks have drawn a lot of interest lately. However, this comes at the cost of large computational needs and high power consumption. Benefiting from high precision at acceptable hardware cost on these difficult problems is a challenge. To address it, we advocate the use of ternary neural networks (TNN)...

chapter

A chaotic time-delay system based digital RNG and integrated autonomous test suite

Ramazan Yeniceri, Alptekin Vardar, Erdem Cil, Latif Akcay, more

2017 European Conference on Circuit Theory and Design (ECCTD) > 1 - 4

This paper presents a time-delay system that originally has chaotic behavior yet loses that dynamic due to the finite quantization levels of its state variable representation. One method to overcome this destructive effect of digitization, studied in this paper, is to use a time-varying delay. Based on this system, random number generator (RNG) topologies are demonstrated with better throughput...

chapter

Hardware-oriented turbo-product codes decoder architecture

Yaroslav Krainyk, Vladyslav Perov, Maksym Musiyenko, Yevhen Davydenko

2017 9th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS) > 1 > 151 - 154

A model of a Turbo-Product Codes decoder architecture and a method for constructing such a decoder are proposed in the paper. The model describes decoder operation, taking into account the limitations of the hardware platform, and proposes re-use of components in the decoding process. The method provides a set of steps for decoder implementation. Field-Programmable Gate Array circuits are selected...

chapter

Robust throughput boosting for low latency dynamic partial reconfiguration

A. Nannarelli, M. Re, G. C. Cardarilli, L. Di Nunzio, more

2017 30th IEEE International System-on-Chip Conference (SOCC) > 86 - 90

Reducing the configuration time of portions of an FPGA at run time is crucial in contemporary FPGA-based accelerators. In this work, we propose a method to increase the throughput for FPGA dynamic partial reconfiguration by using standard IP blocks. The throughput is increased by over-clocking the configuration bitstream circuitry beyond the limits stated in the specifications of these standard blocks...

chapter

Proposition and evaluation of a real-time generic architecture for a laser stripe detection system on FPGA

Seher Colak, Emmanuel Dumas, Virginie Fresse, Olivier Alata

2017 Conference on Design and Architectures for Signal and Image Processing (DASIP) > 1 - 6

Laser triangulation applications are commonly used for industrial quality control. Such algorithms require real-time systems, often consisting of a computing unit connected to the image sensor through a short, fast link. Choosing a camera with an integrated Field Programmable Gate Array (FPGA) as the computing unit can provide highly pipelined and parallel computing suited to processing images in real time. Moreover,...

chapter

High-throughput FPGA implementation of the CCSDS 122.0-B-1 compression standard

Nikolaos Kefalas, George Theodoridis

2017 27th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS) > 1 - 8

A high-throughput architecture for the CCSDS 122.0-B-1 image compression standard is proposed. The architecture uses a novel memory organization to reduce the total number of memory operations and the number of individual memories, allowing operation without external memories. The architecture has been implemented on space-grade and commercial FPGA devices. It achieves 136 MSamples/sec on space grade...

chapter

High level synthesis using vivado HLS for optimizations of SHA-3

H S. Jacinto, Luka Daoud, Nader Rafla

2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS) > 563 - 566

Hash functions represent a fundamental building block of many network security protocols. The SHA-3 hashing algorithm is the most recently developed hash function, and the most secure. Implementing the SHA-3 hashing algorithm in a Hardware Description Language (HDL) is time-consuming and tedious to debug. On the other hand, High-Level Synthesis (HLS) tools offer potential solutions to the hardware...

chapter

Packet Classification with Limited Memory Resources

Michal Kekely, Jan Korenek

2017 Euromicro Conference on Digital System Design (DSD) > 179 - 183

Network security and monitoring devices use packet classification to match packet header fields against a set of rules. Many hardware architectures have been designed to accelerate packet classification and achieve wire-speed throughput for 100 Gbps networks. The architectures are designed for high throughput even for the shortest packets. However, FPGA SoC and Intel Xeon with FPGA have limited resources...

chapter

FPGA-based frequent items counting using matrix of equality comparators

Trong-Thuc Hoang, Xuan-Thuan Nguyen, Hong-Thu Nguyen, Nhu-Quynh Truong, more

2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS) > 285 - 288

In this paper, an FPGA-based implementation of Frequent Items Counting is proposed. The architecture deploys an equality comparator matrix that compares the input items with one another to count them instantly within a single clock cycle. The proposed architecture is applied to the case of 8-bit items, that is, 256 different types of items in total. The system is built and verified on the...

chapter

A high-performance FPGA-based LDPC decoder for solid-state drives

Yanhuan Liu, Chun Zhang, Pengcheng Song, Hanjun Jiang

2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS) > 1232 - 1235

To improve the throughput of error correction decoding for high-performance solid-state drives (SSDs), a semi-parallel low-density parity-check (LDPC) decoding architecture is proposed in this paper. The circuit of the LDPC decoder, which can be dynamically configured with bit rate and code length, is implemented using the scheduling control flow mode of single instruction multiple data...

chapter

A Scalable Parameterized NoC Emulator Built Upon Xilinx Virtex-7 FPGA

Ming Zhu, Yingtao Jiang, Mei Yang, Louie De Luna

2017 25th International Conference on Systems Engineering (ICSEng) > 287 - 290

A number of critical design decisions, such as network topology, buffer sizes, flow control mechanism, and so forth, have to be evaluated in any NoC design. Designs and verifications of NoCs are based on either software simulations, which are extremely slow and inaccurate for complex models, or hardware emulations using low/mid-class FPGAs, where the scalability of the NoC system is intensively...

Keywords:
FIELD PROGRAMMABLE GATE ARRAYS
THROUGHPUT

Content availability

Available (682)
None (9)

Keywords

HARDWARE (277)
FPGA (253)
COMPUTER ARCHITECTURE (186)
CLOCKS (140)
ALGORITHM DESIGN AND ANALYSIS (108)
RANDOM ACCESS MEMORY (101)
CRYPTOGRAPHY (86)
PIPELINES (71)
REGISTERS (68)
PIPELINE PROCESSING (65)
TABLE LOOKUP (65)
DECODING (55)
ENCRYPTION (55)
MEMORY MANAGEMENT (49)
PARALLEL PROCESSING (40)
SOFTWARE (39)
DIGITAL SIGNAL PROCESSING (36)
IP NETWORKS (35)
PROTOCOLS (35)
DELAY (34)
MIMO (32)
BANDWIDTH (31)
FIELD PROGRAMMABLE GATE ARRAY (31)
LOGIC DESIGN (30)
OPTIMIZATION (30)
ENGINES (29)
ADDERS (28)
ENCODING (28)
KERNEL (28)
RECONFIGURABLE ARCHITECTURES (28)
COMPLEXITY THEORY (26)
PARALLEL ARCHITECTURES (26)
POWER DEMAND (26)
PROGRAM PROCESSORS (26)
DATA MINING (25)
FPGA IMPLEMENTATION (25)
GENERATORS (24)
SIGNAL PROCESSING ALGORITHMS (24)
APPLICATION SPECIFIC INTEGRATED CIRCUITS (23)
PARITY CHECK CODES (23)
STANDARDS (23)
ARRAYS (22)
LOGIC GATES (22)
SECURITY (22)
ROUTING (21)
NIST (19)
SYSTEM-ON-CHIP (19)
AES (18)
MATHEMATICAL MODEL (18)
SWITCHES (18)
SYNCHRONIZATION (18)
VLSI (18)
DETECTORS (17)
MICROPROCESSOR CHIPS (17)
RESOURCE MANAGEMENT (17)
SHA-3 (17)
CIPHERS (16)
REAL TIME SYSTEMS (16)
EQUATIONS (15)
MIMO COMMUNICATION (15)
PERFORMANCE EVALUATION (15)
POLYNOMIALS (15)
TELECOMMUNICATION NETWORK ROUTING (15)
ACCELERATION (14)
INTERNET (14)
NETWORK-ON-CHIP (14)
PATTERN MATCHING (14)
RADIATION DETECTORS (14)
SYSTEM-ON-A-CHIP (14)
ADVANCED ENCRYPTION STANDARD (13)
COMPUTATIONAL MODELING (13)
DELAYS (13)
HARDWARE DESCRIPTION LANGUAGES (13)
HARDWARE IMPLEMENTATION (13)
MULTIPLEXING (13)
POWER CONSUMPTION (13)
WIRELESS COMMUNICATION (13)
FFT (12)
INDEXES (12)
INTEGRATED CIRCUIT DESIGN (12)
PACKET CLASSIFICATION (12)
PIPELINING (12)
REAL-TIME SYSTEMS (12)
SHIFT REGISTERS (12)
VHDL (12)
BENCHMARK TESTING (11)
MAGNETIC CORES (11)
PIPELINE (11)
SIGNAL PROCESSING (11)
SRAM CHIPS (11)
ASIC (10)
BIT ERROR RATE (10)
DYNAMIC PARTIAL RECONFIGURATION (10)
ETHERNET NETWORKS (10)
FPGAS (10)
HASH FUNCTION (10)
IMAGE CODING (10)
ITERATIVE DECODING (10)

INFONA - science communication portal
