Search results

Items from 1 to 20 out of 44 results

article

Computer Generation of High Throughput and Memory Efficient Sorting Designs on FPGA

Ren Chen, Viktor K. Prasanna

IEEE Transactions on Parallel and Distributed Systems > 2017 > 28 > 11 > 3100 - 3113

Accelerating sorting using dedicated hardware to fully utilize the memory bandwidth for Big Data applications has gained much interest in the research community. Recently, parallel sorting networks have been widely employed in hardware implementations due to their high data parallelism and low control overhead. In this paper, we propose a systematic methodology for mapping large-scale bitonic sorting...

chapter

A generic high throughput architecture for stream processing

Christes Rousopoulos, Ektoras Karandeinos, Grigorios Chrysos, Apostolos Dollas, et al.

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 5

Stream join is a fundamental and computationally expensive data mining operation for relating information from different data streams. This paper presents two FPGA-based architectures that accelerate stream join processing. The proposed hardware-based systems were implemented on a multi-FPGA hybrid system with high memory bandwidth. The experimental evaluation shows that our proposed systems can outperform...

chapter

Scalable high-performance architecture for convolutional ternary neural networks on FPGA

Adrien Prost-Boucle, Alban Bourge, Frederic Petrot, Hande Alemdar, et al.

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 7

Thanks to their excellent performances on typical artificial intelligence problems, deep neural networks have drawn a lot of interest lately. However, this comes at the cost of large computational needs and high power consumption. Benefiting from high precision at acceptable hardware cost on these difficult problems is a challenge. To address it, we advocate the use of ternary neural networks (TNN)...

chapter

OpenCL-based design pattern for line rate packet processing

Jehandad Khan, Peter Athanas, Skip Booth, John Marshall

2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP) > 190 - 194

The ever changing nature of network technology requires a flexible platform that can change as the technology evolves. In this work, a complete networking switch designed in OpenCL is presented, identifying several high-level constructs that form the building blocks of any network application targeting FPGAs. These include the notion of an on-chip global memory and kernels constantly processing data...

chapter

Highly parallel bitmap-based regular expression matching for text analytics

Xuan-Thuan Nguyen, Hong-Thu Nguyen, Katsumi Inoue, Osamu Shimojo, et al.

2017 IEEE International Symposium on Circuits and Systems (ISCAS) > 1 - 4

Text analytics has become increasingly important in the past few years because of the substantial growth in the amount of research, business, and government needs. An efficient text analytics system is likely to require high-powered regular expression matching (REGEX), as REGEX operations dominate the whole execution time. Some approaches have exploited the parallelism of graphic processing units...

chapter

P5: Programmable Parsers with Packet-level Parallel Processing for FPGA-based Switches

Junnan Li, Zhigang Sun, Biao Han

2017 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS) > 107 - 108

This paper presents P5, a programmable packet parser with packet-level parallel processing for FPGA-based switches. P5 overcomes both limitations. First, P5 has the programmability of dynamically updating parsing algorithms at run-time. Second, P5 exploits packet-level parallelism in the bottleneck of parsing pipeline to compensate FPGA’s low clock frequency, and reduces resource consumption through...

chapter

Comparison of parallel image scanning methods for achieving better throughput

Mohammad Rafi, Najeeb-ud-Din

2017 4th International Conference on Signal Processing and Integrated Networks (SPIN) > 100 - 103

Higher throughput is always desired in real time image processing applications. There are many ways to achieve higher throughput. However, if we have additional resources and memory bandwidth available, parallelism can be applied to achieve it. In this work, we have presented two image scanning methods that carry out parallelism to double the throughput of any architecture. Partitioned image scanning...

chapter

A novel multiprocessor architecture for k-means clustering algorithm based on network-on-chip

Sajid Gul Khawaja, Muhammad Usman Akram, Shoab A. Khan, Ammar Ajmal

2016 19th International Multi-Topic Conference (INMIC) > 1 - 5

The k-means clustering is one of the widely used algorithms in Data Mining and Machine Learning domains due to the simplicity, efficiency and scalability involved. The algorithm allocates N data-points or samples to k-clusters employing the minimum distances from respective cluster centroids. Distance calculation is intrinsically a computationally intensive task which is usually accelerated by using...

chapter

Design and implementation of embedded DAQ using spatial parallelism on FPGA for better throughput

Janice Jia Min, Muataz H. Salih, Zheng Ng, Torry Kho, et al.

2016 3rd International Conference on Electronic Design (ICED) > 275 - 280

Data acquisition (DAQ) is the process of acquiring analog signals from different types of sources and further processing the acquired signals in digital form on a personal computer (PC). Compared to a traditional measurement system, a PC-based DAQ system provides a more flexible and cost-effective measurement solution to the industry and utilizes the efficiency, processing power and connectivity capabilities...

chapter

Multi-GSPS FFTs using FPGAs

Michael Parker, Simon Finn, Hong Shan Neoh

2016 IEEE National Aerospace and Electronics Conference (NAECON) and Ohio Innovation Summit (OIS) > 430 - 436

This paper describes the implementation of high-throughput FFTs on FPGAs, using a modified version of the Radix 2^N architecture. The implementation uses a synthesis method which supports “super-sampling” to provide very high throughput. Special vector structures in the tools and hardware architecture are supported where complex vectors form the input on each clock cycle, and multiple...

chapter

High-speed decompression architecture of compressed HTTP streams for the internet routers

Hironori Okano, Hayato Yamaki, Hiroaki Nishi

2016 International Conference on FPGA Reconfiguration for General-Purpose Computing (FPGA4GPC) > 31 - 36

In recent years, studies of DPI have been carried out actively. HTTP packets, which are a kind of DPI target, include GZIP compressed packets, and multi-streamed GZIP compressed HTTP cannot be analyzed directly on routers. Moreover, wire-rate processing is required to achieve on-router analysis. In this paper, HTTP decompressing architecture on routers supporting 40Gbps network is considered, and...

chapter

A self-aware data compression system on FPGA in Hadoop

Yubin Li, Yuliang Sun, Guohao Dai, Yuzhi Wang, et al.

2015 International Conference on Field Programmable Technology (FPT) > 196 - 199

With the exponential growth of data size, data storage and analysis have been exposed to more challenges due to the lack of disk capacity and the limited network bandwidth. Data compression technique provides a good solution to mitigate these effects. In this paper, we propose a self-aware data compression system on FPGA for typical data warehousing, such as Hive, with column stored data and multi-threading...

chapter

A hybrid design for high performance large-scale sorting on FPGA

Ajitesh Srivastava, Ren Chen, Viktor K. Prasanna, Charalampos Chelmis

2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig) > 1 - 6

Sorting is a key kernel in numerous big data applications, including database operations, graphs and text analytics. Due to low control overhead, parallel bitonic sorting networks are usually employed for hardware implementations to accelerate sorting. Although a typical implementation of merge sort network can lead to low latency and small memory usage, it suffers from low throughput due to the lack...

chapter

Challenges for 100 Gbit/s end to end communication: Increasing throughput through parallel processing

Steffen Buchner, Jorg Nolte, Rolf Kraemer, Lukasz Lopacinski, et al.

2015 IEEE 40th Conference on Local Computer Networks (LCN) > 398 - 401

Today's applications and services are becoming more dependent on fast wireless communication; in the coming years, data-rate demands of 100Gbit/s can easily be expected. However, fulfilling that demand is a task which cannot simply be solved by upscaling existing technologies. While most of the research tackles the challenges regarding the transmission technology from the physical layer up to base-band...

chapter

Effectiveness of matrix and pipeline FPGA-based arithmetic components of safety-related systems

Julia Drozd, Oleksandr Drozd, Svetlana Antoshchuk, Alex Kushnerov, et al.

2015 IEEE 8th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS) > 2 > 785 - 789

The paper is devoted to design of the digital components for safety-related instrumentation and control systems using the modern CAD tools. Traditionally, the digital components are built with matrix parallelism that reduces fault tolerance of circuits and safety of systems in their checkability. Circuits with bitwise pipeline data processing have advantage in checkability, but are considered as less...

chapter

Framework for parameter analysis of FPGA-based image processing architectures

Marc Reichenbach, Benjamin Pfundt, Dietmar Fey

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) > 96 - 102

Image processing algorithms which only work on a local neighbourhood are used in nearly every image processing application. Very often several iterations are performed on a fixed neighbourhood, which leads to the description of stencil codes. A promising approach in embedded systems is to use the massively parallel computation power of an FPGA for this kind of algorithm. This not only speeds up processing...

chapter

NF-Dedupe: A novel no-fingerprint deduplication scheme for flash-based SSDs

Zhengguo Chen, Zhiguang Chen, Nong Xiao, Fang Liu

2015 IEEE Symposium on Computers and Communication (ISCC) > 588 - 594

NAND flash-based Solid State Drives (SSDs) have been widely deployed in data centers of cloud computing due to their high performance compared with hard disks, while the limited lifespan of flash memory makes SSDs not very suitable for write-intensive applications. Deduplication is an effective method used to reduce the write traffic of applications thus can be used to extend the lifespan of SSDs...

chapter

Investigation of suitable DSP architecture for efficient FPGA implementation of FIR filter

Mahesh Kadam, Kishor Sawarkar, Sudhakar Mande

2015 International Conference on Communication, Information & Computing Technology (ICCICT) > 1 - 4

In this paper, we have investigated pipeline and parallel processing architectures of finite impulse response (FIR) filter for efficient field programmable gate array (FPGA) implementation. Our simulation results show that parallel processing architecture is more efficient as compared to pipeline architecture. Further, it is shown that fast FIR architecture is most suitable as compared to conventional...

chapter

High throughput energy efficient parallel FFT architecture on FPGAs

Ren Chen, Neungsoo Park, Viktor K. Prasanna

2013 IEEE High Performance Extreme Computing Conference (HPEC) > 1 - 6

Throughput is a key performance metric for streaming FFT architectures. However, increasing spatial parallelism to improve throughput introduces complex routing, thus resulting in high power consumption. In this paper, we propose a high throughput energy efficient parallel FFT architecture based on Cooley-Tukey algorithm. Multiple pipeline FFT processors using time-multiplexing are utilized to perform...

chapter

High throughput implementations of cryptography algorithms on GPU and FPGA

Vivek Venugopal, Devu Manikantan Shila

2013 IEEE International Instrumentation and Measurement Technology Conference (I2MTC) > 723 - 727

Cryptography algorithms are ranked by their speed in encrypting/decrypting data and their robustness to withstand attacks. Real-time processing of data encryption/decryption is essential in network based applications to keep pace with the input data inhalation rate. The encryption/decryption steps are computationally intensive and exhibit high degree of parallelism. Field programmable gate arrays...

Data set: ieee

Keywords: THROUGHPUT, FIELD PROGRAMMABLE GATE ARRAYS, PARALLEL PROCESSING

Publication type

book (40)
article (4)

INFONA - science communication portal
