Search results

Items from 1 to 20 out of 21 results

chapter

A 142MOPS/mW integrated programmable array accelerator for smart visual processing

Satyajit Das, Davide Rossi, Kevin J. M. Martin, Philippe Coussy, et al.

2017 IEEE International Symposium on Circuits and Systems (ISCAS) > 1 - 4

Due to increasing demand of low power computing, and diminishing returns from technology scaling, industry and academia are turning with renewed interest toward energy-efficient programmable accelerators. This paper proposes an Integrated Programmable-Array accelerator (IPA) architecture based on an innovative execution model, targeted to accelerate both data and control-flow parts of deeply embedded...

chapter

Tessellation-based multi-block memory mapping scheme for high-level synthesis with FPGA

Juan Escobedo, Mingjie Lin

2016 International Conference on Field-Programmable Technology (FPT) > 125 - 132

For many intensive computing tasks, simultaneous data access into multi-dimensional data arrays is highly restricted by its data mapping strategy and memory port constraint. As such, to increase memory accessing bandwidth, innovative memory partitioning and mapping algorithms have been proposed to simultaneously access multiple memory blocks through physically distributing data elements in the same...

chapter

High-Level Designs of Complex FIR Filters on FPGAs for the SKA

Haomiao Wang, Joao Gante, Ming Zhang, Gabriel Falcao, et al.

2016 IEEE 18th International Conference on High Performance Computing and Communications; IEEE 14th International Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems (HPCC/SmartCity/DSS) > 797 - 804

High-end FPGAs are widely adopted as hardware accelerators, due to their power efficiency, flexibility, and high-performance computing ability. They are, therefore, extremely useful devices for a project with challenges and constraints such as the Square Kilometre Array (SKA). However, the traditional design methods require expert hardware knowledge and long development times for each of the SKA's...

chapter

QuickDough: A rapid FPGA loop accelerator design framework using soft CGRA overlay

Cheng Liu, Ho-Cheung Ng, Hayden Kwok-Hay So

2015 International Conference on Field Programmable Technology (FPT) > 56 - 63

The use of FPGAs as compute accelerators has been demonstrated by numerous researchers as an effective solution to meet the performance requirement across many application domains. However, the design productivity of developing FPGA accelerators remains much lower compared to the use of a typical software development flow. Although the use of the high-level design tools may partly alleviate this shortcoming,...

chapter

Exploring pipe implementations using an OpenCL framework for FPGAs

Vincent Mirian, Paul Chow

2015 International Conference on Field Programmable Technology (FPT) > 112 - 119

In the last decade, OpenCL has sparked the interest of the computing world as it is a language based on an open standard that can run on many different heterogeneous platforms. This standard is continuously evolving to adapt to various use cases of different platforms. For example, with requests from the FPGA community, the pipe construct was added to the standard to facilitate the implementation...

chapter

Design of Real-Time Embedded Drive System for Infrared Image Array

Chunling Yang, Xiaoming Qiu, Huajie Zhang

2015 Fifth International Conference on Instrumentation and Measurement, Computer, Communication and Control (IMCCC) > 1796 - 1799

In order to improve the real-time performance and reliability of the drive system for infrared image array, this paper designs an embedded drive system. With MPC8315 as the processing core, this system takes reflective memory network as the transmission unit. In order to verify and analyze the performance of the embedded drive system for the infrared image array, this paper sets up a test platform...

chapter

An FPGA-based systolic array to accelerate the BWA-MEM genomic mapping algorithm

Ernst Joachim Houtgast, Vlad-Mihai Sima, Koen Bertels, Zaid Al-Ars

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) > 221 - 227

We present the first accelerated implementation of BWA-MEM, a popular genome sequence alignment algorithm widely used in next generation sequencing genomics pipelines. The Smith-Waterman-like sequence alignment kernel requires a significant portion of overall execution time. We propose and evaluate a number of FPGA-based systolic array architectures, presenting optimizations generally applicable to...

chapter

Cellular sensor-processor array based visual collision warning sensor

Akos Zarandy, Mate Nemeth, Borbala Pencz, Zoltan Nagy, et al.

2015 IEEE International Symposium on Circuits and Systems (ISCAS) > 1973 - 1976

Autonomous UAVs need on-board vision system to be able to navigate, avoid collisions, and execute missions. Small UAVs can carry small form factor vision system with low power consumption due to natural payload limitations. Therefore it is a natural idea to use cellular sensor-processor arrays to implement the necessary vision functions. In this paper, we present a UAV collision warning algorithm...

chapter

Data-reuse optimizations for pipelined tiling with parametric tile sizes

Alexandre Isoard

2014 23rd International Conference on Parallel Architecture and Compilation (PACT) > 509 - 510

Today's hardware diversity exacerbates the need for optimizing compilers. A problem that arises when exploiting hardware accelerators (FPGA, GPU, dedicated boards) is how to automatically perform kernel/function offloading or outlining (as opposed to function inlining). The principle is to outsource part of the computation (the kernel to be performed on the accelerator) to a more efficient but more...

chapter

Direct virtual memory access from FPGA for high-productivity heterogeneous computing

Ho-Cheung Ng, Yuk-Ming Choi, Hayden Kwok-Hay So

2013 International Conference on Field-Programmable Technology (FPT) > 458 - 461

Heterogeneous computing utilizing both CPU and FPGA requires access to data in the main memory from both devices. While a typical system relies on software executing on the CPU to orchestrate all data movements between the FPGA and the main memory, our demo presents a complementary FPGA-centric approach that allows gateware to directly access the virtual memory space as part of the executing process...

chapter

Compiled multithreaded data paths on FPGAs for dynamic workloads

Robert J. Halstead, Walid Najjar

2013 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES) > 1 - 10

Hardware supported multithreading can mask memory latency by switching the execution to ready threads, which is particularly effective on irregular applications. FPGAs provide an opportunity to have multithreaded data paths customized to each individual application. In this paper we describe the compiler generation of these hardware structures from a C subset targeting a Convey HC-2ex machine. We describe...

chapter

A Reconfigurable Computing Approach for Efficient and Scalable Parallel Graph Exploration

Brahim Betkaoui, Yu Wang, David B. Thomas, Wayne Luk

2012 IEEE 23rd International Conference on Application-Specific Systems, Architectures and Processors (ASAP) > 8 - 15

In many application domains, data are represented using large graphs involving millions of vertices and billions of edges. Graph exploration algorithms, such as breadth-first search (BFS), are largely dominated by memory latency and are challenging to process efficiently. In this paper, we present a reconfigurable hardware methodology for efficient parallel processing of large-scale graph exploration...

chapter

Multi-stage parallel processing of design element access tasks in FPGA-based logic emulation systems

Somnath Banerjee, Tushar Gupta

2011 3rd Asia Symposium on Quality Electronic Design (ASQED) > 301 - 309

In FPGA based logic emulation systems, effective verification performance not only depends on the frequency at which the design clocks can be advanced, but also on the efficiency of various design element access tasks initiated by associated SW applications like high level testbench, GUI etc. Although existing emulation systems achieve high degree of parallelism in model execution by partitioning...

chapter

Multilevel Granularity Parallelism Synthesis on FPGAs

A Papakonstantinou, Yun Liang, J A Stratton, K Gururaj, et al.

2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM 2011) > 178 - 185

Recent progress in High-Level Synthesis (HLS) techniques has helped raise the abstraction level of FPGA programming. However implementation and performance evaluation of the HLS-generated RTL, involves lengthy logic synthesis and physical design flows. Moreover, mapping of different levels of coarse grained parallelism onto hardware spatial parallelism affects the final FPGA-based performance both...

chapter

An algorithm-architecture co-design framework for gridding reconstruction using FPGAs

Srinidhi Kestur, Kevin Irick, Sungho Park, Ahmed Al Maashri, et al.

2011 48th ACM/EDAC/IEEE Design Automation Conference (DAC) > 585 - 590

Gridding is a method of interpolating irregularly sampled data on to a uniform grid and is a critical image reconstruction step in several applications which operate on non-Cartesian sampled data. In this paper, we present an algorithm-architecture co-design framework for accelerating gridding using FPGAs. We present a parameterized hardware library for accelerating gridding to support both arbitrary...

chapter

Accelerating the Nonuniform Fast Fourier Transform Using FPGAs

Srinidhi Kestur, Sungho Park, Kevin M Irick, Vijaykrishnan Narayanan

2010 IEEE 18th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM 2010) > 19 - 26

We present an FPGA accelerator for the Non-uniform Fast Fourier Transform, which is a technique to reconstruct images from arbitrarily sampled data. We accelerate the compute-intensive interpolation step of the NuFFT Gridding algorithm by implementing it on an FPGA. In order to ensure efficient memory performance, we present a novel FPGA implementation for Geometric Tiling based sorting of the arbitrary...

article

A Heterogeneous Digital Signal Processor for Dynamically Reconfigurable Computing

Davide Rossi, Fabio Campi, Simone Spolzino, Stefano Pucillo, et al.

IEEE Journal of Solid-State Circuits > 2010 > 45 > 8 > 1615 - 1626

This paper describes a System on Chip implementation of a reconfigurable digital signal processor. The device is suitable for execution of a wide range of applications exploiting a balanced mix of heterogeneous reconfigurable fabrics merged together by a flexible and efficient communication infrastructure based on a 64-bit Network On Chip. The SoC combines a fine grain embedded FPGA, a mid grain configurable...

chapter

Enhancements to FPGA design methodology using streaming

F. Plavec, Z. Vranesic, S. Brown

2009 International Conference on Field Programmable Logic and Applications (FPL) > 294 - 301

Capacity of FPGAs has grown significantly, leading to increased complexity of designs targeting these chips. Traditional FPGA design methodology using HDLs is no longer sufficient and new methodologies are being sought. An attractive possibility is to use streaming languages. Streaming languages group data into streams, which are processed by computational nodes called kernels. They are suitable for...

chapter

An automated algorithm to generate stream programs

Lei Gao, G. Mittal, D. Zaretsky, D. Schonfeld, et al.

2009 IEEE International Symposium on Circuits and Systems (ISCAS 2009) > 1505 - 1508

With the proliferation of reconfigurable systems and flexible memory architectures, there has been intense interest in stream systems. While the existing stream systems require the programs to be written using special models, this paper demonstrates an approach to automatically generate stream programs from existing applications written for non-stream scalar processors. As a part of this approach,...

chapter

Fully-pipelined efficient architectures for FPGA realization of discrete Hadamard transform

P.K. Meher, J.C. Patra

2008 International Conference on Application-Specific Systems, Architectures and Processors (ASAP) > 43 - 48

Fully-pipelined simple modular structures are presented in this paper for efficient hardware realization of discrete Hadamard transform (HT). From the kernel matrix of HT, we have derived four different pipelined modular designs for transform length N = 4. It is shown further that the HT of transform-length N = 8 can be obtained from two 4-point HT modules, and similarly, the HT of transform-length...

Data set:
ieee
Keywords:
ARRAYS
KERNEL
FIELD PROGRAMMABLE GATE ARRAYS

Publication type

book (20)
article (1)

Keywords

FPGA (8)
HARDWARE (8)
PARALLEL PROCESSING (3)
ALGORITHM DESIGN AND ANALYSIS (2)
ARRAY SIGNAL PROCESSING (2)
BEE3 (2)
CLOCKS (2)
COMPUTATIONAL MODELING (2)
DATA MINING (2)
INSTRUCTION SETS (2)
INTERPOLATION (2)
LOGIC DESIGN (2)
MEMORY MANAGEMENT (2)
PIPELINES (2)
RANDOM ACCESS MEMORY (2)
RECONFIGURABLE ARCHITECTURES (2)
ABSTRACTION LEVEL (1)
ACCELERATION (1)
ACCELERATORS (1)
ADDERS (1)
AIRCRAFT (1)
ANALYTICAL MODELS (1)
ARM PROCESSOR (1)
AUTOMATED ALGORITHM (1)
AUTOMATIC KERNEL REPLICATION (1)
BANDWIDTH (1)
BEE3 PLATFORM (1)
BIOINFORMATICS (1)
BREADTH-FIRST SEARCH (BFS) (1)
CARTESIAN (1)
CENTRAL PROCESSING UNIT (1)
CLIENT-SERVER SYSTEMS (1)
COARSE GRAIN RECONFIGURABLE ARRAY (1)
COARSE GRAINED DYNAMICALLY RECONFIGURABLE PE ARRAY (1)
COARSE GRAINED PARALLELISM (1)
CODE TRANSFORMATION (1)
COLLISION AVOIDANCE (1)
COMPILER (1)
COMPUTER ARCHITECTURE (1)
CONVOLUTION (1)
COST MODELS (1)
CUDA KERNEL MAPPING (1)
DATA TRANSFER (1)
DATA TRANSLATION ARCHITECTURE (1)
DATA-REUSE (1)
DESIGN LAYOUT INFORMATION (1)
DESIGN SPACE EXPLORATION (1)
DESIGN SPACE SEARCH HEURISTIC (1)
DIGITAL LOGIC DESIGN (1)
DIGITAL SIGNAL PROCESSING CHIPS (1)
DIGITAL SIGNAL PROCESSORS (1)
DISCRETE HADAMARD TRANSFORM (1)
DISCRETE TRANSFORMS (1)
DRIVES (1)
DYNAMIC COORDINATE-GENERATOR (1)
DYNAMICALLY RECONFIGURABLE COMPUTING (1)
EDGE DETECTION (1)
EMULATION (1)
ESTIMATION (1)
FABRICS (1)
FAST FOURIER TRANSFORMS (1)
FILE SERVERS (1)
FINE GRAIN EMBEDDED FPGA (1)
FINE-GRAIN CELLULAR PROCESSOR ARRAY (1)
FINITE IMPULSE RESPONSE FILTERS (1)
FIR FILTERS (1)
FLEXIBLE MEMORY ARCHITECTURES (1)
FOCAL-PLANE SENSOR-PROCESSOR (1)
FPGA ACCELERATOR (1)
FPGA BASED PLATFORM (1)
FPGA PROGRAMMING (1)
FPGA-BASED ACCELERATOR (1)
FULLY-PIPELINED EFFICIENT ARCHITECTURE (1)
GENOMICS (1)
GEOMETRIC TILING (1)
GEOMETRIC TILING BASED SORTING (1)
GLOBALLY ASYNCHRONOUS LOCALLY SYNCHRONOUS (1)
GPU (1)
GRAPHICS PROCESSING UNITS (1)
GRIDDING (1)
HADAMARD TRANSFORMS (1)
HARDWARE ACCELERATION (1)
HARDWARE DESIGN LANGUAGES (1)
HARDWARE SPATIAL PARALLELISM (1)
HDL (1)
HETEROGENEOUS DIGITAL SIGNAL PROCESSOR (1)
HETEROGENEOUS RECONFIGURABLE FABRICS (1)
HEURISTIC ALGORITHMS (1)
HIGH PERFORMANCE RECONFIGURABLE COMPUTING (1)
HIGH-LEVEL SYNTHESIS TECHNIQUE (1)
HIGH-LEVEL SYNTHESIS (1)
IMAGE EDGE DETECTION (1)
IMAGE MOTION ANALYSIS (1)
IMAGE RECONSTRUCTION (1)
INFRARED IMAGE; ARRAY; MPC8315; REAL-TIME PERFORMANCE (1)
INTEGRATED CIRCUIT LAYOUT (1)
INTERNET (1)
