Convolutional Neural Networks (CNNs) are a variation of feed-forward Neural Networks inspired by the biological process in the visual cortex of animals. The interest in this supervised learning algorithm has rapidly grown in many fields like image and video recognition and natural language processing. Nowadays they have become the state of the art in various applications like mobile robot vision,...
The Convolutional Neural Network (CNN) has become a successful algorithm in the field of artificial intelligence and a strong candidate for many applications. However, for embedded platforms, CNN-based solutions are still too complex to be applied if only a CPU is used for computation. Various dedicated hardware designs on FPGAs and ASICs have been carried out to accelerate CNNs, while few of them explore...
In the study of nervous diseases, building specific models tends to be very hard but significant. This paper introduces a novel method for neuron system identification using a nonlinear auto-regressive Volterra (NARV) model based on field-programmable gate arrays (FPGAs). We select the Hodgkin-Huxley (HH) model as a "black-box" system requiring identification, and obtain input and output data. A NARV model built on the...
Binary-image processing cores are extremely useful in many image and video applications such as object recognition, tracking, motion detection, and identification. To address the variety of applications and binary-image kernels, we propose an FPGA-based intellectual property core with enhanced flexibility: it is programmable, reconfigurable, and parameterizable. The core performs single binary image...
OpenCL is a high-level language that allows mixed hardware/software systems to be specified and compiled to run on heterogeneous parallel computing platforms. The hardware parallelism can take the form of multi-core central processing units (CPUs), massively parallel graphics processing units (GPUs), and, most recently, field-programmable gate array (FPGA) fabrics. OpenCL compilers for CPUs and GPUs...
Heterogeneous High-Performance Computing (HPC) platforms present a significant programming challenge, especially because the key users of HPC resources are scientists, not parallel programmers. We contend that compiler technology has to evolve to automatically create the best program variant by transforming a given original program. We have developed a novel methodology based on type transformations...
Deep learning, and especially the Convolutional Neural Network (CNN), is among the most powerful and widely used techniques in computer vision. Applications range from image classification to object detection, segmentation, Optical Character Recognition (OCR), etc. At the same time, CNNs are both computationally intensive and memory intensive, making them difficult to deploy on low-power lightweight...
Coarse-Grained Reconfigurable Arrays (CGRAs) are more area- and energy-efficient than FPGAs for applications that are dominated by arithmetic operations. Enabling the user to employ CGRAs requires tools to create suitable CGRA instances and to program them at a high abstraction level. In this contribution we briefly explain a CGRA architecture generator and we focus on the scheduler that...
Convolutional Neural Networks (CNNs) are a particular type of Artificial Neural Network (ANN) inspired by cells in the primary visual cortex of animals, and represent the state of the art in image recognition and classification. Nowadays, this supervised learning technique is very popular in Big Data analytics. In this context, due to the huge amount of data to be processed, it is crucial to find...
Classification is one of the core tasks in machine learning and data mining. One of several models of classification is classification rules, which use a set of if-then rules to describe a classification model. In this paper we present a set of FPGA-based compute kernels for accelerating classification rule induction. The kernels can be combined to perform specific procedures in the rule induction process,...
Coarse-grained FPGA overlay architectures paired with general purpose processors offer a number of advantages for general purpose hardware acceleration because of software-like programmability, fast compilation, application portability, and improved design productivity. However, the area overheads of these overlays, and in particular architectures with island-style interconnect, negate many of these...
FPGA-enabled datacenters have shown great potential for improving performance and energy efficiency, and have attracted a great deal of attention from both academia and industry. In this paper we aim to answer one key question: how can we efficiently integrate FPGAs into state-of-the-art big-data computing frameworks? Although very important, this problem has not been well studied, especially...
Custom hardware accelerators are widely used to improve the performance of software applications in terms of execution time and to reduce energy consumption. However, the realization of a hardware accelerator and its integration into the final system is a difficult and error-prone task. For this reason, both industry and academia are continuously developing Computer Aided Design (CAD) tools to assist...
Past research and implementation efforts have shown that FPGAs are efficient at processing many graph algorithms. However, they are notoriously hard to program, leading to impractically long development times even for simple applications. We propose a vertex-centric framework for graph processing on FPGAs, providing a base execution model and distributed architecture so that developers need only write...
For decades, the streaming architecture of FPGAs has delivered accelerated performance across many application domains, such as option pricing solvers in finance, computational fluid dynamics in oil and gas, and packet processing in network routers and firewalls. However, this performance has come at the significant expense of programmability, i.e., the performance-programmability gap. In particular,...
The four V's of big data sets, Volume, Velocity, Variety, and Veracity, pose challenges in many different aspects of real-time systems. Among these, securing big data sets and reducing processing time and communication bandwidth are of utmost importance. In this paper we adopt a Compressive Sensing (CS) based framework to address all three issues. We implement Compressive Sensing using Deterministic...
Customized pipeline designs that minimize the pipeline initiation interval (II) maximize the throughput of FPGA accelerators designed with high-level synthesis (HLS). What is the impact of minimizing II on energy efficiency? Using a matrix-multiply accelerator, we show that matrix multiplies with II>1 can sometimes reduce dynamic energy below II=1 due to interconnect savings, but II=1 always achieves...
Embedded system designs and applications are increasingly common today. The aim of this study is to describe how to implement open-source OpenRISC-based SoCs, which can be used for embedded system designs, on FPGAs, and how to install the Linux kernel. What makes OpenRISC-based SoCs different from other SoCs is their modifiable open-source code for the processor and all peripherals, and no license fee demand...
To make use of big data, various NOSQL data stores have been deployed, such as key-value stores and column-oriented stores. NOSQL data stores typically achieve a high degree of scalability while being specialized for specific purposes; thus, polyglot persistence, which employs multiple NOSQL data stores complementarily, is a practical choice to meet a high diversity of application demands. We assume various...
High-level synthesis tools aim to make FPGA programming easier by raising the level of programming abstraction. Yet in order to get an efficient hardware design from HLS tools, the designer must know how to write HLS code that results in an efficient low level hardware architecture. Unfortunately, this requires substantial hardware knowledge, which limits wide adoption of HLS tools outside of hardware...