Search results

chapter

Aggressive pipelining of irregular applications on reconfigurable hardware

Zhaoshi Li, Leibo Liu, Yangdong Deng, Shouyi Yin, more

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) > 575 - 586

CPU-FPGA heterogeneous platforms offer a promising solution for high-performance and energy-efficient computing systems by providing specialized accelerators with post-silicon reconfigurability. To unleash the power of FPGA, however, the programmability gap has to be filled so that applications specified in high-level programming languages can be efficiently mapped and scheduled on FPGA. The above...

chapter

An efficient runtime adaptable floating-point Gaussian filtering core

Cuong Pham-Quoc, Tran Ngoc Thinh

2017 4th NAFOSTED Conference on Information and Computer Science > 183 - 188

With the rapidly increasing use of image and video processing in many fields, the requirements for high-performance and high-quality systems have led to the use of reconfigurable computing to accelerate traditional image processing platforms. In this work, an efficient runtime adaptable floating-point Gaussian filtering core is proposed to achieve not only high performance and quality but also kernel...

chapter

A SVM optimization tool and FPGA system architecture applied to NMPC

Carlos Eduardo Santos, Renato Coral Sampaio, Helon Ayala, Leandro dos S. Coelho, more

2017 30th Symposium on Integrated Circuits and Systems Design (SBCCI) > 96 - 102

Support Vector Machines (SVMs) are supervised machine learning models whose performance depends strongly on their hyperparameters. The Bio-inspired Optimization Tool for SVM (BIOTS) uses a Multi-Objective Particle Swarm Algorithm (MOPSO) to tune hyperparameters of SVMs. In this work, BIOTS is proposed along with a custom hardware design generator (VHDL) that implements...

chapter

System-level design for human action recognition in 3D scenes

Amin Safaei, Q. M. Jonathan Wu

2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC) > 548 - 553

This study proposes a system-on-a-chip, field-programmable gate array (FPGA)-based real-time video processing platform for human action recognition. We provide the details of a hardware implementation for real-time human activity recognition in 3D scenes, including capture, processing, and display. The proposed platform is implemented by adding a two-stage preprocessing step to improve the results...

chapter

Hardware accelerator for boosting convolution computation in image classification applications

Meng-Chou Chang, Ze-Gang Pan, Jyun-Liang Chen

2017 IEEE 6th Global Conference on Consumer Electronics (GCCE) > 1 - 2

In a convolutional neural network (CNN), convolution calculation can account for about 90% of the total processing work. This paper presents the design of a convolution hardware accelerator (CHA) which can support efficient matrix multiplication to speed up the convolution calculation. In our experiment, when a RISC-V Rocket processor is used to simulate the operation of a CNN for image classification,...

chapter

Evaluating irregular memory access on OpenCL FPGA platforms: A case study with XSBench

Yingyi Luo, Xianshan Wen, Kazutomo Yoshii, Seda Ogrenci-Memik, more

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

FPGAs are becoming an attractive choice as a heterogeneous computing unit for scientific computing because FPGA vendors are adding floating-point-optimized architectures to their product lines. Additionally, high-level synthesis (HLS) tools such as Altera OpenCL SDK are emerging, which could potentially break the FPGA programming wall and provide a streamlined flow for domain experts in scientific...

chapter

F-C3D: FPGA-based 3-dimensional convolutional neural network

Hongxiang Fan, Xinyu Niu, Qiang Liu, Wayne Luk

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

In recent years, 3-dimensional convolutional neural networks (3D CNNs) have been widely used for video analysis, 3-dimensional geometric data and medical image diagnosis. While conventional CNNs are computationally intensive, 3D CNNs push the computational requirements to another level, since each computation depends on multiple image frames. This paper describes a novel hardware architecture for a...

chapter

A programming model and runtime system for approximation-aware heterogeneous computing

Ioannis Parnassos, Nikolaos Bellas, Nikolaos Katsaros, Nikolaos Patsiatzis, more

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

Heterogeneous platforms that include diverse architectures such as multicore CPUs, FPGAs and GPUs are becoming very popular due to their superior performance and energy efficiency. Besides heterogeneity, a promising approach for minimizing energy consumption is through approximate computing which relaxes the requirement that all parts of a program are considered equally important to the output quality,...

chapter

Evaluating high-level design strategies on FPGAs for high-performance computing

Artur Podobas, Hamid Reza Zohouri, Naoya Maruyama, Satoshi Matsuoka

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

Field-Programmable Gate Arrays (FPGAs) have gained considerable momentum in mainstream high-performance systems in recent years due to their flexibility and low power consumption. Still, FPGAs remain largely unavailable to software programmers due to the programming and debugging difficulties inherent to standard Hardware Description Languages. The performance that hardware-oblivious software...

chapter

Flexible FPGA design for FDTD using OpenCL

Tobias Kenter, Jens Forstner, Christian Plessl

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 7

Compared to classical HDL designs, generating FPGA designs with high-level synthesis from an OpenCL specification promises easier exploration of different design alternatives and, through ready-to-use infrastructure and common abstractions for host and memory interfaces, easier portability between different FPGA families. In this work, we evaluate the extent of this promise. To this end, we present a parameterized...

chapter

Accelerating low bit-width convolutional neural networks with embedded FPGA

Li Jiao, Cheng Luo, Wei Cao, Xuegong Zhou, more

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

Convolutional Neural Networks (CNNs) can achieve high classification accuracy but require complex computation. Binarized Neural Networks (BNNs) with binarized weights and activations can simplify computation but suffer from obvious accuracy loss. In this paper, low bit-width CNNs, BNNs and standard CNNs are compared to show that low bit-width CNNs are better suited for embedded systems. An architecture...

chapter

Performance of Large-Scale Electronic Structure Calculations on Built-in FPGA Systems

Seungmin Lee, Dukyun Nam, Hoon Ryu

2017 IEEE International Conference on Cluster Computing (CLUSTER) > 635 - 636

We discuss the feasibility of an in-house Schrödinger equation solver on the Intel Broadwell Xeon processor with a built-in FPGA, with a particular focus on the performance of large-scale sparse matrix-vector multiplication (SpMV) that is the core numerical operation of electronic structure simulations for multi-million atomic systems. The double-precision SpMV section in our solver is offloaded to...

chapter

VineTalk: Simplifying software access and sharing of FPGAs in datacenters

Stelios Mavridis, Manolis Pavlidakis, Ioannis Stamoulias, Christos Kozanitis, more

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

FPGA-based accelerators are becoming first-class citizens in data centers. Adding FPGAs to data centers can lead to higher compute densities with improved energy efficiency for latency-critical workloads, such as financial applications. However, FPGA deployment in data centers brings difficulties to both application developers and cloud providers. Application writers need to deal with the interfacing...

chapter

FISH: Linux system calls for FPGA accelerators

Kevin Nam, Blair Fort, Stephen Brown

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

This paper presents the FISH (FPGA-Initiated Software-Handled) framework, which allows FPGA accelerators to make system calls to the Linux operating system in CPU-FPGA systems. A special FISH Linux kernel module running on the CPU provides a system call interface for FPGA accelerators, much like the ABI that exists for software programs. We provide a proof-of-concept implementation of this framework...

chapter

Exploration of OpenCL for FPGAs using SDAccel and comparison to GPUs and multicore CPUs

Lester Kalms, Diana Gohringer

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

Because of its energy efficiency, heterogeneous computing is gaining more and more attention. Since FPGA implementations are time-consuming, high-level synthesis (HLS) is used to close the productivity gap. OpenCL has become accepted as a good programming model for HLS, due to its portability, good capability of design verification and rich instruction set. This work implements different optimization strategies...

chapter

PolyPC: Polymorphic parallel computing framework on embedded reconfigurable system

Hongyuan Ding, Miaoqing Huang

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 8

With the help of parallelism provided by the fine-grained architecture, hardware accelerators on Field Programmable Gate Arrays (FPGAs) can significantly improve the performance of many applications. However, designers are typically required to have excellent hardware programming skills and unique optimization techniques to fully explore the potential of FPGA resources. In this work, we propose the...

chapter

A fully connected layer elimination for a binarized convolutional neural network on an FPGA

Hiroki Nakahara, Tomoya Fujii, Shimpei Sato

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

Pre-trained deep convolutional neural networks (CNNs) are widely used in embedded systems, which require high power and area efficiency. In that case, the CPU is too slow, the embedded GPU dissipates a lot of power, and the ASIC cannot keep up with the rapid progress of CNN variations. This paper uses a binarized CNN, which handles only binary values for the inputs and the weights. Since the...

chapter

Modified distributed arithmetic based low complexity CNN architecture design methodology

Madhuri Panwar, J. Padmini, Venkatasubrahmanian, Amit Acharyya, more

2017 European Conference on Circuit Theory and Design (ECCTD) > 1 - 4

A CNN involves a large number of convolutions of feature maps and kernels, necessary for extracting useful features for accurate classification. However, this requires a significant number of computationally intensive, power- and area-hungry multiplications, limiting its deployment on embedded devices in resource-constrained scenarios. To address this problem, we propose modified distributed arithmetic based...

chapter

Application of convolutional neural networks on Intel® Xeon® processor with integrated FPGA

Philip Colangelo, Enno Luebbers, Randy Huang, Martin Margala, more

2017 IEEE High Performance Extreme Computing Conference (HPEC) > 1 - 7

Intel®'s Xeon® processor with integrated FPGA is a new research platform that provides all the capabilities of a Broadwell Xeon processor with the added functionality of an Arria 10 FPGA in the same package. In this paper, we present an implementation on this platform to showcase the abilities and effectiveness of utilizing both hardware architectures to accelerate a convolution-based neural network...
