In this brief, we introduce an architecture for accelerating convolution stages in convolutional neural networks (CNNs) implemented in embedded vision systems. The purpose of the architecture is to exploit the inherent parallelism in CNNs to reduce the required bandwidth, resource usage, and power consumption of highly computationally complex convolution operations as required by real-time embedded...
A bi-dimensional filter for high accuracy image processing is implemented by using a novel partitioning method. The method is based on a number theory theorem, which makes it possible to reduce the complexity of the operation to that of an adder chain and also to reduce the number of coefficients stored in memory, improving the memory organization. To show the advantage of this method, we implemented a Floating Point...
In this paper, we propose a hardware computing architecture for face detection that classifies an image as a face or non-face. The computing architecture was first designed, modeled and tested in MATLAB Simulink using the Xilinx block set and was later tested using a Virtex-6 FPGA ML605 Evaluation Kit. The system uses learned filters which were previously extracted by training on a set of face and non-face...
This work presents an efficient hardware accelerator design of deep residual learning algorithms, which have shown superior image recognition accuracy (>90% top-5 accuracy on ImageNet database). Two key objectives of the acceleration strategy are to (1) maximize resource utilization and minimize data movements, and (2) employ scalable and reusable computing primitives to optimize physical design...
Despite their popularity, deploying Convolutional Neural Networks (CNNs) on a portable system is still challenging due to large data volume, intensive computation and frequent memory access. Although previous FPGA acceleration schemes generated by high-level synthesis tools (i.e., HLS, OpenCL) have allowed for fast design optimization, hardware inefficiency still exists when allocating FPGA resources...
Because the SIFT (Scale Invariant Feature Transform) algorithm is computationally intensive and highly parallelizable, we propose an optimized FPGA implementation so that it can be accelerated in hardware. In this method, we first simplify image filtering and Gaussian pyramid generation by selecting appropriate parameters and hardware structure,...
Recently, the OpenCL hardware-software co-design methodology has gained traction in realizing effective parallel architecture designs in heterogeneous FPGA platforms. In fact, the portability of OpenCL on hardware ready platforms such as GPU or multicore CPU enables ease of design verification. This is true especially for parallel algorithms before implementing them using cumbersome HDL-based RTL...
We develop a new paradigm for designing fully streaming, area-efficient FPGA implementations of common building blocks for vision algorithms. By focusing on avoiding redundant computation we achieve a reduction of one to two orders of magnitude in design area utilization as compared to previous implementations. We demonstrate that our design works in practice by building five 325 frames per...
Convolution is one of the most important operators used in image processing. With the constant need to increase the performance in high-end applications and the rise and popularity of parallel architectures, such as GPUs and the ones implemented in FPGAs, comes the necessity to compare these architectures in order to determine which of them performs better and in what scenario. In this article, convolution...
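For readers unfamiliar with the operator that the abstract above refers to, the following is a minimal software sketch of direct 2-D convolution. It is an illustration only, not the article's GPU or FPGA implementation; the function name `convolve2d` and the choice to compute only the "valid" output region (no border padding) are assumptions made for this example.

```python
def convolve2d(image, kernel):
    """Direct 2-D convolution over the 'valid' region.

    `image` and `kernel` are lists of equal-length rows (hypothetical
    representation chosen to keep the sketch dependency-free).
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Flipping the kernel in both axes is what distinguishes
    # convolution from cross-correlation.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            acc = 0
            # Multiply-accumulate over the window anchored at (r, c).
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * flipped[i][j]
            out_row.append(acc)
        out.append(out_row)
    return out
```

Each output pixel depends only on its own window, so the two outer loops can run in parallel; that independence is what makes the operator a natural fit for parallel architectures such as the ones the abstract compares.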
Edges are among the most fundamental and significant features of an image. Edge detection is a classical research topic in computer vision and image processing, and it is the first step of image analysis and understanding. With the continuous improvement of remote sensing imagery, especially the appearance of digital aerial images, edge detection is a necessary step to extract information...
Spike-based systems are neuro-inspired circuit implementations traditionally used for sensory systems or sensor signal processing. Address-Event-Representation (AER) is a neuromorphic communication protocol for transferring asynchronous events between VLSI spike-based chips. These neuro-inspired implementations allow developing complex, multilayer, multichip neuromorphic systems and have been used...
The Field Programmable Gate Array (FPGA) is an effective device for real-time parallel processing of vast amounts of video data because of its fine-grain reconfigurable structure. This paper presents a parallel processing architecture for a Sobel edge detection enhancement algorithm, which can produce the result for one pixel in only one clock period. The algorithm is designed with a...
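As background for the abstract above, here is a minimal software sketch of the standard Sobel operator. It is not the paper's enhanced algorithm or its hardware design, which the truncated abstract does not describe. The function name `sobel_magnitude` is hypothetical, and approximating the gradient magnitude as |Gx| + |Gy| is an assumption (a common choice because it avoids a square root).

```python
def sobel_magnitude(image):
    """Approximate Sobel gradient magnitude for interior pixels.

    `image` is a list of equal-length rows of numbers; border pixels
    are skipped, so the output is 2 rows and 2 columns smaller.
    """
    gx_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
    gy_kernel = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient
    h, w = len(image), len(image[0])
    out = []
    for r in range(1, h - 1):
        row = []
        for c in range(1, w - 1):
            gx = gy = 0
            # Both gradients read the same 3x3 window, so hardware can
            # evaluate them side by side from one set of buffered pixels.
            for i in range(3):
                for j in range(3):
                    p = image[r + i - 1][c + j - 1]
                    gx += gx_kernel[i][j] * p
                    gy += gy_kernel[i][j] * p
            row.append(abs(gx) + abs(gy))  # assumed |Gx| + |Gy| approximation
        out.append(row)
    return out
```

For example, a three-row image whose rows are all `[0, 0, 10, 10]` contains a vertical edge, and the sketch returns `[[40, 40]]` for its two interior pixels, while a constant image yields zeros.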
Image convolution operations in digital computer systems are usually very expensive in terms of resource consumption (processor resources and processing time) for an efficient real-time application. In these scenarios the visual information is divided into frames, and each one has to be completely processed before the next frame arrives in order to guarantee real-time operation. A spike-based philosophy...
In this paper we present a scalable hardware architecture to implement large-scale convolutional neural networks and state-of-the-art multi-layered artificial vision systems. This system is fully digital and is a modular vision engine with the goal of performing real-time detection, recognition and segmentation of mega-pixel images. We present a performance comparison between a software, FPGA and...
We present a massively parallel coprocessor for accelerating Convolutional Neural Networks (CNNs), a class of important machine learning algorithms. The coprocessor functional units, consisting of parallel 2D convolution primitives and programmable units performing sub-sampling and non-linear functions specific to CNNs, implement a “meta-operator” to which a CNN may be compiled. The coprocessor...
Window-based operations such as two dimensional (2-D) convolution operations are commonly used in image and video processing applications. In this paper, a new design technique that considers the neighboring pixels within the window to detect and eliminate redundant or unnecessary computations for power reduction is presented. A novel on-chip detection technique is developed for the proposed neighborhood...
In this paper, the design and implementation of an efficient, low power log-based 2D convolution unit (convolver) for video processing applications is proposed. The design of the proposed convolver utilizes an approximation method with an error correction technique to transform data to the logarithmic domain for reduced power consumption. A novel design and implementation of a modular approach for leading bit...
This paper describes a hardware implementation of aerial image simulation in lithography using an FPGA. Such simulations are presently performed mainly using software-based techniques on dedicated computers. The Hopkins partially coherent imaging equation is decomposed numerically by using singular value decomposition (SVD). The input data is a function consisting of rectangles as Manhattan...
This paper presents an implementation of a Gabor filter for fingerprint recognition using Verilog HDL. This work demonstrates the application of the Gabor filter technique to enhance the fingerprint image. The incoming signal, in the form of image pixels, is filtered (convolved) by the Gabor filter to define the ridge and valley regions of the fingerprint. This is done with the application of a real time...