The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Convolutional neural networks (CNNs) are revolutionizing machine learning, but they present significant computational challenges. Recently, many FPGA-based accelerators have been proposed to improve the performance and efficiency of CNNs. Current approaches construct a single processor that computes the CNN layers one at a time; the processor is optimized to maximize the throughput at which the collection...
CPU-FPGA heterogeneous platforms offer a promising solution for high-performance and energy-efficient computing systems by providing specialized accelerators with post-silicon reconfigurability. To unleash the power of FPGA, however, the programmability gap has to be filled so that applications specified in high-level programming languages can be efficiently mapped and scheduled on FPGA. The above...
Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in machine learning and other domains. Since this is likely to continue for the foreseeable future, it is important to study techniques that can make it run fast on parallel hardware. In this paper, we provide the first analysis of a technique called BUCKWILD! that uses both asynchronous execution and low-precision...
In this work, we propose an efficient architecture for the hardware realization of deep neural networks on reconfigurable computing platforms like FPGA. The proposed neural network architecture employs only one single physical computing layer to perform the whole computational fabric of fully-connected feedforward deep neural networks with customizable number of layers, number of neurons per layer...
Enabled by modern languages and retargetable compilers, software development is in a virtual “Cambrian explosion” driven by a critical mass of powerfully parameterized libraries; but hardware development practices lag far behind. We hypothesize that existing hardware construction languages (HCLs) and novel hardware compiler frameworks (HCFs) can put hardware development on a similar evolutionary path...
The end of Dennard scaling has led to an increase in demand for energy-efficient custom hardware accelerators, but current hardware design is slow and laborious, partly because each iteration of the compile-run-debug cycle can take hours or even days with existing simulation and emulation platforms. Cyclist is a new emulation platform designed specifically to shorten the total compile-run-debug cycle...
With the fast increasingly use of image and video processing in many aspects, the requirements for high performance and high-quality systems lead to the use of reconfigurable computing to accelerate traditional image processing platforms. In this work, an efficient runtime adaptable floating-point Gaussian filtering core is proposed to achieve not only high performance and quality but also kernel...
Image compression is of a great importance in multimedia system applications because it drastically reduces bandwidth for transmission and memory storage. Image compression algorithm, like JPEG2000, utilizes the Forward Discrete Wavelet Transform (FDWT) and Inverse Discrete Wavelet Transform (IDWT). The main problems face by researchers in the hardware implementation of the FDWT/IDWT are storage memory,...
Many-core systems are increasingly popular in embedded systems due to their high-performance and flexibility to execute different workloads. These many-core systems provide a rich processing fabric but lack the flexibility to accelerate critical operations with dedicated hardware cores. Modern Field Programmable Gate-Arrays (FPGAs) evolved to more than reconfigurable devices, providing embedded hard-core...
Support Vector Machines (SVMs) are supervised learning models of the machine learning field whose performance strongly depended on its hyperparameters. The Bio-inspired Optimization Tool for SVM (BIOTS) tool is based on a Multi-Objective Particle Swarm Algorithm (MOPSO) to tune hyperparameters of SVMs. In this work, BIOTS is proposed along with a custom hardware design generator (VHDL) that implements...
This work presents a hardware implementation of the morphological reconstruction algorithm for biomedical images analysis. The morphological reconstruction algorithm is based on the Sequential Reconstruction (SR). In this case. a hardware architecture has been developed and implemented by mapping the SR algorithm into an Altera Cyclone IV E FPGA based platform. including a NIOS II processor. The developed...
Pedestrian detection is one of the most challenging and vital tasks of driver assistance systems (DAS). Among several algorithms developed for human detection, histogram of oriented gradients (HOG) followed by support vector machine (SVM) has shown the most promising results. This paper presents a hardware accelerator for real-time pedestrian detection at different scales to fulfill the real-time...
We present a miniaturized universal hardware module for acoustic pattern recognition in various types of multichannel sensor signals. The module implements configurable signal analysis (signal transforms, filter banks, statistical transforms) and a GMM-HMM recognizer. The main hardware components are a XC7A75T FPGA performing almost all the computations, a TMS320C6746 digital signal processor organizing...
Different neural network models have been proposed to design efficient associative memories like Hopfield networks, Boltzmann machines or Cogent confabulation. Compared to the classical models, Encoded Neural Network (ENN) is a recently introduced formalism with a proven higher efficiency. This model has been improved through different contributions like Clone-based ENN (CbNNs) or Sparse ENNs (S-ENNs)...
Universal Filtered Orthogonal Frequency Division Multiplexing (UF-OFDM) is considered one of the main wave-form candidates to overcome the challenges facing the next generation of mobile communication systems. Due to its spectral properties it can support relaxed synchronization, low-latency communications and flexible time transmission interval. Nevertheless, the available recent literature addresses...
The aim of this paper is to present the implementation of an Embedded Real-Time Simulator (ERTS) for Modular Multilevel Converters (MMCs), using low-cost System-on-Chip (SoC) platform. In order to achieve new functionalities such as sensor-less control, monitoring, diagnostic and fault detection, the MMC plant model can be implemented along with the controller. In MMC applications, the implementation...
In recent years there has been a growing interest in Internet of Thing, Big Data and Mobile Internet. With the rapid growth of the amount of data in the embedded environment, using a traditional embedded processor is hard to satisfy the requirements of big data processing. Sorting is one of the fundamental operation in data processing and is also frequently used for search, filter, feature analysis...
Recent technological advances allowed the word to construct several wearable products that can capture and process the human body bio-signals. The PPG signal becomes one of the most contenders in heart rate monitoring due to their prominent features, flexibility, effectiveness and low costs. This paper present a novel System of PPG Heart rate calculation based on FPGA, using the Pan and Tompkins as...
This paper reports on the Field-Programmable Gate Array (FPGA) real-time implementation of a vector control scheme, by means of hardware-in-the-loop simulation. This approach will be applied for a PMSM used to propel an electric scooter, preceding its integration in a more complex experimental setup. The emerging need for powerful, flexible system-on-a-chip (SoC) platforms for developing complex drive...
Data compression technology is the necessary technology in the age of big data. Compared with software compression techniques, hardware compression techniques can improve speed and reduce power consumption. LZMA is a lossless compression technology, and its hardware implementation has broad application prospects. This paper proposes a novel high-performance implementation of the LZMA compression algorithm...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.