The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
This paper explores hardware acceleration to significantly improve the runtime of computing the forward algorithm on Pair-HMM models, a crucial step in analyzing mutations in sequenced genomes. We describe 1) the design and evaluation of a novel accelerator architecture that can efficiently process real sequence data without performing wasteful work; and 2) aggressive memoization techniques that can...
Genetic sequence alignment has always been a computational challenge in bioinformatics. Depending on the problem size, software-based aligners can take multiple CPU-days to process the sequence data, creating a bottleneck point in bioinformatic analysis flow. Reconfigurable accelerator can achieve high performance for such computation by providing massive parallelism, but at the expense of programming...
Ultrasonic NDE uses high frequency acoustic waves to evaluate materials, and often signal processing is required to detect echoes from defects in the presence of microstructure scattering noise. Scattering noise, also known as clutter, interferes with the flaw signal and cannot be completely eliminated by using classical signal processing methods such as band-pass filtering. In this paper, neural...
Cooperation of software and hardware with hybrid architectures, such as Xilinx Zynq SoC combining ARM CPU and FPGA fabric, is a high-performance and low-power platform for accelerating RSA Algorithm. This paper adopts the none-subtraction Montgomery algorithm and the Chinese Remainder Theorem (CRT) to implement high-speed RSA processors, and deploys a 48-node cluster infrastructure based on Zynq SoC...
N-Body simulation simulates the evolution of a system that is composed of N particles, where each element receives a force that is due to the interaction with all the other elements within the system. Usually, the influence of external physical forces, such as gravity, is involved too. This methodology is widely used in different fields that range from astrophysics, where it is used to study the interaction...
To the present day, a multitude of studies aims to understand how the Central Nervous System (CNS) translates neural pulses to muscle motor tasks, through the analysis of surface EMG (sEMG) recordings. One of the most considerable methods applies the Non-Negative Matrix Factorization (NMF) to data recorded from sEMG electrodes, to extract coordinated motor patterns, the so-called muscle synergies,...
Deep neural networks have gained tremendous attention in both the academic and industrial communities due to their performance in many artificial intelligence applications, particularly in computer vision. However, these algorithms are known to be computationally very demanding for both scoring and model learning applications. State-of-the-art recognition models use tens of millions of parameters...
Protein folding is the physical process by which a sequence of amino acids in a protein folds into its tertiary structure, which determines the functionality of the protein. The knowledge of this structure is crucial for the development of new pharmaceutical therapies. For this reason, many drug industries are interested in applying these kind of algorithms. There are various methods to perform this...
Adaptive radiotherapy is a technique intended to increase the accuracy of radiotherapy. Currently, it is not clinically feasible due to the time required to process the images of patient anatomy. Hardware acceleration of image processing algorithms may allow them to be carried out in a clinically acceptable timeframe. This paper presents the experiences encountered using high-level synthesis tools...
Random sampling based path planning algorithms have shown their high efficiency in robotics, navigation and related fields. The Rapidly-Exploring Random Trees (RRT) is the typical method and works well in a variety of applications. Due to the sub-optimal issue of original RRT, the recent algorithm, known as RRT*, significantly improves the optimality of solution by adding the “cost review” procedure...
Despite its popularity, deploying Convolutional Neural Networks (CNNs) on a portable system is still challenging due to large data volume, intensive computation and frequent memory access. Although previous FPGA acceleration schemes generated by high-level synthesis tools (i.e., HLS, OpenCL) have allowed for fast design optimization, hardware inefficiency still exists when allocating FPGA resources...
Image features are broadly used in embedded computer vision applications, from object detection and tracking to motion estimation and 3D reconstruction. Efficient feature extraction and description are crucial due to the real-time requirements of such applications over a constant stream of input data. High-speed computation typically comes at the cost of high power dissipation, yet embedded systems...
String matching has become essential and widely applied in modern computer applications, especially with explosive data scale. As a classic fast and exact single pattern matching algorithm, Knuth-Morris-Pratt (KMP) algorithm has been demonstrated in network security and computational biology. However, with the increasing amount of data in the modern society, it becomes increasing important and essential...
Sorting represents one of the most important operations in data center applications. In this paper, we propose a hardware-software FPGA accelerated based solution for very large data set merge sorting. The accelerator is using a FIFO based approach for sorting. The main contributions of the proposed solution are: (i) configurable FIFO buffers in order to address the variable size of the pre-sorted...
Convolutional Neural Networks (CNNs) are a particular type of Artificial Neural Networks (ANNs) inspired by cells in the primary visual cortex of animals, and represent the state of the art in image recognition and classification. Nowadays, such supervised learning technique is very popular in Big Data analytics. In this context, due to the huge amount of data to be processed, it is crucial to find...
We propose a novel FPGA-accelerated BWA-MEM implementation, a popular tool for genomic data mapping. The performance and power-efficiency of the FPGA implementation on the single Xilinx Virtex-7 Alpha Data add-in card is compared against a software-only baseline system. By offloading the Seed Extension phase onto the FPGA, a two-fold speedup in overall application-level performance is achieved and...
In this project, we propose an SoC solution to accelerate the Pair-HMM's forward algorithm which is the key performance bottleneck in the GATK's HaplotypeCaller tool for DNA variant calling. We develop two versions of the Pair-HMM accelerator: one using High Level Synthesis (HLS), and another ring-based manual RTL implementation. We investigate the performance of the manual RTL design and HLS design...
One of the key challenges facing genomics today is efficiently storing the massive amounts of data generated by next-generation sequencing platforms. Reference-based compression is a popular strategy for reducing the size of genomic data, whereby sequence information is encoded as a mapping to a known reference sequence. Determining the mapping is a computationally intensive problem, and is the bottleneck...
Recent technology advances allow new generation reconfigurable-based embedded systems to contain a large number of cores and reconfigurable logic elements. Consequently, to take benefit of such very powerful hybrid reconfigurable MPSoC, designers need tools to explore the large design space of the possible configurations. In this paper, we develop a new hybrid bi-objective genetic and parallel variable...
Feature based vision applications rely on highly efficient extraction and analysis of features from images to reach satisfactory levels of performance and latency. In this paper, we describe the implementation of an algorithm that combines distributed feature detector (D-HCD) with a rotational invariant feature descriptor (R-HOG). Based on an algorithmic comparison with other feature detectors and...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.