In this paper, we present an FPGA hardware implementation approach for phylogenetic tree reconstruction with a maximum parsimony algorithm. The algorithm, based on stochastic local search, uses the Indirect Calculation of Tree Lengths and the Incremental Tree Optimization methods. We evaluate and compare our new approach against previous hardware approaches, and against TNT, the fastest available...
In this paper we present a framework for the seamless utilization of hardware accelerators in heterogeneous SoCs that are used to speed up the processing of Spark data analytics applications.
Reinforcement Learning (RL) is an area of machine learning in which an agent interacts with the environment by making sequential decisions. The agent receives reward from the environment to find an optimal policy that maximises the reward. Trust Region Policy Optimisation (TRPO) is a recent policy optimisation algorithm that achieves superior results in various RL benchmarks, but is computationally...
Relational databases provide a wealth of functionality to a wide range of applications. Yet, there are tasks for which they are less than optimal, for instance when processing becomes more complex (e.g., regular expression evaluation, data analytics) or the data is less structured (e.g., text or long strings). With the increasing amount of user-generated data stored in relational databases, there...
Today, artificial neural networks (ANNs) are widely used in a variety of applications, including speech recognition, face detection, and disease diagnosis. Long Short-Term Memory (LSTM), an emerging type of ANN, is a recurrent neural network (RNN) that contains complex computational logic. To achieve high accuracy, researchers often build large-scale LSTM networks which are time-consuming...
Intel®'s Xeon® processor with integrated FPGA is a new research platform that provides all the capabilities of a Broadwell Xeon Processor with the added functionality of an Arria 10 FPGA in the same package. In this paper, we present an implementation on this platform to showcase the abilities and effectiveness of utilizing both hardware architectures to accelerate a convolutional based neural network...
Modern computer architectures have an ever-increasing demand for performance, but are constrained in power dissipation and chip area. To tackle these demands, architectures with application-specific accelerators have gained traction in research and industry. While this is a very promising direction, hard-wired accelerators fall short when too many applications need to be supported or flexibility is...
In this paper we propose a novel CNN hardware accelerator, called AIScale, capable of accelerating convolutional, pooling, fully-connected and adding CNN layers. In contrast to most existing solutions, AIScale offers a complete solution for full CNN acceleration. AIScale is designed as a coarse-grained reconfigurable architecture, which uses rapid, dynamic reconfiguration during the CNN layer processing...
Support Vector Machine (SVM) is a linear binary classifier that requires a kernel function to handle non-linear problems. Most previous SVM implementations for embedded systems in the literature were built targeting a certain application, and their analyses were done through comparison with software implementations only. The impact of different application datasets on SVM hardware performance was not...
One of the primary challenges associated with network functions virtualization (NFV) is the automated management of the service lifecycle. In this paper, we present a full software-based management and orchestration (MANO) stack which operates with OpenStack and OpenDaylight controllers and has the in-built functionality to automate the key phases of the NFV service lifecycle, namely resource discovery...
Computer architecture today is anything but business as usual, and what is bad for business is often great for science. As Moore's Law continues to unwaveringly march forward, despite the ceasing of Dennard scaling, continued performance gains with each processor generation have become a significant challenge, and require creative solutions. Namely, the way to continue to scale performance in light...
Network security and monitoring devices use packet classification to match packet header fields against a set of rules. Many hardware architectures have been designed to accelerate packet classification and achieve wire-speed throughput for 100 Gbps networks. The architectures are designed for high throughput even for the shortest packets. However, FPGA SoC and Intel Xeon with FPGA have limited resources...
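The matching step described in this abstract can be illustrated with a minimal software sketch. It is a plain linear search over a rule list, not the hardware architecture of the paper; the `Rule` fields, the `classify` function, the priority convention (lower value wins) and the default action are all assumptions made for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical 4-field rule; real classifiers usually match 5 or more fields."""
    priority: int                 # assumption: lower value means higher priority
    src: str                      # source prefix in CIDR notation
    dst: str                      # destination prefix in CIDR notation
    proto: Optional[int]          # IP protocol number, None acts as a wildcard
    dport: Tuple[int, int]        # inclusive destination port range
    action: str


def matches(rule, src, dst, proto, dport):
    """Return True when every header field falls inside the rule's ranges."""
    return (
        ip_address(src) in ip_network(rule.src)
        and ip_address(dst) in ip_network(rule.dst)
        and (rule.proto is None or rule.proto == proto)
        and rule.dport[0] <= dport <= rule.dport[1]
    )


def classify(rules, src, dst, proto, dport, default="deny"):
    """Linear search: the highest-priority matching rule decides the action."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if matches(rule, src, dst, proto, dport):
            return rule.action
    return default


RULES = [
    Rule(10, "10.0.0.0/8", "192.168.0.0/16", 6, (443, 443), "allow"),
    Rule(20, "10.0.0.0/8", "0.0.0.0/0", None, (0, 65535), "log"),
]

print(classify(RULES, "10.1.2.3", "192.168.0.5", 6, 443))
print(classify(RULES, "172.16.0.1", "192.168.0.5", 6, 443))
```

A linear scan costs time proportional to the number of rules per packet, which is the kind of cost that dedicated hardware classifiers aim to avoid.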
In this paper we discuss the potential of the integrated GPU to accelerate sorting by performing a partial sort prior to a comparison-based CPU sort. We experiment with several CPU comparison-based sorting algorithms and outline the performance gain for a random input data set. We then analyze different x86 SoC architectures, and show that by sorting chunks stored inside the on-chip GPU memory,...
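The two-stage idea in this abstract (a partial sort of chunks, followed by a comparison-based CPU sort) can be sketched in a few lines. This is a CPU-only illustration: the chunk stage merely stands in for the GPU stage, and the function names and the chunk size are assumptions, not values from the paper.

```python
import random


def partial_sort_chunks(data, chunk_size):
    """Sort each fixed-size chunk independently (stand-in for the GPU stage)."""
    out = []
    for start in range(0, len(data), chunk_size):
        out.extend(sorted(data[start:start + chunk_size]))
    return out


def hybrid_sort(data, chunk_size=1024):
    """Partial chunk sort first, then a full comparison-based sort."""
    presorted = partial_sort_chunks(data, chunk_size)
    # Python's built-in sort detects already-ordered runs, so the
    # pre-sorted chunks reduce the work left for this final pass.
    return sorted(presorted)


random.seed(0)
values = [random.randrange(1_000_000) for _ in range(10_000)]
result = hybrid_sort(values)
print(result[:5])
```

Whether the first stage pays off in practice depends on the final sorting algorithm and on the cost of moving data between the two stages, which this sketch does not model.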
The spectrometer is the most important back-end in single antenna radio astronomy observations. The state-of-the-art designs for this type of instrument propose to reduce the effects of spectral leakage by using the Polyphase Filter Bank (PFB) technique and to achieve wideband and high resolution by using digital, reconfigurable, and high-performance computing hardware, such as commercially available...
Existing hardware-based AR indoor registration technologies often suffer from low registration accuracy. To solve this problem, a new indoor AR registration technology based on iBeacon is proposed in this paper. Firstly, the coordinates of the phone are calculated based on the data received from iBeacons. Secondly, the 3D directions of the phone are obtained based on the acceleration...
Our society relies upon information processing at a scale never seen before in human history. We are indeed experiencing an exponential growth in processing demand, as more and more applications in the most disparate domains emerge. While continuous improvements in the manufacturing processes of microprocessors have so far been able to mitigate the ecological and economical costs this trend imposes,...
The Guided Filter (GF) is well known for its linear complexity. However, when filtering an image with an n-channel guidance, GF needs to invert an n × n matrix for each pixel. To the best of our knowledge, existing matrix inverse algorithms are inefficient on current hardware. This shortcoming limits applications of multichannel guidance in computation-intensive systems such as multi-label...
Hybrid architectures in which software and hardware cooperate, such as the Xilinx Zynq SoC combining an ARM CPU and FPGA fabric, offer a high-performance and low-power platform for accelerating the RSA algorithm. This paper adopts the subtraction-free Montgomery algorithm and the Chinese Remainder Theorem (CRT) to implement high-speed RSA processors, and deploys a 48-node cluster infrastructure based on Zynq SoC...
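The role of the CRT in speeding up RSA can be shown with a small sketch. It splits one private-key exponentiation modulo n into two half-size exponentiations modulo p and q, then recombines them with Garner's formula. This is not the paper's processor design: Python's built-in `pow` stands in for the Montgomery multiplier, the function name is hypothetical, and the toy primes are for illustration only.

```python
def rsa_crt_decrypt(c, p, q, d):
    """RSA private-key operation using the Chinese Remainder Theorem."""
    dp = d % (p - 1)            # reduced exponent for the mod-p half
    dq = d % (q - 1)            # reduced exponent for the mod-q half
    q_inv = pow(q, -1, p)       # modular inverse of q mod p (Python 3.8+)
    m1 = pow(c, dp, p)          # half-size exponentiation mod p
    m2 = pow(c, dq, q)          # half-size exponentiation mod q
    h = (q_inv * (m1 - m2)) % p # Garner recombination step
    return m2 + h * q


# Toy parameters, far too small for real use.
p, q = 61, 53
n = p * q                        # 3233
e = 17
d = pow(e, -1, (p - 1) * (q - 1))

message = 65
ciphertext = pow(message, e, n)
print(rsa_crt_decrypt(ciphertext, p, q, d) == message)
```

Because the two half-size exponentiations are independent, they can run in parallel, which is one reason the CRT form suits hardware implementations.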
Sparsity helps reduce the computational complexity of DNNs by skipping multiplications with zeros. The granularity of sparsity affects both the efficiency of the hardware architecture and the prediction accuracy. In this paper we quantitatively measure the accuracy-sparsity relationship at different granularities. Coarse-grained sparsity brings a more regular sparsity pattern, making it easier for hardware...
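The difference between fine-grained and coarse-grained sparsity can be made concrete with a minimal magnitude-pruning sketch. It is an illustration under assumptions, not the paper's method: the weight matrix is made up, rows are chosen as the coarse unit, and the function names are hypothetical.

```python
def prune_fine(weights, sparsity):
    """Fine-grained: zero the individual weights with the smallest magnitude."""
    cells = sorted(
        (abs(w), r, c)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    )
    k = int(len(cells) * sparsity)
    pruned = [row[:] for row in weights]
    for _, r, c in cells[:k]:
        pruned[r][c] = 0.0
    return pruned


def prune_rows(weights, sparsity):
    """Coarse-grained: zero whole rows with the smallest L1 norm."""
    order = sorted(
        range(len(weights)),
        key=lambda r: sum(abs(w) for w in weights[r]),
    )
    drop = set(order[:int(len(weights) * sparsity)])
    return [
        [0.0] * len(row) if r in drop else row[:]
        for r, row in enumerate(weights)
    ]


W = [
    [0.90, -0.10, 0.40, 0.05],
    [0.20, 0.30, -0.25, 0.10],
    [-0.80, 0.70, 0.60, -0.50],
    [0.02, 0.03, -0.01, 0.04],
]

for row in prune_fine(W, 0.5):
    print(row)
for row in prune_rows(W, 0.5):
    print(row)
```

Both variants remove the same fraction of weights, but the row-wise result leaves zeros in contiguous blocks that can be skipped as a unit, whereas the fine-grained result scatters zeros and needs per-element indexing to exploit them.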
The ability to recognize physical activity, such as sedentary behavior, driving, riding, daily activities and effective training, is useful for health-conscious users to catalogue their daily activities and to develop good exercise routines. Conventional activity recognition algorithms require complex calculations, which are not suitable for wearable devices developed on low-cost, low-power hardware platforms...