Design productivity is a major concern preventing the mainstream adoption of FPGAs. Overlay architectures have emerged as one possible solution to this challenge, offering fast compilation and software-like programmability. However, overlays typically suffer from area and performance overheads due to limited consideration for the underlying FPGA architecture. These overlays have often been of limited...
Modern applications including graphics, multimedia, web search, and data analytics not only can benefit from acceleration, but also exhibit significant degrees of tolerance to imprecise computation. This amenability to approximation provides an opportunity to trade quality of the results for higher performance and better resource utilization. Exploiting this opportunity is particularly important for...
The use of heterogeneous computing resources, such as Graphics Processing Units or other specialized coprocessors, has become widespread in recent years because of their performance and energy efficiency advantages. Approaches for managing and scheduling tasks on heterogeneous resources are still a subject of research. Although queuing systems have recently been extended to support accelerator resources,...
The stringent power constraints of complex microcontroller based devices (e.g. smart sensors for the IoT) represent an obstacle to the introduction of sophisticated functionality. Programmable accelerators would be extremely beneficial to provide the flexibility and energy efficiency required by fast-evolving IoT applications; however, the integration complexity and sub-10mW power budgets have been...
A virtual instrument is a combination of hardware and software that allows the emulation of an instrument through a custom virtual console and a graphical user interface. It consists of a PC equipped with powerful application software and cost-effective hardware such as plug-in boards, which together perform the functions of traditional instruments. In a virtual instrument, it is the...
The Fast Fourier Transform (FFT) is an important algorithm in the fields of science and engineering, where it is used in diverse areas such as communications, signal processing, instrumentation, image and video analysis, etc. The algorithm is essentially a fast implementation of the Discrete Fourier Transform which allows it to reduce the asymptotic complexity of the latter from O(n^2) to the former's...
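The complexity gap mentioned in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `dft` and `fft` are hypothetical, and the recursive radix-2 Cooley-Tukey scheme is only one common FFT variant, assumed here for illustration. The direct transform needs on the order of n^2 operations, while the recursive version needs on the order of n log n.

```python
import cmath


def dft(x):
    """Direct evaluation of the Discrete Fourier Transform, O(n^2) operations."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT, O(n log n) operations.

    Assumption for this sketch: len(x) is a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor combines the two half-size transforms.
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out
```

Both functions should agree up to floating-point rounding on any power-of-two input, for example `fft([1, 2, 3, 4, 0, 0, 0, 0])` and `dft([1, 2, 3, 4, 0, 0, 0, 0])`.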
Nowadays, many industrial synchronization systems rely on the Precision Time Protocol (PTP, IEEE 1588), which provides time transfer with sub-microsecond precision. However, some applications, such as next-generation telecommunication systems (LTE-A & 5G) or scientific infrastructures, have stricter timing requirements and must guarantee the timing service regardless of traffic load conditions...
Image processing plays a major role in transmitting data compactly without loss of information. Several algorithms have been defined for image compression, edge detection and noise reduction, which together form image transformation techniques. This paper focuses on edge detection in a given image using a kernel matrix and a sliding-window algorithm. The interface system includes FPGA and beagle...
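The sliding-window idea named in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's implementation: the helper `slide_kernel` is a hypothetical name, the Sobel horizontal-gradient kernel is assumed because the abstract does not say which kernel matrix is used, and border pixels are simply skipped.

```python
# Sobel kernel for horizontal gradients (an assumed choice of kernel matrix).
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def slide_kernel(image, kernel):
    """Slide the kernel over the image and sum the element-wise products.

    The kernel is applied without flipping (cross-correlation), and only
    positions where the window fits entirely inside the image are computed.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    result = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        result.append(row)
    return result


# A 4x5 test image with a vertical step edge between columns 2 and 3.
step_image = [[0, 0, 0, 10, 10] for _ in range(4)]
edges = slide_kernel(step_image, SOBEL_X)
```

Windows that lie entirely in the flat region produce 0, while windows that straddle the step produce a large response, which is what marks the edge.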
This paper describes ‘HSCoT’, an efficient high-level synthesis tool that generates register transfer level (RTL) specifications for applications written entirely in C, together with an associated reliable approach for speeding up application execution. It is based on the construction of a dependency data flow graph and aims to maximally exploit the inherent parallelism of the application. Application...
Trilateral filtering is an edge-preserving smoothing technique. Its predecessor, the bilateral filter, is a non-linear filtering technique that can reduce noise in an image while preserving strong, sharp edges, but it cannot provide the desired result when the edges have valley-like or ridge-like features. The Trilateral filter is extended to be a gradient-preserving filter,...
With the recent advancement of multilayer convolutional neural networks (CNN), deep learning has achieved amazing success in many areas, especially in visual content understanding and classification. To improve the performance and energy efficiency of computation-demanding CNNs, FPGA-based acceleration has emerged as one of the most attractive alternatives. In this paper we design and implement...
This paper describes a general framework for transforming a sequential program into a network of processes, which are then converted to hardware accelerators through high level synthesis. Also proposed is a complementing technique for performing static deadlock analysis of the generated accelerator network. The interactions between the accelerators' schedules, the capacity of the communication channels...
Deep Convolutional Neural Networks (DCNN) have proven to be very effective in many pattern recognition applications, such as image classification and speech recognition. Due to their computational complexity, DCNNs demand implementations that utilize custom hardware accelerators to meet performance and energy-efficiency constraints. In this paper we propose an FPGA-based accelerator architecture which...
Scale Invariant Feature Transform is a competent algorithm for extracting unique features from images. The fact that features extracted are invariant to image scaling, translation, rotation and partially invariant to illumination changes makes it attractive in many computer vision applications involving mobile robots such as obstacle recognition, dynamic obstacle motion estimation, generating topological...
Accurate forecasts of future climate with numerical models of atmosphere and ocean are of vital importance. However, forecast quality is often limited by the available computational power. This paper investigates the acceleration of a C-grid shallow water model through the use of reduced precision targeting FPGA technology. Using a double-gyre scenario, we show that the mantissa length of variables...
The use of FPGAs as compute accelerators has been demonstrated by numerous researchers as an effective solution to meet the performance requirement across many application domains. However, the design productivity of developing FPGA accelerators remains much lower compared to the use of a typical software development flow. Although the use of the high-level design tools may partly alleviate this shortcoming,...
We present preliminary results with the TyTra design flow. Our aim is to create a parallelising compiler for high-performance scientific code on heterogeneous platforms, with a focus on Field-Programmable Gate Arrays (FPGAs). Using the functional language Idris, we show how this programming paradigm facilitates generation of different correct-by-construction program variants through type transformations...
FPGA vendors now include hardened IPs to form a system-on-chip (SoC), making it easier to build embedded systems. However, programming and integrating hardware accelerators (devices) into these systems present a challenge. The OpenCL standard has become accepted as a good programming model for managing devices, or hardware accelerators in the context of embedded systems on FPGAs, due to its rich set...
In this paper, we study the design and implementation of a reconfigurable architecture for graph processing algorithms. The architecture uses a message-passing model targeting shared-memory multi-FPGA platforms. We take advantage of our architecture to showcase a parallel implementation of the all-pairs shortest path algorithm (APSP) for unweighted directed graphs. Our APSP implementation adopts a...
The Square Kilometre Array (SKA), currently in the pre-construction phase, will be the world's largest telescope array for radio astronomy. The Fourier domain acceleration search (FDAS) is a sub-module of the Non-imaging Processing Pulsar Search Sub-element (NIP PSS) of the SKA-MID Central Signal Processor (CSP) element. The total performance needed for the FDAS module of up to 2000 beams is over 14 Poperations/s...