Convolutional neural networks (CNNs) are deployed in a wide range of image recognition, scene segmentation and object detection applications. Achieving state-of-the-art accuracy in CNNs often results in large models and complex topologies that require significant compute resources to complete in a timely manner. Binarised neural networks (BNNs) have been proposed as an optimised variant of CNNs, which...
A modern system-on-a-chip includes tens to hundreds of modules such as processor cores, memories and other IP blocks, and exchanges packetized data using high-performance interconnection networks as a subsystem for data transport. This paper reports the implementation of an industry-wide network on a chip in FPGA, and the first implementation and evaluation of the Sonics Performance Monitor and Hardware...
In this paper we extend Amdahl's law to the era of general heterogeneous MPSoCs and analyze how the speedup is affected by parameters including the number and speedup of microprocessors and accelerators, as well as the task partition characteristics. We also analyze the theoretical results on how the extended Amdahl's law can be applied to load balancing of a heterogeneous MPSoC without...
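For orientation, the classic form of Amdahl's law that the abstract builds on can be sketched as follows. This is not the paper's extended model, which the truncated abstract does not give. The function names and the segmented variant (each fraction of the work runs on a unit with its own speedup, one after another) are illustrative assumptions.

```python
from typing import Sequence


def amdahl_speedup(accelerated_fraction: float, unit_speedup: float) -> float:
    """Classic Amdahl's law: overall speedup when a fraction of the work
    is accelerated by `unit_speedup` and the rest runs unchanged."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must lie in [0, 1]")
    if unit_speedup <= 0.0:
        raise ValueError("unit_speedup must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / unit_speedup)


def segmented_speedup(fractions: Sequence[float], speedups: Sequence[float]) -> float:
    """Hypothetical multi-segment variant: segment i is a fraction of the
    original run time and executes on a unit with speedup speedups[i].
    Segments are assumed to run sequentially, not overlapped."""
    if len(fractions) != len(speedups):
        raise ValueError("fractions and speedups must have the same length")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    if any(s <= 0.0 for s in speedups):
        raise ValueError("speedups must be positive")
    return 1.0 / sum(f / s for f, s in zip(fractions, speedups))
```

With half of the work accelerated by a factor of 2, `amdahl_speedup(0.5, 2.0)` gives an overall speedup of 4/3, which shows how the unaccelerated part bounds the gain.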
A design flow for developing Fast Fourier Transform devices is proposed, using a method that synthesizes algorithmic operation devices from graphical representations of algorithms. The devices are synthesized automatically for various numbers of input data and different word lengths, and a comparative evaluation is performed.
This paper presents an educational platform for digital system practices with a conventional PCI bus interface, based on reconfigurable hardware, which is especially useful for designing hardware accelerators and systems with a PCI bus interface. The aim of the platform is to provide students with a single tool to develop rapid prototypes that covers all aspects involved in the study of digital systems...
This paper presents an alternative hardware implementation of a P1619 XTS-AES architecture, compared with the typical designs proposed to date. The implementation is based on the use of two AES cores for simultaneous Tweak value calculation and block encoding/decoding operations. The implementation is efficient for burst mode of operation, reducing the required time by 25%, exploiting...
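As background on the Tweak calculation mentioned above: in XTS mode as generally described, the tweak for each successive 16-byte block is the previous tweak multiplied by the primitive element in GF(2^128), using the reduction polynomial x^128 + x^7 + x^2 + x + 1 and a little-endian byte order. The sketch below shows only that per-block update step. It does not perform the AES encryption that produces the initial tweak, it is not the paper's two-core architecture, and the function name is a hypothetical one chosen for illustration.

```python
BLOCK_BYTES = 16
MASK_128 = (1 << 128) - 1
REDUCTION = 0x87  # low bits of x^128 + x^7 + x^2 + x + 1


def xts_next_tweak(tweak: bytes) -> bytes:
    """Multiply a 16-byte XTS tweak by the primitive element alpha (x)
    in GF(2^128), treating the tweak as a little-endian integer."""
    if len(tweak) != BLOCK_BYTES:
        raise ValueError("tweak must be exactly 16 bytes")
    value = int.from_bytes(tweak, "little") << 1
    if value >> 128:
        # The shifted-out bit represents x^128; fold it back in.
        value = (value & MASK_128) ^ REDUCTION
    return value.to_bytes(BLOCK_BYTES, "little")
```

Because this update depends only on the previous tweak and not on the data block, it can proceed alongside block encryption, which is consistent with the abstract's use of separate cores for the two tasks.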
Standard processors have the logical resources necessary for implementing various computing platforms capable of executing applications in different fields such as communication, command and control or signal processing. However, the sequential execution of instructions, the speed limit imposed by access to the memory block and the standard architecture of the processors dictate some of...
This paper describes a hardware implementation of Internet Protocol version 4. Routing and addressing features were integrated with Network Interfaces and synthesized to a Stratix II FPGA device. Our work produced two implementations of a full-duplex Internet Protocol version 4. The first implementation consists of a reference design and the second uses the same design but with more buffer space...
Traditionally, digital signal processing (DSP) is performed using fixed-point or integer arithmetic. The algorithm is carefully mapped into a limited dynamic range, and scaled through each function in the datapath. This requires numerous rounding and saturation steps, and can adversely affect the algorithm performance. Use of floating-point arithmetic provides a large dynamic range and greatly simplifies...
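The rounding and saturation steps that the abstract attributes to fixed-point datapaths can be illustrated with a minimal sketch. The Q15 format (16-bit signed, 15 fractional bits) and the helper names are assumptions made for the example, not details taken from the paper.

```python
Q15_SCALE = 1 << 15          # 32768: one unit in Q15 represents 2**-15
Q15_MAX = Q15_SCALE - 1      # 32767, largest representable value
Q15_MIN = -Q15_SCALE         # -32768, smallest representable value


def saturate_q15(value: int) -> int:
    """Clamp an integer into the 16-bit signed Q15 range."""
    return max(Q15_MIN, min(Q15_MAX, value))


def to_q15(x: float) -> int:
    """Quantize a real number to Q15, rounding then saturating.
    Python's round() rounds exact halves to the nearest even integer."""
    return saturate_q15(round(x * Q15_SCALE))


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values: the product has 30 fractional bits, so it
    is rounded (add half an LSB) and shifted back to 15, then saturated."""
    product = a * b
    rounded = (product + (1 << 14)) >> 15
    return saturate_q15(rounded)
```

For instance, `to_q15(1.0)` saturates to 32767 because +1.0 is not representable in Q15, and `q15_mul(-32768, -32768)` saturates for the same reason. A floating-point datapath avoids both special cases within its much larger dynamic range.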
New system-on-chip (SoC) design techniques are necessary to address the communication requirements for future SoC. The currently used bus-centered approach becomes an inappropriate choice because of its limitation as a shared medium that restricts the scalability of the communication architecture. Also, long bus wires result in performance degradation due to the increased capacitive load. The long...
In this paper we describe a methodology for rapid hardware prototyping of part of a digital signal processing system described in Simulink. The paper explains the main technical problems encountered when moving to hardware from a purely functional description, and the solutions proposed for them. The methodology is applied to a proven model, from the architecture co-simulation, to the real hardware implementation...
The fine-grained parallelism inherent in FPGAs has encouraged their use in packet processing systems. To facilitate debugging and performance evaluation, designers require on-chip monitors that provide abstractions of low-level details and a system-level perspective. In this paper, we present five architectures that permit transaction-based communication-centric monitoring of packet processing systems...
In this paper, we study and compare the performance of bus-based and mesh-based with spidernet NoC-based infrastructure in Altera's FPGA. We first analyze the inner latency performance of the NoC infrastructure among routers, and we provide two modes to emulate the specific application on those infrastructures for the purpose of performance comparison. It is shown that NoC-based infrastructure...
Embedded system design is increasingly based on single-chip multiprocessors because of the high performance and flexibility requirements. In contrast to desktop multi-core and conventional multiprocessors, embedded multiprocessors are tightly constrained in the number of external DDR memories due to pin constraints, which in turn may affect concurrent access for embedded parallel software implementation. In...
In this paper, a novel mechanical switch device, the suspended-gate FET, is applied to FPGA development. This device offers an almost ideal subthreshold swing and hysteretic resistance switching, opening opportunities for low-power applications. The proposed device can be used as the building block of programmable elements and memory of an FPGA. Based on this device, the proposed FPGA architecture,...
The implementation of a recently proposed IP core for an efficient motion estimation co-processor is considered. Some significant functional improvements to the base architecture are proposed, together with a detailed description of the interface between the co-processor and the main processing unit of the video encoding system. Then, a performance analysis of two distinct implementations...
Embedded web servers have a growing presence in a wide range of fields related to consumer electronics and industrial applications. FPGAs are a valid alternative for implementing these systems, offering additional advantages over traditional architectures based on microprocessors or microcontrollers. In this paper we introduce two web server implementations on FPGA devices. The first uses an...
In this paper, results of a simulative performance evaluation of RISC-based SoC platforms for networking applications are presented. We use our SystemC simulation environment that is calibrated with a reference implementation on an FPGA-based prototyping environment, consisting of a single RISC-CPU, memory system, Ethernet MAC and an autonomous DMA engine. In order to achieve precise results, a real...