Modular polynomial multiplication is the most computationally intensive operation in many homomorphic encryption schemes. In order to accelerate homomorphic computations, we propose a software/hardware (SW/HW) co-designed accelerator integrating fast software algorithms with a configurable hardware polynomial multiplier. The hardware accelerator is implemented through a High-Level Synthesis (HLS)...
We discuss techniques by which C descriptions corresponding to given implementation designs can be automatically reproduced. Once generated, they are compared with the original C designs to make sure that the implementation designs are correct, i.e., that they are equivalent to the original C designs. In cases where an ECO (Engineering Change Order) is applied to the implementation designs, the...
In this paper, we describe our Trax player, implemented on an FPGA for the FPT2016 design competition and designed using a high-level synthesis tool, Xilinx Vivado HLS. The previous version of our design had recursive functions in the AI part, which could not be synthesized by Vivado HLS. They were processed by software using a hard IP processor. We rewrite them as non-recursive functions to synthesize...
The development of SLAM algorithms in the era of autonomous navigation, together with the growing demand for autonomous robots in place of human beings, has raised the question of how to reduce the computational complexity of these algorithms so that they can operate in real time. Our work aims to take advantage of high-level synthesis (HLS) on FPGAs to design a real-time SLAM application. Precisely, we evaluate...
Future technologies are predicted to raise major reliability concerns for digital systems due to the growing impact of radiation-based transient faults. Radiation strikes may produce upsets that last over several clock cycles and that can affect multiple functional units in the same (equivalent) way. This will be a problem in the future because, as technology evolves, device geometry continues to shrink massively along...
Convolutional Neural Networks (CNNs) are a particular type of Artificial Neural Network (ANN) inspired by cells in the primary visual cortex of animals, and represent the state of the art in image recognition and classification. Nowadays, this supervised learning technique is very popular in Big Data analytics. In this context, due to the huge amount of data to be processed, it is crucial to find...
High-Level Synthesis (HLS) tools have been developed to raise the abstraction level of the hardware design process by using models such as high-level programming languages (e.g. C/C++), Domain Specific Languages and Graphs. However, despite their advances in the last decade, the available HLS tools still require broad hardware knowledge from the designer, which prevents a bigger reduction in the design...
Field Programmable Gate Arrays (FPGAs) have been extensively used to accelerate applications in many digital domains, such as image and signal processing. These applications have been abundantly tested in high-level languages such as C, C++ and Matlab. Many standard libraries, such as OpenCV, exist for end-to-end image processing solutions. Applications centered around...
In this paper, we describe our implementation of a Trax player on the Xilinx Zynq programmable SoC for the ICFPT2015 design competition. Our design uses the Alpha-Beta pruning algorithm to find the best next move in the game tree. As the thinking time for one turn is limited to one second by the competition rules, we also use an iterative deepening algorithm and timer hardware. Evaluation function which is used for the...
Image scaling is a fundamental algorithm used in a wide range of digital image applications. In this paper, we propose an efficient VLSI architecture for a novel edge-directed linear interpolation algorithm. Our VLSI design is implemented using a high-level synthesis (HLS) tool, which generates RTL modules from C/C++ functions. HLS provides significantly improved design productivity compared to the...
“It worked perfectly on the demo programs, but just doesn't work for slightly complicated code.” One of our researchers made this comment about his previous experience with High Level Synthesis (HLS). On the other hand, when I first preached HLS to one of our hardware engineers, he was immediately sold after I told him that HLS can perform automatic pipelining. After decades of academic and industrial...
Nowadays, security properties of information exchange and communication, such as authentication, confidentiality, and privacy, are essential. Cryptographic algorithms are basic components of security. Encryption algorithms are classified into various types such as block and stream ciphers. In constrained environments and embedded systems such as RFID, lightweight and low cost cryptographic...
Design teams are increasingly looking for design flows that can rapidly lead to high performance and low power implementation of DSP algorithms. Model-based design can satisfy this requirement, but it must be (1) coupled with efficient high-level synthesis support in order to provide good Quality of Results, and (2) controlled to derive the desired area/performance/throughput trade-off. We present...
We propose a framework that enables intensive computation on ultra-low power devices with discontinuous energy-harvesting supplies. We devise an optimization algorithm that efficiently partitions the applications into smaller computational steps during high-level synthesis. Our system finds low-overhead checkpoints that minimize recomputation cost due to power losses, then inserts the checkpoints...
The automatic generation of hardware implementations for a given algorithm is generally a difficult task, especially when data dependencies span across multiple iterations such as in iterative stencil loops (ISLs). In this paper, we introduce an automatic design flow to extract parallelism from an ISL algorithm and perform a design space exploration to identify its best FPGA hardware implementation,...
High level synthesis using C/C++ code of applications is rapidly gaining ground. However, support for calculations is restricted to elementary algebraic operations of addition, multiplication, subtraction and division. Support for transcendental functions is generally unavailable and is inefficient where available. Transcendental functions are an important part of high performance computing. A framework...
This work presents a high-level synthesis methodology that uses the abstract state machines (ASMs) formalism as an intermediate representation (IR). We perform scheduling and allocation on this IR, and generate synthesizable VHDL. Using ASMs as an IR has the following advantages: 1) it allows the specification of both sequential and parallel computation, 2) it supports an extension of a clean...
FPGAs have improved significantly in terms of performance and capacity over the last 20 years. The scale of FPGA-based designs has also sparked demand for high-level synthesis to handle complicated applications. A well-known intricate application is the FMM (Fast Multipole Method) algorithm for the N-body problem, which is so complicated that it was not implemented on FPGA as reported in literature...
We present a high-level synthesis flow for mapping an algorithm description (in C) to a provably equivalent register transfer level (RTL) description of hardware. This flow uses an intermediate representation which is an orthogonal factorization of the program behavior into control, data and memory aspects, and is suitable for the description of large systems. We show that optimizations such as arbiter-less...
The paper presents a method for automatic RTL-interface synthesis for a given C++ function as well as for a given SystemC interface. This task is very important in the High-Level Synthesis design flow, where design entry is usually done in some abstract language (e.g. C++). Because a source high-level description targets different SoC architectures or protocols, it is necessary to generate relevant pin-level...