The satisfiability (SAT) problem is to find an assignment of binary values to the variables that satisfies a given clausal normal form (CNF). Many practical application problems can be transformed to SAT problems, and many SAT solvers have been developed. The SAT problem is, however, NP-complete and its computational cost is very high. To reduce the computational cost, preprocessors are widely...
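As an illustration of the problem statement only (not the preprocessor the abstract goes on to describe), the following minimal sketch checks satisfiability by exhaustive search. It assumes a CNF encoded as a list of clauses, where each clause is a list of non-zero integers and a negative integer denotes a negated variable; the function names `satisfies` and `brute_force_sat` are hypothetical.

```python
from itertools import product


def satisfies(cnf, assignment):
    """Return True if every clause contains at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def brute_force_sat(cnf, num_vars):
    """Try all 2**num_vars assignments; return a satisfying one or None."""
    for values in product([False, True], repeat=num_vars):
        # Variables are numbered from 1, matching the literal encoding.
        assignment = {i + 1: v for i, v in enumerate(values)}
        if satisfies(cnf, assignment):
            return assignment
    return None


# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
example_cnf = [[1, -2], [2, 3], [-1, -3]]
model = brute_force_sat(example_cnf, 3)
print(model)
```

The exponential loop over all assignments is what makes the naive approach impractical and motivates solvers and preprocessing.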
Loop pipelining is a key transformation in high-level synthesis tools as it helps maximize both computational throughput and hardware utilization. Nevertheless, it becomes less effective when dealing with small trip-count inner loops, as the pipeline latency overhead quickly limits its efficiency. Even if it is possible to overcome this limitation by pipelining the execution of a whole loop...
The π-calculus is a process algebra originally designed for modelling communicating systems. In this work, it is applied to the design of schedules for partial dynamic reconfiguration, which denote when modules become active and which channels they use for communication. While the execution of the π-calculus in software is possible, a direct execution in hardware is desirable for two reasons: Firstly,...
In this paper, we describe a generic approach for integrating a dynamically reconfigurable device into a general purpose system interconnected with a high-speed link. The system can dynamically install and execute hardware instances of functions to accelerate parts of a given software code. The hardware descriptions of the functions (bitstreams) are inserted into the executable binary running on the...
This paper presents software and hardware co-design of an FPGA-based Connect6 solver with scalable streaming cores. The solver searches a game tree by using the minimax algorithm with alpha-beta pruning. Since evaluation of board situations is the most time-consuming part, we adopted an approach to accelerate it with dedicated hardware while other parts are executed by software. We design a custom...
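For readers unfamiliar with the search technique named in this abstract, here is a generic sketch of minimax with alpha-beta pruning. It is not the paper's solver: the game tree is assumed to be a nested Python list whose leaves are precomputed evaluation scores, and the function name `alphabeta` is hypothetical.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax value of a nested-list game tree with alpha-beta pruning."""
    if not isinstance(node, list):
        return node  # Leaf: a static evaluation score.
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # The minimizing parent will never choose this branch.
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break  # The maximizing parent already has a better option.
    return best


# Root is a maximizing node with three minimizing children.
tree = [[3, 5], [2, 9], [0, 1]]
result = alphabeta(tree, True)
print(result)
```

In a real solver the leaf scores would come from a board evaluation function, which is the part the abstract says is offloaded to dedicated hardware.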
In recent years, object detection has been more frequently integrated with other vision processing functions, where it is used to acquire regions of interest, and it is widely adopted in portable devices such as digital cameras capable of automatically focusing on faces. In applications targeting those devices, limitations in both hardware resources and power supply mean an efficient utilization of hardware...
In this paper, we present a strategy and an FPGA implementation of a Connect6 player submitted to the FPT 2011 Design Competition. Connect6 is a two-player strategy board game. The winner of the game is the player who first gets six pieces of his color in a connected horizontal, vertical or diagonal line. We assign a strategic value to each potential move depending on the current board configuration...
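The winning condition stated above can be sketched as a simple board scan. This is an illustrative software check, not the paper's FPGA design: the board is assumed to be a square list of lists holding `0` for empty and a player id otherwise, the 19x19 size is an assumption, and `has_six_in_line` is a hypothetical name.

```python
def has_six_in_line(board, player):
    """Return True if `player` has six connected pieces in any line."""
    size = len(board)
    # Right, down, down-right diagonal, down-left diagonal.
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for r in range(size):
        for c in range(size):
            if board[r][c] != player:
                continue
            for dr, dc in directions:
                end_r, end_c = r + 5 * dr, c + 5 * dc
                # Skip lines whose sixth cell would fall off the board.
                if not (0 <= end_r < size and 0 <= end_c < size):
                    continue
                if all(board[r + k * dr][c + k * dc] == player for k in range(6)):
                    return True
    return False


# Assumed 19x19 board with a six-piece diagonal for player 1.
board = [[0] * 19 for _ in range(19)]
for k in range(6):
    board[4 + k][7 + k] = 1
print(has_six_in_line(board, 1))
```

Scanning only four directions from each cell is enough, because every line is found from its first cell in scan order.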
FPGAs are an attractive platform for applications with high computation demand and low energy consumption requirements. However, design effort for FPGA implementations remains high, often an order of magnitude larger than design effort using high-level languages. Instead of this time-consuming process, high-level synthesis (HLS) tools generate hardware implementations from high-level languages (HLL)...
Reconfigurable computers have started to appear in the HPC landscape, albeit at a slow pace. Adoption is still being hindered by the design methodologies and slow implementation cycles. Recently, methodologies based on High Level Synthesis (HLS) have begun to flourish and the reconfigurable supercomputing community is slowly adopting these techniques. In this paper we took a geophysics application...
Dynamic Partial Reconfiguration (DPR) enables software-like flexibility in hardware designs by allowing some of the logic on a Field Programmable Gate Array (FPGA) to be reconfigured while the rest continues to operate. However, such flexibility introduces challenges for verifying DPR design functionality because there is no straightforward way to simulate DPR at Register Transfer Level (RTL). This...
Soft-processors, instruction processors implemented in FPGA technology, are often customizable to support domain-specific optimization. However, the correctness of customized soft-processors, executing the associated machine code, is often not obvious. This paper proposes a novel approach for verifying the implementation of an application program for a customized soft-processor, based on the ACL2 theorem...
This paper describes a design methodology to implement piecewise-affine (PWA) functions on FPGAs, based on representation methods from lattice theory. An off-line automatic processing starts at the algorithmic formulation of the problem, obtains the parameters required by a parameterized digital architecture, and ends with the bitstream to program an FPGA. The methodology has been proven to implement...
Artificial Neural Networks (ANN) have proven to be effective in solving various emerging biomedical applications through specialized ANN hardware. Unfortunately, these ANN-based biomedical systems are increasingly vulnerable to both transient and permanent faults, potentially imposing serious threats to human well-being. Inspired by the self-healing and self-recovery mechanisms of the human nervous...
Cryptographic message authentication is a growing need for FPGA-based embedded systems. In this paper, a customized FPGA implementation of a GHASH function that is used in AES-GCM, a widely-used message authentication protocol, is described. The implementation limits GHASH logic utilization by specializing the hardware implementation on a per-key basis. The implemented module can generate a 128-bit...
The recent emergence of 3D partially reconfigurable FPGAs implies that we need efficient online hardware task scheduling and placement algorithms for such architectures. However, the algorithms available in the literature for 3D FPGAs create a “blocking-effect”. That is, these algorithms tend to make a wrong decision in finding a location of each arriving hardware task during runtime scheduling and...
Due to the runtime flexibility of modern dynamically reconfigurable SRAM-based FPGAs, FPGA devices have become an attractive platform for developing system-on-chips (SoCs) for space applications (space SoCs). However, since the FPGA's SRAM is highly susceptible to space radiation, system reliability is a primary concern for space SoCs. To maintain system reliability and mitigate space radiation effects,...
Adaptive systems have the ability to respond to environmental conditions by modifying their processing at runtime. While this is easy to do in software systems, modern algorithms can be computationally expensive, requiring powerful processors. At the same time hardware is not as flexible. Field programmable gate arrays (FPGAs) are recognised as being suitable for adaptive systems implementation, due...
In many application domains, data are represented using large graphs involving millions of vertices and edges. Graph analysis algorithms, such as finding short paths and isomorphic subgraphs, are largely dominated by memory latency. Large cluster-based computing platforms can process graphs efficiently if the graph data can be partitioned, and on a smaller scale partitioning can be used to allocate...
In this paper we present a simulation framework for rapid testing of custom hardware peripherals designed to be incorporated in a System on Chip (SoC) architecture. The QEMU processor emulator is extended to allow attaching a simulation environment to the system bus, such that simulation can perform bus transactions, and interact with the emulated processor. We demonstrate multiple levels of simulation...
In this work, we explore heterogeneous computing hardware, including CPUs, GPUs and FPGAs, for scientific computing. We study system metrics such as throughput, energy efficiency and temperature, and formulate the problem of workload allocation among computing hardware in mathematical models with regard to the three metrics. The workload allocation approach is evaluated using Linpack on a hardware...