Finite state machines (FSMs) are widely used computation models in many application domains. These embarrassingly sequential applications, with their irregular memory access patterns, perform poorly on conventional von Neumann architectures. The Micron Automata Processor (AP) is an in-situ memory-based computational architecture that accelerates non-deterministic finite automata (NFA) processing in hardware...
A hybrid trie-based longest prefix match (LPM) search scheme is proposed in this paper to handle the current prefix growth efficiently. The proposed scheme is built around two sub-schemes: the first is the tree bitmap structure and the second is a trie-based data structure. The main idea of this approach is to simplify the required prefix operations, viz., insertion,...
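For readers unfamiliar with longest prefix match, the basic operation can be illustrated with a plain binary trie. This is only a minimal sketch of the general technique, not the hybrid tree bitmap scheme proposed in the paper; the names `TrieNode`, `BinaryTrie`, `insert`, and `lookup` are hypothetical, and the sketch assumes all prefixes in one trie belong to a single address family.

```python
from ipaddress import ip_address, ip_network


class TrieNode:
    """One node of a binary trie; next_hop is set only where a prefix ends."""
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]  # child for bit 0 and bit 1
        self.next_hop = None


class BinaryTrie:
    """Illustrative one-bit-per-level trie (assumes a single address family)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix, next_hop):
        net = ip_network(prefix)
        bits = int(net.network_address)
        width = net.max_prefixlen
        node = self.root
        # Walk the prefix bits from the most significant bit down.
        for i in range(net.prefixlen):
            bit = (bits >> (width - 1 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = TrieNode()
            node = node.children[bit]
        node.next_hop = next_hop

    def lookup(self, address):
        addr = ip_address(address)
        bits = int(addr)
        width = addr.max_prefixlen
        node = self.root
        best = node.next_hop  # a default route (/0), if one was inserted
        for i in range(width):
            bit = (bits >> (width - 1 - i)) & 1
            node = node.children[bit]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop  # remember the longest match seen so far
        return best


table = BinaryTrie()
table.insert("10.0.0.0/8", "A")
table.insert("10.1.0.0/16", "B")
print(table.lookup("10.1.2.3"))
```

A lookup visits at most one node per address bit, which is the cost that multi-bit structures such as tree bitmaps aim to reduce by consuming several bits per step.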
The current Internet routing ecosystem is neither sustainable nor economical. More than 711K IPv4 routes and more than 41K IPv6 routes exist in current global Forwarding Information Bases (FIBs), and growth rates are increasing. This rapid growth has serious consequences, such as creating the need for costly FIB memory upgrades and increased potential for Internet service outages. And while FIB memories...
FPGAs are being incorporated into contemporary datacenters in order to improve computational capacity, power consumption, and processing latency. Efficiently integrating FPGAs in datacenters is, however, quite challenging. Ideally, smaller tasks could share a device and the cloud management layer would be able to partially reconfigure the device to allocate its free resources to incoming tasks. Moreover,...
The architecture of the Microsoft Catapult II cloud places the accelerator (FPGA) as a bump-in-the-wire on the way to the network and thus promises a dramatic reduction in latency as layers of hardware and software are avoided. We demonstrate this capability with an implementation of the 3D FFT. Next we examine phased application elasticity, i.e., the use of a reduced set of nodes for some phases...
Anonymous networks provide user privacy by protecting identity. The onion routing (TOR) project is an anonymous Internet network that attracts many researchers and clients because of its simplicity and scalability. However, analyzing TOR performance within live TOR networks is difficult because of its distributed and security-oriented nature. This paper presents a TOR network...
A novel multi-chip System-in-Package (SiP) was designed specifically for automotive applications. This paper discusses the challenges and approaches of enabling the dual x32 LPDDR4 channels with external DRAMs running at 1866 MHz. The system-level design space was explored to achieve better SI performance. Several key design parameters were studied separately to investigate their impact on the SI performance...
Balancing search time against storage space is a problem in routing lookup, and the proposed algorithm solves it to some extent. It is based on the IPv6 prefix distribution and adopts different approaches to divide and compress different prefixes. Prefixes that can be divided exactly are handled with concentrated compression. Other prefixes that cannot be divided exactly are handled with a multi-branch tree method. According...
In this paper, we architect large-scale SRAM arrays with monolithic 3D (M3D) integration technology. We introduce M3D-based SRAM arrays with three different ways of integration: M3D-R (vertical routing-only), M3D-VBL (vertical bitline), and M3D-VWL (vertical wordline). We also apply M3D-based SRAM arrays to last-level caches: tag arrays for eDRAM LLCs and data arrays for SRAM LLCs. The proposed LLCs...
This paper discusses multi-point address channel design in fly-by topology for high-speed memory interfaces. Waveform behaviors at DRAM locations along the channel are examined in depth, with eye opening data for various channel design factors and device termination settings. Eye opening degrades more prominently on the DRAM nearest the controller due to ring-backs from high-frequency reflections...
We can enhance the performance and efficiency of deflection-routed FPGA overlay NoCs by exploiting the cascading feature of the Xilinx UltraScale BlockRAMs. This allows us to (1) harden the multiplexers in the NoC switch crossbars, and (2) efficiently add buffering support to deflection-routing. While buffering is not required for correct operation of a deflection-routed NoC, it can boost network throughputs...
Static Random Access Memory (SRAM)-based routing multiplexers, whatever structure is employed, share a common limitation: their area, delay and power increase linearly with the input size. This property results in most SRAM-based FPGA architectures typically avoiding the use of large multiplexers. Resistive Random Access Memory (RRAM)-based multiplexers, built with one-level structure, have a unique...
Many important applications demand large amounts of on-chip memory both to fully utilize an FPGA's computational capacity and to minimize energy-consuming off-chip memory accesses, leading some recent commercial FPGAs to add higher-capacity on-chip block RAMs (BRAMs). While memory is becoming more important to FPGA designs, SRAM scaling is becoming more difficult because of increasing device variation...
This work analyzes the effect of the different design stages on the failure rate of circuits implemented in FPGAs. A bitstream-based SEU emulation platform is used to inject faults in order to analyze the critical bits of the circuit. Experiments are done on two different testbenches, an FIR filter and a CORDIC chain. Tests consist of loading different variations of the designs in order to estimate...
As more than 40K service providers are advertising 600K or more IP prefixes, the scalability of routing has emerged as a matter of great concern. In this paper, to explore a spectrum of designs, we consider a Cloud-Assisted Routing (CAR) framework which follows a hybrid and opportunistic approach by keeping the high-priority tasks at the router and using an adaptive router-cloud integration when beneficial...
With power consumption becoming a critical processor design issue, specialized architectures for low power processing are becoming popular. Several studies have shown that neural networks can be used for signal processing and pattern recognition applications. This study examines the design of memristor based multicore neural processors that would be used primarily to process data directly from sensors...
By analyzing the characteristics of the FPGA circuit in the space and time dimensions, the circuit is divided into levels of the form “computing unit + memory/register”. Combining the location importance, the degree of connection among the nodes, and each node's own soft error probability, an importance analysis model is proposed. The testing points are then optimized based on the importance of each...
Embedded SRAM-based memory sub-systems are an integral part of modern SoCs and have a large area footprint. Huge memory requirements are typically met by using an array of SRAM instances, and optimal selection of these memory instances becomes imperative for SoC designers. We propose a framework based on the following approach: pre-sort a list of the most suitable SRAM instances; create a...
We can embed the crossbar functionality of NoC (network-on-chip) routers onto the hard multiplexers of Xilinx DSP48E primitives to support resource efficient mapping of FPGA overlay NoCs. This embedding also permits the use of dedicated hard wiring resources of the DSP cascade links to support vertical NoC channels. This unique mapping allows us to significantly reduce soft logic (LUTs+FFs) utilization...
Router architecture plays an important role in achieving high throughput and low latency in a Network-on-chip design. In this paper, an output-buffered router is emulated using the concept of a Distributed Shared Buffer Router. The main focus of the design is to increase throughput and lower latency with minimum area and power overhead.