The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Creating quick and dirty prototypes is a simple and effective way to demonstrate the feasibility of new ideas in network research. Though, small scale proof-of-concepts may lack the performance needed to apply them to real world test cases. Thanks to powerful packet processing frameworks such as netmap and DPDK, high-performance packet forwarding systems can be implemented in software today.We present...
Network research relies on packet generators to assess performance and correctness of new ideas. Software-based generators in particular are widely used by academic researchers because of their flexibility, affordability, and open-source nature. The rise of new frameworks for fast IO on commodity hardware is making them even more attractive. Longstanding performance differences of software generation...
The IEEE 802.11ad standard allows wireless devices to operate in the unlicensed spectrum band of 60 GHz. By utilizing the channel with 2.16 GHz width, the devices are able to transmit at multi-Gigabit data rates that potentially satisfy demanding requirements of quality of services. Additionally, the advent of off-the-shelf IEEE 802.11ad device motivates research efforts to exploit this 60 GHz opportunity...
Virtual Private Networks (VPN) are an established technology that provides users a way to achieve secure communication over an insecure communication channel, such as the public Internet. It has been widely accepted due to its flexibility and availability on many platforms. It is often used as an alternative to expensive leased lines. In traditional setups, VPN endpoints are set up in hardware appliances,...
Modular multiplication is a fundamental and performance determining operation in various public-key cryptosystems. High-performance modular multipliers on FPGAs are commonly realized by several small-sized multipliers, an adder tree for summing up the digit-products, and a reduction circuit. While small-sized multipliers are available in pre-fabricated high-speed DSP slices, the adder tree and the...
Network function virtualization (NFV) introduces great flexibility in designing software-based network appliances to reduce cost and accelerate service deployment for network operators. However, with the fast development of high speed network of 100 GbE and beyond, how to efficiently design virtual network functions (VNF) on commodity servers has become a challenging problem. Although the advances...
This paper deals with the recently introduced class of Non-Surjective Finite Alphabet Iterative Decoders (NS-FAIDs). First, optimization results for an extended class of regular NS-FAIDs are presented. They reveal different possible trade-offs between decoding performance and hardware implementation efficiency. To validate the promises of optimized NS-FAIDs in terms of hardware implementation benefits,...
We present methods for the effective application level reordering of non-blocking RDMA operations. We supplement out-of-order hardware delivery mechanisms with heuristics to account for the CPU side overhead of communication and for differences in network latency: a runtime scheduler takes into account message sizes, destination and concurrency and reorders operations to improve overall communication...
High performance computing (HPC) applications have parallel code sections that must scale to large numbers of cores, which makes them sensitive to serial regions. Current supercomputing systems with heterogeneous or asymmetric CMPs (ACMP) combine few high-performance big cores for serial regions, together with many low-power lean cores for throughput computing. The low requirements of HPC applications...
State-of-the-art studies show that FPGA-based hardware merge sorters (HMSs) can achieve superior performance compared with optimized algorithms on CPUs and GPUs. The performance of any HMS is proportional to its operating frequency (F) and the number of records that can be output each cycle (E). However, all existing HMSs have a problem that F drops significantly with increasing E due to the increase...
State-of-the-art GPU chips are designed to deliver extreme throughput for graphics as well as for data-parallel general purpose computing workloads (GPGPU computing). Unlike graphics computing, GPGPU computing requires highly reliable operation. The performance-oriented design of GPUs requires to jointly evaluate the vulnerability of GPU workloads to soft-errors with the performance of GPU chips....
Markov Chain Monte Carlo (MCMC) algorithms are used to obtain samples from any target probability distribution and are widely used in stochastic processing techniques. Stochastic processing techniques such as machine learning and image processing need to compute large amounts of data in real-time, thus high throughput MCMC samplers are of utmost importance. Parallel Tempering (PT) MCMC has proven...
Combined (soft-hard) method for decoding block turbo-product codes is proposed in the paper. The method allows leveraging advantages of soft input data usage with the speed of hard-decoding procedure. The main peculiarity of the method is rule-based decoding stage. The proposed approach simplifies calculation procedure and reaches better correction ability than hard-decision decoder. Mathematical...
High-level decisions have the most impact on power consumption, but the effect of those decisions cannot be known until the hardware is implemented. This paper walks the reader through an industrial high-level low-power design methodology that enables the designer to consider and quantitatively evaluate a broad range of hardware implementations to find the most power-efficient architecture. This paper...
Abstract-In the present paper we demonstrate the novel technique to apply the recently proposed approach of In-Place Appends - overwrites on Flash without a prior erase operation. IPA can be applied selectively: only to DB-objects that have frequent and relatively small updates. To do so we couple IPA to the concept of NoFTL regions, allowing the DBA to place update-intensive DB-objects into special...
Encoder and decoder implementations of the High Efficiency Video Coding (HEVC) standard have been subject to many optimization approaches since the release in 2013. However, the real-time decoding of high quality and ultra high resolution videos is still a very challenging task. Especially entropy decoding (CABAC) is most often the throughput bottleneck for very high bitrates. Syntax Element Partitioning...
Multichannel active noise control (MCANC) systems are commonly used in acoustic noise or vibration control, such as large-dimension ventilation ducts, open windows and mechanical structures. However, its computational load far exceeds the capabilities of digital signal processors (DSPs) and microcontrollers. Even the field programmable gate array (FPGA) cannot straightforwardly cope with the exponential...
Energy efficiency presents a significant challenge for stochastic computing (SC) due to the long random binary bit streams required for accurate computation. In this paper, a type of low discrepancy (LD) sequences, the Sobol sequence, is considered for energy-efficient implementations of SC circuits. The use of Sobol sequences improves the output accuracy of a stochastic circuit with a reduced sequence...
Real-time anomaly detection for streaming data is a desirable feature for mobile devices or unmanned systems. The key challenge is how to deliver required performance under the stringent power constraint. To address the paradox between performance and power consumption, brain-inspired hardware, such as the IBM Neurosynaptic System, has been developed to enable low power implementation of large-scale...
The widening gap between the processor and memory performance has led to the inclusion of multiple levels of caches in the modern multi-core systems. Processors with simultaneous multithreading (SMT) support multiple hardware threads on the same physical core, which results in shared private caches. Any inefficiency in the cache hierarchy can negatively impact the system performance and motivates...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.