Multipath TCP (MPTCP) has recently gained great attention from researchers and network application developers because it provides better bandwidth utilization and higher reliability. Utilizing MPTCP in datacenters provides a performance gain for applications. If the underlying network has a Software Defined Networking (SDN) architecture, the routing of the MPTCP subflows can be specialized...
In this paper, we present a Programmable SoC device with monolithically integrated RF-ADCs and RF-DACs in a 16nm FinFET process. The device includes a quad ARM Cortex-A53 and dual ARM Cortex-R5 processing subsystem, 750K programmable logic cells, 4000 DSP slices and four 32Gb/s serial transceivers. Each 14-bit RF-DAC operates at a sample rate of up to 6.4GS/s and can directly synthesize RF carriers up to...
In ultra-dense cellular networks, research on enhancing cell edge performance receives considerable attention. Based on interlaced clustering, we propose a heuristic sparse beamforming strategy to improve the cell edge throughput effectively in distributed antenna systems (DASs). In our scheme, each cluster pattern (CP) is divided into several adaptive cells, where all the remote antenna units...
This work presents a fully-differential, wideband, low-power 240 GHz multiplier-by-8 chain, manufactured in a 130 nm SiGe:C BiCMOS technology with fT/fmax = 300/500 GHz. A 30 GHz input signal is multiplied by 8 using a Gilbert cell based quadrupler and doubler and then amplified with a 3-stage cascode amplifier. To achieve wide bandwidth and optimize for power consumption, the power budget has been...
This paper addresses the problem of managing the scarce radio spectrum resource, based upon the Worldwide Interoperability for Microwave Access (WiMAX) network. Emerging and innovative techniques such as spectrum virtualization and network federation are adopted to enable the sharing of spectrum amongst mobile operators. A novel entity known as the Virtual Spectrum Hypervisor (VS-Hypervisor)...
Ground personnel at the tactical edge often lack data and analytics that would increase their effectiveness. To address this problem, this work investigates methods to deploy cloud computing capabilities in tactical environments. Our approach is to identify representative applications and to design a system that spans the software/hardware stack to support such applications while optimizing the use...
As servers are equipped with more memory modules, each with larger capacity, main-memory systems are now the second highest energy-consuming component in big-memory servers, and their energy consumption even becomes comparable to that of processors in some servers. Meanwhile, it is critical for big-memory servers and their main-memory systems to offer high energy efficiency. Prior work exploited mobile LPDDR...
In this paper, the non-uniform distributed power amplifier (NDPA) architecture is reviewed. Analysis of the structure highlights some of the issues and limitations one encounters when utilizing this topology for the monolithic implementation of wideband power amplifiers. Existing techniques for mitigating these issues are then discussed along with published benchmarks for NDPA MMICs that demonstrate the...
Stream join is a fundamental and computationally expensive data mining operation for relating information from different data streams. This paper presents two FPGA-based architectures that accelerate stream join processing. The proposed hardware-based systems were implemented on a multi-FPGA hybrid system with high memory bandwidth. The experimental evaluation shows that our proposed systems can outperform...
The arch project is a suite of mini-apps that have been developed with consistent coding practices, under a common infrastructural layer. Great emphasis has been placed on making the applications concise and easy to manipulate, while capturing the key performance characteristics of their proxied algorithmic classes. The suite is intended for traditional exploration of performance, portability and...
The performance of computer networks relies on how bandwidth is shared among different flows. Fair resource allocation is a challenging problem particularly when the flows evolve over time. To address this issue, bandwidth sharing techniques that quickly react to the traffic fluctuations are of interest, especially in large scale settings with hundreds of nodes and thousands of flows. In this context,...
This paper presents an SDR (Software-Defined Radio) implementation of an FMCW (Frequency-Modulated Continuous-Wave) radar using a USRP (Universal Software Radio Peripheral) device. The tools used in the project and the architecture of implementation with FPGA real-time processing and PC off-line processing are covered. This article shows the detailed implementation of an FMCW radar using a USRP device...
In recent years, many computer simulation codes have been developed as open-source software. Meanwhile, major processors have adopted vector processing for high performance computing. Hence, computer simulation codes need to be written in a vector-friendly manner to benefit from the computational potential of vector processing. Our study evaluates and analyzes the performance of...
The architecture of the Microsoft Catapult II cloud places the accelerator (FPGA) as a bump-in-the-wire on the way to the network and thus promises a dramatic reduction in latency as layers of hardware and software are avoided. We demonstrate this capability with an implementation of the 3D FFT. Next we examine phased application elasticity, i.e., the use of a reduced set of nodes for some phases...
The Internet of Things revolution requires long-battery-lifetime, autonomous end-nodes capable of probing the environment with multiple sensors and transmitting the data wirelessly after data fusion, recognition, and classification. Duty-cycling is a well-known approach to extend battery lifetime: it allows the hardware resources of the micro-controller (MCU) implementing the end-node to be kept in sleep mode...
This paper describes an op-amp with a novel Class-AB Push-Pull output stage employing a “constant-transconductance” cell for keeping the amplifier gain-bandwidth product constant over different load conditions. A biasing scheme is also examined to define the quiescent current of the op-amp. The circuit is part of the current sensing scheme for a DC-DC Buck converter. The proposed system has been built...
Cloud and high-performance computing storage systems comprise thousands of physical storage devices and use software that organizes them into multiple data tiers based on access frequency. The characteristics of these devices lend themselves well to these tiers, as devices have differing ratios of performance to capacity. Due to this, these systems have, for the past several years, incorporated...
Recently, architectures with scratchpad memory have been gaining popularity. These architectures consist of a low-bandwidth, large-capacity DRAM and a high-bandwidth, user-addressable, small-capacity scratchpad. Existing algorithms must be redesigned to take advantage of the high bandwidth while overcoming the constraint on the capacity of the scratchpad. In this paper, we propose an optimized edge-centric graph processing...
CPU-GPU heterogeneous systems are emerging as architectures of choice for high-performance energy-efficient computing. Designing on-chip interconnects for such systems is challenging: CPUs typically benefit greatly from optimizations that reduce latency, but rarely saturate bandwidth or queueing resources. In contrast, GPUs generate intense traffic that produces local congestion, harming...
Similarity search is a key to important applications such as content-based search, deduplication, natural language processing, computer vision, databases, and graphics. At its core, similarity search manifests as k-nearest neighbors (kNN) which consists of parallel distance calculations and a top-k sort. While kNN is poorly supported by today's architectures, it is ideal for near-data processing because...
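The two kNN stages named in the abstract above (distance calculations followed by a top-k selection) can be illustrated with a minimal software sketch. This is not the paper's near-data architecture; the function name `knn`, the choice of Euclidean distance, and the sample points are assumptions made purely for illustration.

```python
import heapq
import math


def knn(query, points, k):
    """Return (index, distance) pairs for the k points nearest to `query`.

    Illustrative sketch only: Euclidean distance is an assumption,
    not necessarily the metric used in the paper.
    """
    # Stage 1: distance calculations. Each distance is independent of the
    # others, which is why this stage parallelizes well.
    scored = [(math.dist(query, p), i) for i, p in enumerate(points)]
    # Stage 2: top-k selection. Keep only the k smallest distances.
    return [(i, d) for d, i in heapq.nsmallest(k, scored)]


if __name__ == "__main__":
    sample = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
    print(knn((0.0, 0.0), sample, k=2))
```

In this sketch the distance stage touches every point once and does very little arithmetic per byte read, which is the kind of access pattern the abstract associates with near-data processing.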