Vectorization has been an important method of using data-level parallelism to accelerate scientific workloads on vector machines such as Cray for the past three decades. In the last decade it has also proven useful for accelerating multimedia and embedded applications on short SIMD architectures such as MMX, SSE and AltiVec. Most of the focus has been directed at innermost loops, effectively executing...
This paper proposes an energy-efficient, high-throughput DRAM architecture for GPUs and throughput processors. In these systems, requests from thousands of concurrent threads compete for a limited number of DRAM row buffers. As a result, only a fraction of the data fetched into a row buffer is used, leading to significant energy overheads. Our proposed DRAM architecture exploits the hierarchical organization...
Higher integration lowers total cost of ownership (TCO) in the data center by reducing equipment cost and lowering energy consumption. However, higher integration also makes it difficult to achieve guaranteed quality of service (QoS) for shared resources. Unlike many other resources, memory bandwidth cannot be finely controlled by software in existing systems. As a result, many systems running critical,...
Dense small-cell deployments of 5G networks require a wireless backhaul to efficiently connect the small cells to the macro base station (BS). We envision a wireless backhaul architecture where cells are grouped into clusters. One small cell per cluster plays the role of a cluster head connecting the rest of the small cells to the macro cell via a mmWave MIMO link. We formulate the problem of jointly...
Pushing computing to exaflop levels is challenging given the desired targets for memory capacity, memory bandwidth, power efficiency, reliability, and cost. This paper presents a vision for an architecture that can be used to construct exascale systems. We describe a conceptual Exascale Node Architecture (ENA), which is the computational building block for an exascale supercomputer. The...
Convolutional neural networks (CNNs) are used in a variety of computer vision applications, ranging from object recognition and detection to scene understanding, owing to their exceptional accuracy. Different algorithms exist for CNN computation. In this paper, we explore the conventional convolution algorithm alongside a faster algorithm using Winograd's minimal filtering theory for efficient FPGA...
Historically, improvements in GPU-based high performance computing have been tightly coupled to transistor scaling. As Moore's law slows down, and the number of transistors per die no longer grows at historical rates, the performance curve of single monolithic GPUs will ultimately plateau. However, the need for higher performing GPUs continues to exist in many domains. To address this need, in this...
The Internet Engineering Task Force (IETF) has set up the 6TiSCH Working Group (WG) to focus on enabling IPv6 over the Time Slotted Channel Hopping (TSCH) mode of the IEEE 802.15.4-2015 standard. This paper describes our Contiki implementation of the 6TiSCH operation sublayer, 6top, which is used for dynamic scheduling of bandwidth between neighboring sensor nodes and facilitates the On the Fly...
Signal processing applications with an end-to-end analog interface are conventionally implemented using a hardcoded DSP or MCU unit to execute predefined algorithms. However, when the existing system needs to be reconfigured, the whole system must be reprogrammed. In an iterative process such as fine-tuning the system, this approach is also tedious. Hardware...
Involving optical networking in the data center network (DCN) architecture is an emerging trend. Optical networking, with its advantages of high capacity and cost efficiency, would improve network throughput for the cloud platform and improve performance for big data applications. How to realize the network-aware resource provisioning for the big data applications in the optical DCN architecture...
The application of a leaf-spine switching architecture is considered for Layer 1 circuit switching, such as with OTN ODU-based switching. Simple relationships are derived for the achievable leaf-spine switch capacity and scalability, given the number and capacity of individual leaf and spine switch elements, and bandwidth for leaf-spine interlinks. Considerations for optimizing cost with respect to...
The distributed, hop-by-hop routing architecture used in the Internet depends on algebraic properties of routing metrics to ensure traffic is forwarded over loop-free and best (LFB) paths. As the Internet evolves to serve as the converged communication infrastructure for the 21st century, the need for new metrics that violate these properties ([5], [9]) has been identified. Until recently, the behavior...
Cloud gaming is emerging as a viable alternative to traditional stand-alone and online computer games. The current trend is to design efficient remote-rendering models for cloud gaming that allow video packets of game scenes to be sent from servers to clients for an optimal game experience. This approach, however, still raises concerns in terms of network bandwidth and associated costs, becoming...
Processing-in-Memory (PIM) has recently been revisited as one of the most promising solutions to the bandwidth and power wall between processor and memory. In this paper, we propose a light-weight PIM architecture, approxPIM, which leverages approximate computing techniques to enable in-memory processing in a realistic 3D-stacked DRAM, Micron's Hybrid Memory Cube (HMC). Using the...
Medical Big Data (MBD) is a highly critical form of information, and handling MBD over a very large Wireless Network (WN) is a complex concern. WNs grow denser and more complicated every day. MBD can be stored on a wireless network after applying suitable wavelet compression. The quality of MBD is the most important factor, which calls for efficient and reliable transmission. Wavelet Compression...
Understanding application characteristics such as hardware resource requirements and communication patterns is key to building highly utilized high performance computing systems for target workloads at a reasonable cost and with available technology. This characterization drives design decisions for both hardware and software. The memory access pattern is a key factor, as data movement is a major...
Underwater computing systems have recently emerged as a new technology for underwater applications. In our previous work, we employed various nodes (processing nodes, gateway nodes, and sensing nodes) that are used to construct different candidate computing architectures (i.e. single, pipeline, and hybrid of parallel/pipeline). In this paper, we use these computing architectures in a real world scenario...
Today, storage is the worst bottleneck for video surveillance systems, which are an integral part of any institution. We trust our privacy and security to Network Video Recording (NVR) devices, which are programmed to overwrite recordings from the start once the storage is full. As a result, really important video data may be deleted on the run while less important data still stays somewhere on the storage...
CMOS logarithmic amplifiers based on piecewise linear approximation have usually been analyzed and designed by assuming ideal conditions. This paper discusses non-ideal factors in the components used, which play a key role in the performance of high-accuracy logarithmic amplifiers for sensing applications, and shows that the use of simplified mathematical models may lead to wrong conclusions. A gain-mismatch...
5G Radio Access Networks (RANs) are expected to increase their capacity by 1000x to handle the growing number of connected devices and increasing data rates. The concept of cloud-RAN (CRAN) has recently been proposed to decouple the digital units (DUs) and radio units (RUs) of base stations (BSs), and to centralize DUs into central offices. CRAN can ease the implementation of advanced radio coordination techniques,...