Fair arbitration of access to shared hardware resources is fundamental to obtaining low worst-case execution time (WCET) estimates in critical real-time systems, for which performance guarantees are essential. Several hardware mechanisms exist for managing arbitration in those resources (buses, memory controllers, etc.). They typically attain fairness in terms of the number of slots...
Embedded processors must rely on the efficient use of instruction-level parallelism to answer the performance and energy needs of modern applications. However, a limiting factor to better use available resources inside the processor concerns memory bandwidth. Adding extra ports to allow for more data accesses drastically increases costs and energy. In this paper, we present a novel memory architecture...
The energy-efficient router is one of the most important and promising devices on the roadmap towards green communication and networking. In recent years, automatic power scaling that adapts to real-time network traffic in a router has been proven practical on real hardware, as an implementation of the traffic-aware philosophy. In this paper, we further explore this direction, and present...
The applications of speech interfaces, commonly used for search and personal assistants, are diversifying to include wearables, appliances, and robots. Hardware-accelerated automatic speech recognition (ASR) is needed for scenarios that are constrained by power, system complexity, or latency. Furthermore, a wakeup mechanism, such as voice activity detection (VAD), is needed to power gate the ASR and...
Energy management is a key issue for mobile devices. On current Android devices, power management relies heavily on OS modules known as governors. These modules are created for various hardware components, including the CPU, to support DVFS. They implement algorithms that attempt to balance performance and power consumption. In this paper we make the observation that the existing governors are (1)...
Task colocation improves datacenter utilization but introduces resource contention for shared hardware. In this setting, a particular challenge is balancing performance and fairness. We present Cooper, a game-theoretic framework for task colocation that provides fairness while preserving performance. Cooper predicts users' colocation preferences and finds stable matches between them. Its colocations...
With current DRAM technology reaching its limit, emerging heterogeneous memory systems have become attractive to keep the memory performance scaling. This paper argues for using a small, fast memory closer to the processor as part of a flat address space where the memory system is composed of two or more memory types. OS-transparent management of such memory has been proposed in prior works such as...
During the transition to packet-switched on-chip networks we lose the relative timing and ordering of requests, which are essential for shared memory coherency and the communication of spikes in hardware-based artificial neural networks. We present a bufferless network architecture that enforces a time-based sharing of multi-hop single-cycle paths, providing guaranteed services at low cost. We guarantee...
Approximate computing aims to expose and exploit quality vs. efficiency tradeoffs to enable ever-more demanding applications on energy-constrained devices such as smartphones, or IoT devices. This paper makes the case for arbitrary quantization as a compelling approximation technique that exposes quality vs. energy tradeoffs and provides practical error guarantees. We present QAPPA (Quality Autotuner...
Signal processing applications with an end-to-end analog interface are traditionally implemented using a hardcoded DSP or MCU to execute predefined algorithms. However, when the existing system needs to be reconfigured, the whole system must be reprogrammed. In an iterative process, such as fine-tuning the system, this approach is also tedious. Hardware...
A spectrum of solutions is available for distributing content over the Internet today. One of these solutions is the content distribution network (CDN). CDNs need to make decisions, such as server selection and routing, to improve the performance of content distribution. But we must remember that performance may be limited by various factors such as packet loss in the network, a small receive...
The understanding of application characteristics such as hardware resource requirements and communication patterns is key in building highly utilized high performance computing systems for target workloads at a reasonable cost and with available technology. The characterization drives the design decision of both hardware and software. Memory access pattern is a key factor as data movement is a major...
Cache hierarchies have long been utilized to minimize the latency of main memory accesses by caching frequently used data closer to the processor. Significant research has been done to identify the most crucial metrics of cache performance. Though the majority of research focuses on measuring cache hit rates and data movement as the major cache performance metrics, cache utilization can be equally...
For many intensive computing tasks, simultaneous data access into multi-dimensional data arrays is highly restricted by its data mapping strategy and memory port constraint. As such, to increase memory accessing bandwidth, innovative memory partitioning and mapping algorithms have been proposed to simultaneously access multiple memory blocks through physically distributing data elements in the same...
The increasing popularity and ubiquity of various large graph datasets have caused renewed interest in graph partitioning. Existing graph partitioners either scale poorly against large graphs or disregard the impact of the underlying hardware topology. A few solutions have shown that nonuniform network communication costs may greatly affect performance. However, none of them considers the...
Network Function Virtualization (NFV) explores the virtualization technologies to offer Network-as-a-Service (NaaS) through connected virtual network functions. The network operations that were previously performed by specialized hardware are consolidated as software-based virtual network functions (VNFs). These VNFs can be implemented in the telecom clouds with high volume servers, switches and...
As more and more data-intensive applications have been moved to the cloud, the cloud network has become the new performance bottleneck for cloud applications. To boost application performance, the concept of coflow has been proposed to bring application-awareness into the cloud network. A coflow consists of many individual data flows, and a coflow is completed only when all its component flows are...
Software diagnosis on MPSoCs, the process of finding functional bugs or performance inefficiencies in complex hardware-software systems, is challenging. As both software and hardware complexity grow, the software observability decreases. At the same time, understanding the intended software behavior has become more difficult. We present an integrated approach which combines domain-specific representations...
This paper describes work with the end goal of quantifying the impact of threading on MPI performance models in order to enable justifications of model construction methods. To do so, it evaluates benchmarks on a specific, representative networked platform, and makes these contributions: 1) it evaluates the performance of point-to-point transmission between two multithreaded Message Passing Interface...
Reliable and secure wireless communication is fundamental to connectivity between smarter gadgets and people. This is even more true in defence applications, with their unique requirements and challenges. Broadly, the data link requirements in defence applications fall under two categories: first, robust, fool-proof, zero-downtime data links for telemetry, command and control; second, broad band data...