The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Editor’s note: Online learning for data analysis, categorization and anomaly detection has become a key technique in a range of adaptive embedded applications. In this article the authors propose a very low power encoder design using a sparse, hyperdimensional representation of letters and words in natural language and they show that this representation and their design can be used with high energy...
The nearest neighbor (NN) algorithm has been used in a broad range of applications including pattern recognition, classification, computer vision, databases, etc. The NN algorithm tests data points to find the nearest data to a query data point. With the Internet of Things the amount of data to search through grows exponentially, so we need to have more efficient NN design. Running NN on multicore...
In this paper, we propose VoiceHD, a novel speech recognition technique based on brain-inspired hyperdimensional(HD) computing. VoiceHD maps preprocessed voice signals in the frequency domain to random hypervectors and combines them to compute a hypervector (as learned patterns) representing each class. During inference, VoiceHD similarly computes a query hypervector; the classification task is done...
In recent years, machine learning for visual object recognition has been applied to various domains, e.g., autonomous vehicle, heath diagnose, and home automation. However, the recognition procedures still consume a lot of processing energy and incur a high cost of data movement for memory accesses. In this paper, we propose a novel hardware accelerator design, called ORCHARD, which processes the...
Vertical Nanowire-FET (VNFET) is a promising candidate to succeed in industry mainstream due to its superior suppression of short-channel-effects and area efficiency. However, to design logic gates, CMOS is not an appropriate solution due to the process incompatibility with VNFET, which creates a technical challenge for mass production. In this work, we propose a novel VNFET-based logic design, called...
Today's computing systems use huge amount of energy and time to process basic queries in database. A large part of it is spent in data movement between the memory and processing cores, owing to the limited cache capacity and memory bandwidth of traditional computers. In this paper, we propose a non-volatile memory-based query accelerator, called NVQuery, which performs several basic query functions...
Neural networks are machine learning models that have been successfully used in many applications. Due to the high computational complexity of neural networks, deploying such models on embedded devices with severe power/resource constraints is troublesome. Neural networks are inherently approximate and can be simplified. We propose LookNN, a methodology to replace floating-point multiplications with...
Recently, neural networks have been demonstrated to be effective models for image processing, video segmentation, speech recognition, computer vision and gaming. However, high energy computation and low performance are the primary bottlenecks of running the neural networks. In this paper, we propose an energy/performance-efficient network acceleration technique on General Purpose GPU (GPGPU) architecture...
Internet of Things is capable of generating huge amount of data, causing high overhead in terms of energy and performance if run on traditional CPUs and GPUs. This inefficiency comes from the limited cache size and memory bandwidth which result in large amount of data movement through memory hierarchy. In this paper, we propose a configurable associative processor, called CAP, which accelerates computation...
Brain-inspired hyperdimensional (HD) computing emulates cognition tasks by computing with hypervectors as an alternative to computing with numbers. At its very core, HD computing is about manipulating and comparing large patterns, stored in memory as hypervectors: the input symbols are mapped to a hypervector and an associative search is performed for reasoning and classification. For every classification...
Many applications, such as machine learning and data sensing are statistical in nature and can tolerate some level of inaccuracy in their computation. Approximate computation is a viable method to save energy and increase performance by trading energy for accuracy. There are a number of proposed approximate solutions, however, they are limited to a small range of applications because they cannot control...
Recent years have witnessed a rapid growth in the domain of Internet of Things (IoT). This network of billions of devices generates and exchanges huge amount of data. The limited cache capacity and memory bandwidth make transferring and processing such data on traditional CPUs and GPUs highly inefficient, both in terms of energy consumption and delay. However, many IoT applications are statistical...
Running Internet of Things applications on general purpose processors results in a large energy and performance overhead, due to the high cost of data movement. Processing inmemory is a promising solution to reduce the data movement cost by processing the data locally inside the memory. In this paper, we design a Multi-Purpose In-Memory Processing (MPIM) system, which can be used as main memory and...
Modern microprocessors have increased the word width to 64-bits to support larger main memory sizes. It has been observed that data can often be represented by relatively few bits, so-called narrow-width values. To leverage narrow-width data, we propose a hybrid cache architecture composed of magnetic RAM (MRAM) and SRAM to save the upper and lower 32-bits of each word in MRAM and SRAM respectively...
Modern computing machines are increasingly characterized by large scale parallelism in hardware (such as GPGPUs) and advent of large scale and innovative memory blocks. Parallelism enables expanded performance tradeoffs whereas memories enable reuse of computational work. To be effective, however, one needs to ensure energy efficiency with minimal reuse overheads. In this paper, we describe a resistive...
Memory-based computing using associative memory has emerged as a promising solution to reduce the energy consumption of important classes of streaming applications such as multimedia by avoiding redundant computations. In associative memory, a set of frequent patterns that represent basic functions are pre-stored in ternary content addressable memory (TCAM) and reused. The primary limitation to using...
Static Random Access Memories (SRAMs) occupy a large area of today's microprocessors, and are a prime source of leakage power in highly scaled technologies. Low leakage and high density Spin-Transfer Torque RAMs (STT-RAMs) are ideal candidates for a power-efficient memory. However, STT-RAM suffers from high write energy and latency, especially when writing ‘one’ data. In this paper we propose a novel...
The Internet of things (IoT) significantly increases the volume of computations and the number of running applications on processors, from mobiles to servers. Big data computation requires massive parallel processing and acceleration. In parallel processing, associative memories represent a promising solution to improve energy efficiency by eliminating redundant computations. However, the tradeoff...
Variability is the real big challenge for integrated circuits. Today, simulators help to estimate the effect of variability, but fail to capture real workload dynamics and user interactions, which are fundamental to mobile devices. This paper presents VarDroid, a low-overhead tool to emulate power and performance variability on real platforms, running on top of the Android operating system. VarDroid...
Modern caches are designed to hold 64-bits wide data, however a proportion of data in the caches continues to be narrow width. In this paper, we propose a new cache architecture which increases the effective cache capacity up to 2X for the systems with narrow-width values, while also improving its power efficiency, bandwidth, and reliability. The proposed double capacity cache (DCC) architecture uses...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.