The Internet of Things revolution requires long-battery-lifetime, autonomous end-nodes capable of probing the environment through multiple sensors and transmitting the data wirelessly after data fusion, recognition, and classification. Duty-cycling is a well-known approach to extend battery lifetime: it allows the hardware resources of the microcontrollers (MCUs) implementing the end-node to be kept in sleep mode...
Power consumption and high compute density are the key factors to be considered when building a compute node for the upcoming Exascale revolution. Current architectural design and manufacturing technologies are not able to provide the requested level of density and power efficiency to realise an operational Exascale machine. A disruptive change in the hardware design and integration process is needed...
We propose a highly structured neural network architecture for semantic segmentation with an extremely small model size, suitable for low-power embedded and mobile platforms. Specifically, our architecture combines i) a Haar wavelet-based tree-like convolutional neural network (CNN), ii) a random layer realizing a radial basis function kernel approximation, and iii) a linear classifier. While stages...
Real-time biosignal classification in power-constrained embedded applications is a key step in designing portable e-health devices requiring hardware integration along with concurrent signal processing. This paper presents an application based on a novel biomedical System-On-Chip (SoC) for signal acquisition and processing combining a homogeneous multi-core cluster with a versatile bio-potential front-end...
Unmanned Aerial Vehicles (UAVs) with high-level autonomous navigation capabilities are a hot topic both in industry and academia due to their numerous applications. However, autonomous navigation algorithms are computationally demanding, and it is very challenging to run them on board nano-scale UAVs (i.e., a few centimeters in diameter) because of the limited capabilities of their...
One of the fundamental functionalities for autonomous navigation of Unmanned Aerial Vehicles (UAVs) is the hovering capability. State-of-the-art techniques for implementing hovering on standard-size UAVs process a camera stream to determine position and orientation (visual odometry). Similar techniques are considered unaffordable in the context of nano-scale UAVs (i.e., a few centimeters in diameter),...
Internet-of-Things devices need sensors that have a low power footprint and can produce semantically rich data. Promising candidates are spiking sensors that use asynchronous Address-Event Representation (AER) carrying information within inter-spike times. To minimize the overhead of coupling AER sensors with off-the-shelf microcontrollers, we propose an FPGA-based methodology that i) tags the AER...
Convolutional Neural Networks (CNNs) have revolutionized the world of image classification over the last few years, pushing computer vision close to, or even beyond, human accuracy. The computational effort of today's CNNs requires power-hungry parallel processors and GP-GPUs. Recent efforts in designing CNN Application-Specific Integrated Circuits (ASICs) and accelerators for System-On-Chip (SoC) integration...
Computer vision (CV) based on Convolutional Neural Networks (CNN) is a rapidly developing field thanks to CNN's flexibility, strong generalization capability and classification accuracy (matching and sometimes exceeding human performance). CNN-based classifiers are typically deployed on servers or high-end embedded platforms. However, their ability to “compress” low information density data such as...
The stringent power constraints of complex microcontroller-based devices (e.g., smart sensors for the IoT) represent an obstacle to the introduction of sophisticated functionality. Programmable accelerators would be extremely beneficial to provide the flexibility and energy efficiency required by fast-evolving IoT applications; however, the integration complexity and sub-10mW power budgets have been...
The popularity of cloud computing has led to a dramatic increase in the number of data centers in the world. The ever-increasing computational demands, along with the slowdown in technology scaling, have ushered in an era of power-limited servers. Techniques such as near-threshold computing (NTC) can be used to improve energy efficiency in the post-Dennard scaling era. This paper describes an architecture...
Overview. Today, most commercially available 3D display systems require the viewers to wear some sort of shutter or polarization glasses, which is often regarded as an inconvenience. Ideally, a 3D display system should not require the users to wear additional gear. In fact, the optimum would be a display that replicates the original light-field of a scene. So-called multiview autostereoscopic displays...
Many-core architectures structured as fabrics of tightly-coupled clusters have shown promising results on embedded computer vision benchmarks, providing state-of-the-art performance with a reduced power budget. We propose PULP (Parallel processing Ultra-Low Power platform), an architecture built on clusters of tightly-coupled OpenRISC ISA cores, with advanced techniques for fast performance and energy...
High-frame-rate and high-resolution 3D medical ultrasound imaging imposes high requirements on the involved processing hardware. Several thousands of analog signals need to be processed in many steps to obtain a final image. Fully digital beamforming makes it possible to achieve high image quality coupled with extreme flexibility. Unfortunately, digital beamforming imposes staggering requirements...
This work describes how we use High-Level Synthesis to support design space exploration (DSE) of heterogeneous many-core systems. Modern embedded systems increasingly couple hardware accelerators and processing cores on the same chip, to trade specialization of the platform to an application domain for increased performance and energy efficiency. However, the process of designing such a platform is...
Radio communication is among the most energy consuming tasks in wireless sensor nodes. Reducing the amount of data to be transmitted holds a large power saving potential. The combination of compressed sensing (CS) and local signal parameter estimation can achieve a massive data rate reduction in applications where the primary interest is in the acquisition of a scalar feature of the signal rather...
Modern designs for embedded systems are increasingly embracing cluster-based architectures, where small sets of cores communicate through tightly-coupled shared memory banks and high-performance interconnections. At the same time, the complexity of modern applications requires new programming abstractions to exploit dynamic and/or irregular parallelism on such platforms. Supporting dynamic parallelism...
On-chip L2 cache architectures, well established in high-performance parallel computing systems, are now becoming a performance-critical component also for multi/many-core architectures targeted at lower-power, embedded applications. The very stringent requirements on power and cost of these systems result in one of the key challenges in many-core designs, mandating the deployment of highly efficient...
Modern designs for embedded many-core systems increasingly include application-specific units to accelerate key computational kernels with orders-of-magnitude higher execution speed and energy efficiency compared to software counterparts. A promising architectural template is based on heterogeneous clusters, where simple RISC cores and specialized HW units (HWPU) communicate in a tightly-coupled manner...
Negative bias temperature instability (NBTI) adversely affects the reliability of a processor by introducing new delay-induced faults. However, the effect of these delay variations is not uniformly spread across functional units and instructions: some are affected more (and are hence less reliable) than others. This paper proposes an NBTI-aware compiler-directed very long instruction word (VLIW) assignment...