OpenCL continues to gather momentum on both desktop and mobile devices. The new features of OpenCL 2.0 give developers greater expressive power when programming heterogeneous computing environments. Among current experimental simulation environments, gem5-gpu supports only CUDA, whereas GPGPU-Sim can support OpenCL by compiling OpenCL kernel code to PTX using a real GPU driver. However, this driver compilation...
The convolutional neural network (CNN) is a state-of-the-art model that can achieve high accuracy in many machine-learning tasks. Recently, to further develop practical applications of CNNs, efficient hardware platforms for accelerating CNNs have been thoroughly studied. A binarized neural network has been reported to minimize the multipliers, which consume a large amount of resources,...
This paper describes ‘HSCoT’, an efficient high-level synthesis tool that generates register transfer level (RTL) specifications for applications written entirely in the C language, together with an associated reliable approach for speeding up application execution. It is based on constructing a data-dependency flow graph and aims to explore, as fully as possible, the inherent parallelism of the application. Application...
In this paper, we describe the Heterogeneous System Architecture Foundation's application to digital signal processors (DSPs) and hardware accelerators. We provide an overview of the HSA runtime, system architecture and programmer's model, identify characteristics of DSPs, and compare how their algorithms differ from those of GPUs. We show an example mapping of HSA agents to a modern DSP using the HSA intermediate...
Biomedical imaging, digital communications, acoustic signal processing, and many other DSP applications are increasingly present in the digital world. They process growing volumes of data, represented with ever greater accuracy, and use complex algorithms that must satisfy time constraints. Consequently, they are characterized by a high demand for computing power. To satisfy this...
Today's high-performance embedded computing applications pose significant challenges for processing throughput. Traditionally, such applications have been realized on application-specific integrated circuits (ASICs) and/or digital signal processors (DSPs). However, ASICs' advantage in performance and power often could not justify the fast-increasing fabrication cost, while current DSP offers...
The programmable graphics processing unit (GPU) has over the years become an integral part of today's computing systems. GPU use-cases have gradually been extended from graphics towards a wide range of applications. Since the programmable GPU is now making its way to mobile devices, it is interesting to study these new use-cases there as well. To test this, we created a programming environment based...
LILY is a high-performance VLIW DSP processor for multimedia applications, developed by Tsinghua University. The processor classifies the instructions and determines, according to their order, whether they should be issued in parallel. With this parallelism scheme, the LILY processor is capable of saving one bit of operation code at the cost of inserting very few no operation...
Stream Processors, Inc. (SPI) has introduced the Storm-1 stream processing system-on-a-chip (SoC), which contains 16 data-parallel processing arrays with five very long instruction word (VLIW) arithmetic logic units (ALUs) in each lane (altogether 80 processing units). In this paper, the implementation of an emulated-digital cellular neural network universal machine (CNN-UM) simulation kernel on the Storm-1...