Embedded image processing systems face many challenges due to large computational requirements and other physical, power, and environmental constraints. However, recent mobile devices include a graphics processing unit (GPU) in order to offer a better user interface in terms of graphics. Some of these embedded GPUs also support OpenCL, which allows the use of computation capacity of embedded...
While high-end heterogeneous systems are increasingly supporting heterogeneous uniform memory access (hUMA) as envisioned by the Heterogeneous System Architecture (HSA) foundation, their low-power counterparts targeting the embedded domain still lack basic features like virtual memory support for accelerators. As opposed to simply passing virtual address pointers, explicit data management involving...
Memory forensics analysis is an important area of digital forensics, especially in incident response, malware analysis and behavior analysis (of application and system software) in physical memory. Traditional digital forensics, such as investigating non-volatile storage, cannot be used to establish the current state of the system (including network connections) or for analysis of malware that use...
High Performance Computing (HPC) aggregates computing power in order to solve large and complex problems in different knowledge areas. Nowadays, HPC users can utilize virtualized infrastructures as a low-cost alternative to deploy their applications. However, virtualization brings some challenges for HPC, especially with regard to the overhead caused by hypervisors. In this work, our main goal is to analyze...
We present an ASIC architecture with coarse-grain reconfigurability that uses accelerators to improve performance over fine-grain reconfigurable architectures. A reconfigurable FFT ASIC was built as a proof of concept, and it successfully demonstrated valid switch operation for reconfiguration.
A shift is underway in high performance computing (HPC) towards heterogeneous parallel architectures that emphasize medium and fine grain thread parallelism. Many scientific computing algorithms, including simple finite-differencing methods, have already been mapped to heterogeneous architectures with order-of-magnitude gains in performance as a result. Recent case studies examining high-resolution...
With the rate of errors that silently affect an application's state/output expected to increase in future HPC machines, numerous mitigation schemes have been proposed, but little work has been done investigating why these schemes detect some errors while others are masked. This paper investigates how silent data corruption (SDC) propagates through a sparse matrix vector multiply (SpMV), a fundamental...
Owing to its high performance and low power consumption, the Loongson-1 processor has wide application prospects in industrial control, high-performance embedded systems, and other fields. Currently, the Loongson series platforms are mostly based on the Linux operating system. However, VxWorks is a better choice for its high real-time performance and high reliability in the field of industrial control and high-performance...
Mobile devices based on flash memory have unique hardware characteristics. They have different memory management mechanisms, such as no memory swapping and an app cache. The Android platform adopts new modules such as the low memory killer (LMK) and the activity manager service (AMS) besides kswapd and the out of memory killer (OOMK). However, these modules generate many kernel function calls that incur sluggish responses...
A noise-suppressing filter design technique is proposed to reduce the deconvolution error of the both-directions downward-sloped asymmetrical long-tail distribution of the Random Telegraph Noise (RTN). The filter is used in the Lucy-Richardson deconvolution (LRDec) iteration process. The deconvolution is required for inversely analyzing RTN long-tail distribution effects on VLSI time-dependent operating margin...
Graphics Processing Units (GPUs) based on the Single Instruction Multiple Thread (SIMT) architecture are emerging as more efficient than Multiple Instruction Multiple Data (MIMD) architectures in exploiting parallelism. A GPU has numerous shader cores and thousands of simultaneous fine-grained active threads. These threads are grouped into Cooperative Thread Arrays (CTAs). All the threads within a CTA...
In this paper, we present a VLSI architecture of a Pairwise Linear Support Vector Machine (SVM) classifier for multi-classification on FPGA. The objective of this work is to facilitate real-time classification of facial expressions into three categories: neutral, happy and pain, which could be used in a typical patient monitoring system. Thus, the challenge here is to achieve good performance without...
Smartphones have emerged as one of the most coherent companions for humans over the past few years. A memory crunch situation makes the user experience sluggishness while accessing applications. Linux therefore starts reclaiming memory using kswapd or Direct Reclaim, followed by the Android Low Memory Killer, which identifies victim processes to be killed on the basis of defined criteria until sufficient amount of memory...
In this paper, we propose an FPGA memory hierarchy based on the OpenCL memory model. The memory hierarchy allows application-specific memory optimizations during design compilation using information provided in OpenCL kernels. With the proposed memory hierarchy, FPGA application developers can focus on their designs in OpenCL kernel codes, and their designs can be synthesized into FPGA hardware via...
Memory analysis is now used routinely for incident response and forensic applications. Current memory analysis techniques are very effective in finding kernel artifacts of significance to the forensic investigator. However, the analysis of user space applications has not received enough attention so far. We identify the lack of pagefile support in analysis and acquisition as a major hurdle in the...
Permutation-based indexing is one of the most popular techniques for the approximate nearest-neighbor search problem in high-dimensional spaces. Due to the exponential increase of multimedia data, the time required to index this data has become a serious constraint of the indexing techniques. One of the possible steps towards faster index construction is utilization of massively parallel platforms...
Unified Memory is an emerging technology supported by CUDA 6.X. Before CUDA 6.X, the CUDA programming model relied on programmers to explicitly manage data between CPU and GPU, which increased programming complexity. CUDA 6.X provides a new technology called Unified Memory, which offers a new programming model that defines CPU and GPU memory space as a single coherent...
As hard disk encryption, RAM disks, persistent data avoidance technology and memory resident malware become more widespread, memory analysis becomes more important. In order to provide more virtual memory than is actually physically present on a system, an operating system may transfer frames of memory to a pagefile on persistent storage. Current memory analysis software does not incorporate such pagefiles...
Attaching next-generation non-volatile memories (NVMs) to the main memory bus provides low-latency, byte-addressable access to persistent data that should significantly improve performance for a wide range of storage-intensive workloads. We present an analysis of storage application performance with non-volatile main memory (NVMM) using a hardware NVMM emulator that allows fine-grain tuning of NVMM...
The 2-D DCT/IDCT (two-dimensional discrete cosine transform and its inverse) is widely used in many image processing systems. In this paper, efficient architectures are proposed. These architectures have parallel and pipelined structures which are used to implement 8×8 DCT/IDCT processors. These processors involve two 8-point DCT/IDCT processors along with a dual-bank SRAM (128 words) and...