Since new technologies such as big data and cloud computing require a tremendous number of transactions between processors and memory, a new memory architecture called Processing in Memory (PIM) has been suggested as a solution for these memory-intensive applications. To let software utilize the new architecture, a development environment with a tool chain and debug infrastructure...
With the onset of multi- and many-core chips, the single-core market is closing down. These chips constitute a new challenge for aerospace and safety-critical industries in general. Little is known about the certification of software running on such systems. There is therefore a strong need to develop software architectures that exploit multi-core processors yet remain compliant with safety-criticality...
We propose three methods for reducing power consumption in high-performance FPGAs (field programmable gate arrays). We show that by using continuous hierarchy memory, lightweight checks, and lower chip voltage for near-threshold voltage computation, we can both reduce power consumption and increase reliability without a decrease in throughput. We have implemented these techniques in two different,...
Programming models like CUDA, OpenMP, OpenACC and OpenCL are designed to offload compute-intensive workloads to accelerators efficiently. However, the naive offload model, which copies and executes synchronously in sequence, requires extensive hand-tuning with techniques such as pipelining to overlap computation and communication. Therefore, we propose an easy-to-use, directive-based pipelining extension...
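The pipelining this abstract refers to can be illustrated in a language-agnostic way: while chunk i is being computed, the copy of chunk i+1 is already in flight. The sketch below simulates this with a background "copier" thread; `copy_to_device` and `compute` are illustrative stand-ins for a DMA transfer and a device kernel, not any real runtime's API.

```python
# Sketch of software pipelining for accelerator offload: overlap the
# "copy" of chunk i+1 with the "compute" of chunk i.
# copy_to_device and compute are illustrative stand-ins.
from concurrent.futures import ThreadPoolExecutor

def copy_to_device(chunk):          # stand-in for a host-to-device transfer
    return list(chunk)              # pretend copy

def compute(dev_chunk):             # stand-in for a device kernel
    return sum(x * x for x in dev_chunk)

def pipelined_offload(chunks):
    if not chunks:
        return []
    results = []
    with ThreadPoolExecutor(max_workers=1) as copier:
        pending = copier.submit(copy_to_device, chunks[0])
        for i in range(len(chunks)):
            dev = pending.result()                  # wait for copy of chunk i
            if i + 1 < len(chunks):                 # start copying chunk i+1 now
                pending = copier.submit(copy_to_device, chunks[i + 1])
            results.append(compute(dev))            # compute overlaps the next copy
    return results

print(pipelined_offload([[1, 2], [3, 4], [5, 6]]))  # → [5, 25, 61]
```

The result is identical to a purely sequential copy-then-compute loop; only the schedule changes, which is why such an extension can be exposed as a directive rather than a code rewrite.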
The non-equispaced fast Fourier transform (NFFT) has attracted significant interest for its applications in tomography and remote sensing, where visualization and image reconstruction require non-equispaced data. Here we present an efficient implementation of a high-accuracy NFFT on an NVIDIA GPU (Graphics Processing Unit). We focused on the convolution step in the computation of the NFFT, since it is the most...
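The convolution step mentioned here is often called "gridding": each non-equispaced sample is spread onto a few nearby uniform grid points, weighted by a window function, so that an ordinary FFT can follow. A toy pure-Python sketch with a Gaussian window (the window choice, `spread`, and `sigma` are illustrative assumptions, not the paper's parameters):

```python
import math

def gridding(samples, n_grid, spread=3, sigma=1.0):
    """Convolution (gridding) step of an NFFT-style transform: spread
    each non-equispaced sample (position x in [0,1), value v) onto the
    uniform grid points within `spread` of it, weighted by a Gaussian
    window, with periodic wrap-around. Toy sketch only."""
    grid = [0.0] * n_grid
    for x, v in samples:
        center = int(round(x * n_grid))
        for m in range(center - spread, center + spread + 1):
            d = x * n_grid - m                       # distance in grid units
            w = math.exp(-d * d / (2 * sigma * sigma))
            grid[m % n_grid] += v * w                # accumulate at wrapped index
    return grid

g = gridding([(0.1, 1.0), (0.52, 2.0)], n_grid=16)
```

Each sample touches only O(spread) grid points, which is why this step dominates the runtime and parallelizes well per-sample on a GPU (modulo the atomic accumulation into shared grid cells).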
Despite its popularity, deploying Convolutional Neural Networks (CNNs) on a portable system is still challenging due to large data volume, intensive computation and frequent memory access. Although previous FPGA acceleration schemes generated by high-level synthesis tools (e.g., HLS, OpenCL) have allowed for fast design optimization, hardware inefficiency still exists when allocating FPGA resources...
Memory deduplication improves memory density by merging identical memory pages in multi-tenant clouds. However, memory deduplication is vulnerable to memory disclosure attacks and covert channel attacks. The covert channel is based on the difference in write access time on deduplicated memory pages that are re-created by the copy-on-write (CoW) technique. Prior works have shown that malicious attackers can make...
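The timing difference the abstract describes is the whole channel: a write to a deduplicated page must first break the copy-on-write sharing, so it is much slower than a write to an ordinary page. The receiver's side reduces to classifying write latencies against a threshold. The sketch below uses synthetic latencies and an assumed threshold purely to show the decoding logic; a real attack would time actual page writes.

```python
def decode_bits(write_latencies_ns, threshold_ns=10000):
    """Receiver side of a dedup-based covert channel (conceptual):
    a write that breaks copy-on-write sharing is slow, so a high
    write latency decodes as bit 1 and a low one as bit 0.
    Latencies and threshold here are synthetic/illustrative."""
    return [1 if t > threshold_ns else 0 for t in write_latencies_ns]

# synthetic measurements: CoW-breaking writes take tens of microseconds,
# ordinary writes a few hundred nanoseconds
latencies = [250, 42000, 38000, 300, 41000]
print(decode_bits(latencies))   # → [0, 1, 1, 0, 1]
```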
This paper introduces a software policy for memory management in heterogeneous memory systems in order to improve the trade-offs between performance and power consumption, while attempting to make the best use of different characteristics of the underlying memory technologies. In this policy, the operating system and the application co-schedule page management in order to make informed decisions about...
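One minimal form such an informed decision can take: the application reports per-page access counts, the OS knows the fast tier's capacity, and the policy places the hottest pages in fast memory. The sketch below is an illustrative toy, not the paper's actual policy; all names and the tier labels are assumptions.

```python
def place_pages(access_counts, fast_capacity):
    """Toy heterogeneous-memory placement policy: put the most
    frequently accessed pages in the fast tier (e.g. DRAM/HBM)
    and the remainder in the slow tier (e.g. NVM).
    Illustrative sketch only, not the paper's policy."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    fast = set(ranked[:fast_capacity])
    slow = set(ranked[fast_capacity:])
    return fast, slow

fast, slow = place_pages({"p0": 90, "p1": 5, "p2": 40, "p3": 1},
                         fast_capacity=2)
# the two hottest pages, p0 and p2, land in the fast tier
```

The point of OS/application co-scheduling is exactly that neither side alone has both inputs: the application knows the access pattern, the OS knows the capacity and power budget.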
In this paper, we advocate the use of code polymorphism as an efficient means to improve security at several levels in electronic devices. We analyse the threats that polymorphism could help thwart, and present the solution that we plan to demonstrate in the scope of a collaborative research project called COGITO. We expect our solution to be effective at improving security, to comply with the computing...
In this paper, we propose an incremental kernel non-negative matrix factorization (IKNMF) to reduce the computing scale in hyperspectral unmixing. Kernel non-negative matrix factorization (KNMF) extends non-negative matrix factorization (NMF) to capture nonlinear dependency features in the data matrix through kernel functions. In the KNMF algorithm, the size of the kernel matrices is closely associated...
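The scale problem is visible from the kernel matrix itself: KNMF factorizes an n-by-n matrix of pairwise kernel evaluations, so its cost grows quadratically with the number of samples, which is what an incremental variant tries to avoid recomputing. A pure-Python sketch with an RBF kernel (the kernel choice and `gamma` are illustrative assumptions):

```python
import math

def rbf_kernel_matrix(X, gamma=0.5):
    """Build the n-by-n RBF kernel matrix
    K[i][j] = exp(-gamma * ||x_i - x_j||^2)
    of the kind KNMF factorizes. Illustrative sketch; the kernel
    and gamma are assumptions, not the paper's choices."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = math.exp(-gamma * d2)
    return K

K = rbf_kernel_matrix([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
# K is symmetric with unit diagonal; its n*n entries are the cost driver
```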
In the presence of known and unknown vulnerabilities in program code and control flow, virtual-machine-like isolation and sandboxing, which confine a process's malicious behaviour by monitoring and controlling the untrusted application, are an effective strategy. A confined malicious application cannot affect system resources or other applications running on the same operating system. But present...
In the new era of cyber-physical systems, software must adapt itself to ever-changing environmental conditions and situations. This is currently not reflected in the design of embedded operating systems, since they are primarily optimized for fixed usage scenarios with tight resource constraints. We discuss the idea of interpreted operating system kernels, which can form a new foundation for highly...
The survivability of the OS is very important for the whole system because the OS is the foundation of any information or network system. Based on an analysis of the resources, services and functions of the OS, and owing to the particularity of OS survivability, this paper proposes the concept of an integrity running environment (IRE) and then puts forward a new definition, namely that OS survivability is that...
There are multiple approaches to SLAM, but we found that the ones implemented in ROS had problems when a robot drove over small obstacles. This paper proposes making SLAM more robust by running three SLAM methods in parallel and using their information to produce a better estimate of the robot's surroundings. The proposed method defines its output by making the three methods vote for...
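One simple form the voting among three SLAM outputs could take is a per-cell majority vote over occupancy grids: a cell is marked occupied only if at least two of the three methods agree. This is a sketch of the voting idea only; the paper's actual fusion rule may differ.

```python
def vote_maps(map_a, map_b, map_c):
    """Fuse three occupancy grids by per-cell majority vote
    (1 = occupied, 0 = free). Illustrative sketch of the voting
    idea, not necessarily the paper's exact rule."""
    return [1 if a + b + c >= 2 else 0
            for a, b, c in zip(map_a, map_b, map_c)]

# one SLAM method misreads a small obstacle (cell 2 of map_b);
# the majority vote masks the error
fused = vote_maps([1, 0, 1, 0],
                  [1, 0, 0, 0],
                  [1, 1, 1, 0])
print(fused)   # → [1, 0, 1, 0]
```

The appeal of voting is that an error must occur in at least two of the three independent methods before it reaches the fused map.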
The recent advent of stacked memory devices has led to a resurgence of research associated with the fundamental memory hierarchy and the associated memory pipeline. The bandwidth advantages provided by stacked logic and DRAM devices have inspired research associated with eliminating the bandwidth bottlenecks associated with many applications in high performance computing. Further, recent efforts have focused...
High utilization of hardware resources is the key to designing performance- and power-optimized GPU applications. The efficiency of applications and kernels that do not fully utilize the GPU resources can be improved through concurrent execution with independent kernels and/or applications. Hyper-Q enables multiple CPU threads or processes to launch work on a single GPU simultaneously for increased...
In the field of embedded vision systems, meeting the constraints on design criteria such as performance, area, and power consumption can be a real challenge. In fact, to alleviate the well-known "Memory Wall", it is mandatory to provide efficient memory hierarchies so that the system to be designed reaches usable performance when it has to handle non-linear image operations. To address this problem,...
Traditional PC-based operating systems load most of their components during the boot process along with the kernel. This mechanism, though effective for a broad objective, is seldom fully utilized by the majority of users, as they usually perform a specific job that does not require every component of the OS. It has been observed that operating systems designed with the nature of the job in mind,...
With the ever-increasing demand for interactive data analytics, latency becomes more important for big data frameworks. We present our preliminary experience designing and implementing NetSpark, an improved Spark [1] framework that is highly optimized for network latency. Combining optimizations in data serialization and network buffer management with hardware-supported Remote Direct Memory Access (RDMA)...
Effective use of the memory hierarchy is crucial to cloud computing. Platform memory subsystems must be carefully provisioned and configured to minimize overall cost and energy for cloud providers. For cloud subscribers, the diversity of available platforms complicates comparisons and the optimization of performance. To address these needs, we present X-Mem, a new open-source software tool that characterizes...