The rapid development of embedded systems and their wide use in many sensitive fields require safeguarding their communications. Internet Protocol Security (IPsec) is widely used to solve network security problems by providing confidentiality and integrity for network communications, but it introduces communication overhead. This overhead becomes a critical factor with embedded...
In this paper we demonstrate techniques for increasing the node-level parallelism of a deterministic discrete ordinates neutral particle transport algorithm on a structured mesh to exploit many-core technologies. Transport calculations form a large part of the computational workload of physical simulations and so good performance is vital for the simulations to complete in reasonable time. We will...
OpenCL is a portable interface that can be used to program cluster nodes with heterogeneous compute devices. The OpenCL specification tightly binds its workflow abstraction, or "command queue," to a specific device for the entire program. For best performance, the user has to find the ideal queue-device mapping at command queue creation time, an effort that requires a thorough understanding...
Combining several types of devices and architectures is at the heart of heterogeneous computing's power efficiency advantage, but the strength of heterogeneous systems is also their Achilles heel: the diversity of the devices and of the ecosystems needed to maintain them presents major technological challenges. Some of the biggest challenges are in the realm of system programming. We believe that for...
Model-driven engineering (MDE) is a popular software development methodology in the software industry. MDE may require finding a predefined pattern in a domain-specific model. This technique can help in optimizing or refactoring models, or in translating from one language to another. The goal of the current research is to create a framework for MDE which can find patterns defined by...
Using multiple accelerators, such as GPUs or Xeon Phis, is an attractive way to improve the performance of large data-parallel applications and to increase the size of their workloads. However, writing an application for multiple accelerators remains challenging today, because going from a single accelerator to multiple ones requires dealing with potentially non-uniform domain decomposition, inter-accelerator...
OpenCL is an open standard for programming parallel heterogeneous systems. It is designed for portability and is therefore used in embedded system programming as well as in high performance computing (HPC). Because OpenCL targets different platforms, OpenCL library vendors have a certain freedom in implementing parts of the OpenCL execution model. Multiple versions of the standard...
Community networks are constantly improving IP networks that are evolving into large-scale computing platforms. This has resulted from the effort to adapt the cloud computing model towards services that can operate and utilize the resources inside the community network. The network and its infrastructure are contributed by individuals, companies, and organizations, and are maintained by the community itself...
Software delays have become the bottleneck of the overall storage system. In particular, in iSCSI-based storage systems, more disks in the target-end system are configured to make up a target node, but requests initiated by the initiator-end system go through the traditional single I/O path before they are re-requested by the target-end system, which cannot exploit the serviceability of the target...
Modern operating systems use mechanisms and instructions implemented in instruction set architectures to perform system calls, which typically involve a context switch from user space (unprivileged mode) to kernel space (privileged mode). By leveraging these mechanisms, we can implement in-kernel system calls, which are system calls invoked from kernel space. We present a performance evaluation of...
The development of memory storage device technologies, such as next-generation non-volatile (NV) memory and battery-backed NV-DIMM, has advanced recently, and these technologies have become widely recognized. They provide high performance and persistence along with byte addressability. Their byte addressability enables CPUs to access them directly. Despite their clear advantages, their limited capacity makes it...
Although cloud infrastructures can be used as High Performance Computing (HPC) platforms, many issues arising from virtualization overhead have kept the two apart. However, with the advent of container-based virtualizers, this scenario gains new perspectives, because the technique promises to decrease the virtualization overhead and achieve near-native performance. In this work, we analyzed the performance...
Computer vision (CV) is widely expected to be the next big thing in mobile computing. The availability of a camera and a large number of sensors in mobile devices will enable CV applications that understand the environment and enhance people's lives through augmented reality. One of the problems yet to be solved is how to transfer demanding state-of-the-art CV algorithms, designed to run on powerful desktop...
Currently, Multipath TCP (MPTCP), a modification to standard TCP that enables the concurrent use of several network paths in a single TCP connection, is being standardized by the IETF. This paper provides a comprehensive evaluation of the use of MPTCP to reduce latency and thus improve the quality of experience (QoE) for cloud-based applications. In particular, the paper considers the possible reductions...
We introduce a high-performance, ultra-low-power digital signal processor (DSP) solution for wearable applications. It employs a three-issue VLIW architecture with the major low-power techniques, is implemented with 95K gates in the Samsung 28LPP process, and runs at up to 200 MHz. The experimental results demonstrate that a voice trigger application can operate at 6.1 MHz under 0.15 mW power consumption.
Modern operating system kernels, such as Linux, address the trade-off between portability and performance by exposing a generic interface to user space programs, while maintaining architecture-dependent functionality as a set of separate components inside the kernel space. In particular, performance can only be achieved by ensuring that the architecture-dependent code takes advantage of the facilities...
Smartphones play a key role in several aspects of our daily life. Their range of application is constantly growing, making them versatile and necessary. However, mobile devices face an important problem: they must operate autonomously for long periods, a requirement that is constantly challenged by short battery life. Researchers and practitioners have proposed different strategies to preserve battery life...
This paper proposes a solution for offloading the compute-intensive string matching operations performed by modern network intrusion detection systems (NIDS) or antivirus software to the GPU, using OpenCL's programming model and serialized implementations of the Aho-Corasick algorithm. The solution aims to provide scalability and performance in heterogeneous environments.
The popular and diverse hardware accelerator ecosystem makes apples-to-apples comparisons between platforms rather difficult. SPEC ACCEL tries to offer a yardstick to compare different accelerator hardware and software ecosystems. This paper uses this SPEC benchmark to compare an AMD GPU, an NVIDIA GPU and an Intel Xeon Phi with respect to performance and energy consumption. It also provides observations...
Heterogeneous computing, which combines devices with different architectures, is rising in popularity and promises increased performance combined with reduced energy consumption. OpenCL has been proposed as a standard for programming such systems, and offers functional portability. It does, however, suffer from poor performance portability: code tuned for one device must be re-tuned to achieve good...