Column-store in-memory databases have received a lot of attention because of their fast query processing response times on modern multi-core machines. Among different database operations, group by/aggregate is an important and potentially costly operation. Moreover, sort-based and hash-based algorithms are the most common ways of processing group by/aggregate queries. While sort-based algorithms are...
Gesture recognition is used for many practical applications such as human-robot interaction, medical rehabilitation and sign language. In this paper, we apply a hybrid generative-discriminative approach by using the Fisher Vector to improve the recognition performance. The strategy is to merge the generative approach of the Hidden Markov Model, which deals with spatio-temporal motion data, with the discriminative...
Supporting network I/O at high packet rates in virtual machines is fundamental for the deployment of Cloud data centers and Network Function Virtualization. Historically, SR-IOV and hardware passthrough were thought to be the only viable solutions for reducing the high cost of virtualization. In previous work [15] we showed how even plain device emulation can achieve VM-to-VM speeds of millions of packets...
Performance analysis of a process plays a significant role in improving the overall efficiency of any system. Usually, this task is accomplished either by system-level commands or by user-space applications based on the proc file system. These existing user-space mechanisms are limited in application and often fail to provide the required process-specific data to the user. In order to avoid this limitation,...
GPU devices are becoming critical building blocks of High-Performance platforms for performance and energy efficiency reasons. As a consequence, parallel programming environments such as OpenMP were extended to support offloading code to such devices. OpenMP compilers are faced with offering an efficient implementation of device-targeting constructs. One main issue in implementing OpenMP on a GPU is...
Developing device drivers is important for innovative consumer electronics because device drivers implement key functionalities of new devices. This paper suggests a test-driven development (TDD) approach for device drivers, taking advantage of user-level drivers. Applying TDD to device drivers is difficult because device drivers are usually implemented inside the kernel and are tightly coupled with complex kernel...
Conventional servers have achieved high performance by employing fast CPUs to run compute-intensive workloads, while making operating systems manage relatively slow I/O devices through memory accesses and interrupts. However, as emerging workloads become heavily data-intensive and emerging devices (e.g., NVM storage, high-bandwidth NICs, and GPUs) come to enable low-latency and high-bandwidth...
The need for novel data analysis is urgent in the face of a data deluge from modern applications. Traditional approaches to data analysis incur significant data movement costs, moving data back and forth between the storage system and the processor. Emerging Active Flash devices enable processing on the flash, where the data already resides. An array of such Active Flash devices allows us to revisit...
Economic performance evaluation and classification is an important and challenging issue that has been gaining attention over the last three decades in academic research, monetary institutions and business development. The purpose of this paper is to propose a hybrid model which combines support vector machine with isometric feature mapping (ISOMAP), Principal Component Analysis (PCA) and Locally...
Various malicious applications tend to access the user's files to achieve their functionalities. Such unauthorized file accesses may lead to user data leakage or other threats. In this paper, we propose a novel lightweight hardware-assisted hypervisor, namely FVisor, to thwart such unauthorized file accesses. FVisor has three distinct advantages over existing hypervisor/host-based approaches:...
Virtualization has taken on a central role in HPC Cloud due to easy management and low cost of computation and communication. Recently, Single Root I/O Virtualization (SR-IOV) technology has been introduced for high-performance interconnects such as InfiniBand and can attain near-native performance for inter-node communication. However, the SR-IOV scheme lacks locality-aware communication support,...
We present the design and testing of a hybrid energy neutral cooling system for data centers' CPUs. The system operates as a passive heat-sink at normal operating conditions, and can provide active cooling when a boost in performance is required (i.e., overclocking) at zero cost by exploiting thermoelectric generators (TEGs) to harvest the energy from the CPU heat dissipation. Server rooms have plenty...
OpenACC is gaining momentum as an implicit and portable interface for porting legacy CPU-based applications to heterogeneous, highly parallel computational environments involving many-core accelerators such as GPUs and Intel Xeon Phi. OpenACC provides a set of loop directives similar to OpenMP for parallelization and for managing data movement, attaining functional portability across different...
With the emergence of multi-core and multi-socket non-uniform memory access (NUMA) platforms in recent years, new software challenges have arisen in using them efficiently. In the field of high performance computing (HPC), parallel programming has always been the key factor in improving application performance. However, the implications of parallel architectures for the system software have been overlooked...
We propose an approach for benchmark workload generation. The proposed workload synthesis generates synthetic workloads that model the behavior of real applications. A statistical execution profile of a workload is constructed from hardware performance counters available in recent processors, and the overhead of profiling is significantly lower than that of instrumentation or simulation, which require inspection...
GPUs have gained tremendous popularity in a broad range of application domains. These applications possess varying grains of parallelism and place high demands on compute resources -- many times imposing real-time constraints, requiring flexible work schedules, and relying on concurrent execution of multiple kernels on the device. These requirements present a number of challenges when targeting current...
Virtual switches, like Open vSwitch, have emerged as an important part of cloud networking architectures. They connect interfaces of virtual machines and establish the connection to the outer network via physical network interface cards. Today, all important cloud frameworks support Open vSwitch as the default virtual switch. However, general understanding about the performance implications of Open...
Checkpointing is the act of saving the state of a running program so that it may be recovered later. This general idea enables various functionalities in computer systems, including fault tolerance, system recovery, and process migration. Checkpointing mechanisms in traditional systems normally save the state of a process running in volatile memory to a checkpoint file stored on non-volatile...
As power becomes an increasingly important design factor in high-end supercomputers, future systems will likely operate with power limitations significantly below their peak power specifications. These limitations will be enforced through a combination of software and hardware power policies, which will filter down from the system level to individual nodes. Hardware is already moving in this direction...
Parallel and distributed systems that support the shared memory paradigm are becoming widely accepted in many areas of computing. The memory consistency model of a shared-memory multiprocessor system influences both the performance and the programmability of the system. Under optimal conditions, it is found that multithreading contributes to more than 50 percent of the performance improvement, while the...