Search results

chapter

Towards a Combined Grouping and Aggregation Algorithm for Fast Query Processing in Columnar Databases with GPUs

Sina Meraji, John Keenleyside, Sunil Kamath, Bob Blainey

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 594 - 603

2015 IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW)

Column-store in-memory databases have received a lot of attention because of their fast query processing response times on modern multi-core machines. Among different database operations, group by/aggregate is an important and potentially costly operation. Moreover, sort-based and hash-based algorithms are the most common ways of processing group by/aggregate queries. While sort-based algorithms are...

chapter

Gesture recognition using hybrid generative-discriminative approach with Fisher Vector

Yusuke Goutsu, Wataru Takano, Yoshihiko Nakamura

2015 IEEE International Conference on Robotics and Automation (ICRA) > 3024 - 3031

2015 IEEE International Conference on Robotics and Automation (ICRA)

Gesture recognition is used for many practical applications such as human-robot interaction, medical rehabilitation and sign language. In this paper, we apply a hybrid generative-discriminative approach by using the Fisher Vector to improve the recognition performance. The strategy is to merge the generative approach of Hidden Markov Model dealing with spatio-temporal motion data with the discriminative...

chapter

Virtual device passthrough for high speed VM networking

Stefano Garzarella, Giuseppe Lettieri, Luigi Rizzo

2015 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS) > 99 - 110

2015 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS)

Supporting network I/O at high packet rates in virtual machines is fundamental for the deployment of Cloud data centers and Network Function Virtualization. Historically, SR-IOV and hardware passthrough were thought to be the only viable solutions to reduce the high cost of virtualization. In previous work [15] we showed how even plain device emulation can achieve VM-to-VM speeds of millions of packets...

chapter

Performance Analysis of Process Using Loadable Kernel Module (LKM)

Barun Kumar Parichha

2015 Fifth International Conference on Communication Systems and Network Technologies > 1338 - 1343

2015 Fifth International Conference on Communication Systems and Network Technologies (CSNT)

Performance analysis of a process plays a significant role in improving the overall efficiency of any system. Usually, this task is accomplished either by system-level commands or by user space applications based on the proc file system. These existing user-space mechanisms are limited in application and often fail to provide the required process-specific data to the user. In order to avoid this limitation,...

chapter

Coordinating GPU Threads for OpenMP 4.0 in LLVM

Carlo Bertolli, Samuel F. Antao, Alexandre E. Eichenberger, Kevin O'Brien, Zehra Sura, more

2014 LLVM Compiler Infrastructure in HPC > 12 - 21

2014 LLVM Compiler Infrastructure in HPC (LLVM-HPC)

GPU devices are becoming critical building blocks of High-Performance platforms for performance and energy efficiency reasons. As a consequence, parallel programming environments such as OpenMP were extended to support offloading code to such devices. OpenMP compilers are faced with offering an efficient implementation of device-targeting constructs. One main issue in implementing OpenMP on a GPU is...

chapter

Test-driven development of consumer electronics device drivers: A user-level device driver approach

Seehwan Yoo, Young-pil Kim

2015 IEEE International Conference on Consumer Electronics (ICCE) > 392 - 394

2015 IEEE International Conference on Consumer Electronics (ICCE)

Developing device drivers is important for innovative consumer electronics because device drivers implement key functionalities of new devices. This paper suggests a test-driven development (TDD) of device drivers, taking advantage of a user-level driver. Applying TDD to device drivers is difficult because device drivers are usually implemented inside the kernel, and are tightly coupled with complex kernel...

chapter

DCS: A fast and scalable device-centric server architecture

Jaehyung Ahn, Dongup Kwon, Youngsok Kim, Mohammadamin Ajdari, more

2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) > 559 - 571

2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)

Conventional servers have achieved high performance by employing fast CPUs to run compute-intensive workloads, while making operating systems manage relatively slow I/O devices through memory accesses and interrupts. However, as the emerging workloads are becoming heavily data-intensive and the emerging devices (e.g., NVM storage, high-bandwidth NICs, and GPUs) come to enable low-latency and high-bandwidth...

chapter

AnalyzeThis: an analysis workflow-aware storage system

Hyogi Sim, Youngjae Kim, Sudharshan S. Vazhkudai, Devesh Tiwari, more

SC15: International Conference for High Performance Computing, Networking, Storage and Analysis > 1 - 12

SC15: International Conference for High Performance Computing, Networking, Storage and Analysis

The need for novel data analysis is urgent in the face of a data deluge from modern applications. Traditional approaches to data analysis incur significant data movement costs, moving data back and forth between the storage system and the processor. Emerging Active Flash devices enable processing on the flash, where the data already resides. An array of such Active Flash devices allows us to revisit...

chapter

Economic performance evaluation and classification using hybrid manifold learning and support vector machine model

Songbian Zime

2014 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) > 184 - 191

2014 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP)

Economic performance evaluation and classification is an important and challenging issue that has been gaining attention over the last three decades in academic research, monetary institutions and business development. The purpose of this paper is to propose a hybrid model which combines support vector machine with isometric feature mapping (ISOMAP), Principal Component Analysis (PCA) and Locally...

chapter

FVisor: Towards Thwarting Unauthorized File Accesses with a Light-Weight Hypervisor

Yan Wen, Jinjing Zhao, Shuanghui Yi, Xiang Li

2014 IEEE 17th International Conference on Computational Science and Engineering > 620 - 626

2014 IEEE 17th International Conference on Computational Science and Engineering (CSE)

Various malicious applications tend to access the user's files to achieve their functionalities. Such unauthorized file accesses may lead to user data leakage or other threats. In this paper, we propose a novel light-weight hardware-assisted hypervisor, namely FVisor, to thwart such unauthorized file accesses. FVisor has three distinct advantages over existing hypervisor/host-based approaches:...

chapter

High performance MPI library over SR-IOV enabled infiniband clusters

Jie Zhang, Xiaoyi Lu, Jithin Jose, Mingzhe Li, more

2014 21st International Conference on High Performance Computing (HiPC) > 1 - 10

2014 21st International Conference on High Performance Computing (HiPC)

Virtualization has taken on a central role in HPC Cloud due to easy management and low cost of computation and communication. Recently, Single Root I/O Virtualization (SR-IOV) technology has been introduced for high-performance interconnects such as InfiniBand and can attain near-native performance for inter-node communication. However, the SR-IOV scheme lacks locality-aware communication support,...

chapter

Energy neutral hybrid cooling system for high performance processors

Luca Rizzon, Maurizio Rossi, Roberto Passerone, Davide Brunelli

International Green Computing Conference > 1 - 6

2014 International Green Computing Conference (IGCC)

We present the design and testing of a hybrid energy neutral cooling system for data centers' CPUs. The system operates as a passive heat-sink at normal operating conditions, and can provide active cooling when a boost in performance is required (i.e., overclocking) at zero cost by exploiting thermoelectric generators (TEGs) to harvest the energy from the CPU heat dissipation. Server rooms have plenty...

chapter

An OpenACC Extension for Data Layout Transformation

Tetsuya Hoshino, Naoya Maruyama, Satoshi Matsuoka

2014 First Workshop on Accelerator Programming using Directives > 12 - 18

2014 First Workshop on Accelerator Programming using Directives (WACCPD)

OpenACC is gaining momentum as an implicit and portable interface for porting legacy CPU-based applications to heterogeneous, highly parallel computational environments involving many-core accelerators such as GPUs and Intel Xeon Phi. OpenACC provides a set of loop directives similar to OpenMP for parallelization and for managing data movement, attaining functional portability across different...

chapter

Evaluating Lustre's Metadata Server on a Multi-Socket Platform

Konstantinos Chasapis, Manuel F. Dolz, Michael Kuhn, Thomas Ludwig

2014 9th Parallel Data Storage Workshop > 13 - 18

2014 9th Parallel Data Storage Workshop (PDSW)

With the emergence of multi-core and multi-socket non-uniform memory access (NUMA) platforms in recent years, new software challenges have arisen to use them efficiently. In the field of high performance computing (HPC), parallel programming has always been the key factor to improve application performance. However, the implications of parallel architectures for the system software have been overlooked...

chapter

Workload synthesis: Generating benchmark workloads from statistical execution profile

Keunsoo Kim, Changmin Lee, Jung Ho Jung, Won Woo Ro

2014 IEEE International Symposium on Workload Characterization (IISWC) > 120 - 129

2014 IEEE International Symposium on Workload Characterization (IISWC)

We propose an approach for benchmark workload generation. The proposed workload synthesis generates synthetic workloads that model the behavior of real applications. A statistical execution profile of a workload is constructed from hardware performance counters available in recent processors, and the overhead of profiling is significantly lower than instrumentation or simulation which requires inspection...

chapter

Runtime Support for Adaptive Spatial Partitioning and Inter-Kernel Communication on GPUs

Yash Ukidave, Charu Kalra, David Kaeli, Perhaad Mistry, more

2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing > 168 - 175

2014 26th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)

GPUs have gained tremendous popularity in a broad range of application domains. These applications possess varying grains of parallelism and place high demands on compute resources -- many times imposing real-time constraints, requiring flexible work schedules, and relying on concurrent execution of multiple kernels on the device. These requirements present a number of challenges when targeting current...

chapter

Performance characteristics of virtual switching

Paul Emmerich, Daniel Raumer, Florian Wohlfart, Georg Carle

2014 IEEE 3rd International Conference on Cloud Networking (CloudNet) > 120 - 125

2014 IEEE 3rd International Conference on Cloud Networking (CloudNet)

Virtual switches, like Open vSwitch, have emerged as an important part of cloud networking architectures. They connect interfaces of virtual machines and establish the connection to the outer network via physical network interface cards. Today, all important cloud frameworks support Open vSwitch as the default virtual switch. However, general understanding about the performance implications of Open...

chapter

A checkpointing and instant-on mechanism for a embedded system based on non-volatile memories

Jianwen Sun, Xiang Long, Han Wan, Jingwei Yang

2014 IEEE Computers, Communications and IT Applications Conference > 173 - 178

2014 IEEE Computers, Communications and IT Applications Conference (ComComAp)

Checkpointing is the act of saving the state of a running program so that it may be recovered later, which is a general idea that enables various functionalities in computer systems, including fault tolerance, system recovery, and process migration. Checkpointing mechanisms in traditional systems normally save the state of a process running on volatile memory to a checkpoint file stored on non-volatile...

chapter

Adaptive Configuration Selection for Power-Constrained Heterogeneous Systems

Peter E. Bailey, David K. Lowenthal, Vignesh Ravi, Barry Rountree, more

2014 43rd International Conference on Parallel Processing > 371 - 380

2014 43rd International Conference on Parallel Processing (ICPP)

As power becomes an increasingly important design factor in high-end supercomputers, future systems will likely operate with power limitations significantly below their peak power specifications. These limitations will be enforced through a combination of software and hardware power policies, which will filter down from the system level to individual nodes. Hardware is already moving in this direction...

chapter

Analysis and realization of Relaxed Consistency Memory model for multi-core CPU or GPU

Ramanarayan Mohanty, Dipti Prakash Behera, Aurobinda Routray

2014 5th International Conference - Confluence The Next Generation Information Technology Summit (Confluence) > 866 - 870

2014 5th International Conference- Confluence The Next Generation Information Technology Summit

Parallel and distributed systems that support the shared memory paradigm are becoming widely accepted in many areas of computing. The memory consistency model of a shared-memory multiprocessor system influences both the performance and the programmability of the system. Under optimal conditions it is found that multithreading contributes to more than 50 percent of performance improvement, while the...

INFONA - science communication portal
