In this paper we propose an Iterative Re-Weighted Least Squares (IRWLS) procedure to solve Support Vector Machines for regression and function estimation. Furthermore, we present a new algorithm to train Support Vector Machines that combines the proposed approach, which replaces the quadratic programming step, with the most advanced methods for dealing with large training data sets. Finally, the performance...
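The core re-weighting idea behind an IRWLS procedure can be sketched as follows. This is a minimal robust-regression illustration, not the authors' SVR formulation: the Huber-style weighting and the function name `irwls` are illustrative assumptions.

```python
import numpy as np

def irwls(X, y, n_iter=20, eps=1e-6):
    """Fit a linear model by iteratively re-weighted least squares.

    Each iteration down-weights samples with large residuals
    (Huber-style weights with unit threshold), then re-solves a
    weighted least-squares problem in closed form.
    """
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        r = y - X @ w                                  # current residuals
        a = np.where(np.abs(r) <= 1.0,                 # small residual: full weight
                     1.0,
                     1.0 / np.maximum(np.abs(r), eps)) # large residual: down-weight
        W = np.diag(a)
        w = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)  # weighted normal equations
    return w
```

In the actual SVR setting, the weights would instead be derived from the epsilon-insensitive loss at each iteration; the fixed-point structure of the loop is the same.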
With the increased popularity of multi-GPU nodes in modern HPC clusters, it is imperative to develop matching programming paradigms for their efficient utilization. In order to take advantage of the local GPUs and the low-latency high-throughput interconnects that link them, programmers need to meticulously adapt parallel applications with respect to load balancing, boundary conditions and device...
Virtual Machine Clusters (VMCs) are now widely used to host network applications due to their better scalability and higher availability compared to physical clusters. To provide fault tolerance, VMC snapshotting is a well-known technique: it saves the entire VMC state to stable storage and rolls the VMs back to the latest saved state upon failure. However, due to the large snapshot size as well as numerous...
GPUs use thousands of threads to provide high performance and efficiency. In general, if one thread of a kernel uses one resource (compute, bandwidth, data cache) more heavily than the others, there will be significant contention for that resource due to the large number of identical concurrent threads. This contention eventually saturates the kernel's performance at the bottleneck...
GPUs are being widely used to accelerate different workloads, and multi-GPU systems can provide higher performance with multiple interconnected discrete GPUs. However, there are two main communication bottlenecks in multi-GPU systems: accessing remote GPU memory and communication between the GPUs and the host CPU. Recent advances in multi-GPU programming, including unified virtual addressing...
The Stochastic On-Time Arrival (SOTA) problem has recently been studied as an alternative to traditional shortest-path formulations in situations with hard deadlines. The goal is to find a routing strategy that maximizes the probability of reaching the destination within a pre-specified time budget, with the edge weights of the graph being random variables with arbitrary distributions. While this...
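For intuition, the discrete-time SOTA dynamic program can be sketched as below, where u[i][t] is the best achievable probability of reaching the destination from node i within t time steps. The graph encoding and names are illustrative assumptions, not the paper's notation; travel times are assumed to be positive integers with discrete distributions.

```python
def sota(graph, dest, budget):
    """Discrete SOTA dynamic program.

    graph[i] = list of (j, pmf), where pmf maps an integer travel
    time (>= 1) on edge (i, j) to its probability.
    Returns u, with u[i][t] = max probability of reaching `dest`
    from node i within time budget t.
    """
    nodes = graph.keys() | {dest}
    u = {i: [0.0] * (budget + 1) for i in nodes}
    for t in range(budget + 1):
        u[dest][t] = 1.0          # already at the destination
    for t in range(1, budget + 1):
        for i in graph:
            if i == dest:
                continue
            best = 0.0
            for j, pmf in graph[i]:
                # Expected success probability if we take edge (i, j):
                # sum over realizable travel times within the budget.
                p = sum(pr * u[j][t - tau]
                        for tau, pr in pmf.items() if tau <= t)
                best = max(best, p)
            u[i][t] = best        # pick the best outgoing edge per (i, t)
    return u
```

Because travel times are at least 1, u[j][t - tau] always refers to an already-computed, smaller budget, so a single pass over increasing t suffices.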
Directive-based accelerator programming models such as OpenACC have arisen as an alternative solution for programming emerging Scalable Heterogeneous Computing (SHC) platforms. However, the increased complexity of SHC systems incurs several challenges in terms of portability and productivity. This paper presents an open-source OpenACC compiler, called OpenARC, which serves as an extensible research...
Much attention has been given to the efficient execution of the scale-out applications that dominate datacenter computing. However, the effects of the hardware support in the Memory Management Unit (MMU), in combination with the distinct characteristics of scale-out applications, have been largely ignored until recently. In this paper, we comprehensively quantify the MMU overhead on a real machine...
The Android operating system, the most popular platform for smartphones and tablet computers, has a distinctive memory management mechanism called the "low memory killer". When memory runs low, Android terminates processes until enough memory is available. It selects targets in order of pre-defined priority and memory consumption, regardless of their re-launching time, their...
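The selection policy described above can be sketched as follows. This is a simplified model of the low-memory-killer heuristic, not Android's kernel implementation; the field names (`adj`, `rss`) echo the kernel's terminology but the code is illustrative.

```python
def select_victim(processes, min_adj):
    """Pick the process the low-memory killer would terminate.

    processes: list of dicts with 'name', 'adj' (priority; higher
    means less important, e.g. cached background apps), and 'rss'
    (resident memory in pages).
    Among processes at or above the threshold priority, kill the one
    with the highest adj, breaking ties by largest memory footprint.
    """
    candidates = [p for p in processes if p['adj'] >= min_adj]
    if not candidates:
        return None  # nothing eligible at this pressure level
    return max(candidates, key=lambda p: (p['adj'], p['rss']))['name']
```

Note that re-launch cost plays no role in this ranking, which is exactly the shortcoming the abstract points at: a large but quickly-restarted process and a large, expensive-to-restart one are treated identically.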
To address the long training time of the support vector machine on large datasets, this paper develops an alternative approach, motivated by the radial basis function neural network (RBFNN), to partition the subset of support vectors for the SVM. The proposed method aims to obtain an optimal decision boundary based on the RBFNN, because of its good convergence and fast training. On the other hand, the...
The MapReduce paradigm is one of the best solutions for implementing distributed applications that perform intensive data processing. In terms of performance, this type of MapReduce application can be improved by adding GPU capabilities. In this context, GPU clusters for large-scale computing can bring a considerable increase in the efficiency and speedup of data-intensive applications...
Due to the diversity of processor architectures and application memory access patterns, the performance impact of using local memory in OpenCL kernels has become unpredictable. For example, enabling the use of local memory for an OpenCL kernel can be beneficial for the execution on a GPU, but can lead to performance losses when running on a CPU. To address this unpredictability, we propose an empirical...
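An empirical approach of the kind described boils down to timing each kernel variant on the target device and keeping the fastest. The sketch below shows that selection loop in the abstract; the harness name and calling convention are illustrative, and on a real system the callables would launch the local-memory and global-memory OpenCL kernel variants.

```python
import time

def pick_faster(variants, *args, reps=5):
    """Empirically choose the fastest variant for this device.

    variants: dict mapping a variant name to a callable (e.g. a
    wrapper that launches one compiled kernel version).
    Each variant is run `reps` times and the one with the lowest
    total wall-clock time wins.
    """
    best, best_t = None, float('inf')
    for name, fn in variants.items():
        t0 = time.perf_counter()
        for _ in range(reps):
            fn(*args)
        dt = time.perf_counter() - t0
        if dt < best_t:
            best, best_t = name, dt
    return best
```

Because the decision is made by measurement rather than by a device-specific heuristic, the same harness gives the right answer on a GPU (where local memory often helps) and on a CPU (where it often does not).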
Cloud computing is now being used by a wide variety of users, ranging from expert programmers and system administrators to scientists and laymen. Cloud providers exploit their resources as fully as they can. Memory is the most expensive resource in terms of oversubscription, which has resulted in high prices for end users. Furthermore, performing swapping in Virtual Machines...
Emerging non-volatile memory technologies are promising candidates for storage class memory, replacing hard disks and even DRAM. In this paper, we focus on the energy issue of Storage Class Memory (SCM) when its scalability and near-DRAM latency are exploited to provide a large-capacity memory system. SCM technologies such as PCM and RRAM incur high write energy and write-energy asymmetry. In particular, the energy of...
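A standard way to attack SCM write energy, useful for intuition here, is the well-known Flip-N-Write idea: before writing a word, compare it to what is stored and, if more than half the cells would change, store the bitwise complement plus a flip flag instead. This sketch is an illustration of that general SCM technique, not necessarily the paper's proposal.

```python
def flip_n_write(phys, flag, new):
    """Choose the cheaper of writing `new` directly or flipped.

    phys: bit list currently stored in the cells; flag: stored flip
    bit. The logical value is phys XOR flag. Returns (cells written,
    new physical bits, new flag); the write cost is bounded by
    len(new) // 2 + 1 cells.
    """
    best = None
    for f in (0, 1):
        cand = [b ^ f for b in new]                    # candidate physical word
        cost = (sum(p != c for p, c in zip(phys, cand))  # changed data cells
                + (f != flag))                           # flip-flag cell, if changed
        if best is None or cost < best[0]:
            best = (cost, cand, f)
    return best
```

Since PCM and RRAM only spend write energy on cells that actually change state, halving the worst-case number of flipped cells translates directly into lower write energy.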
Phase Change Memory (PCM) has been considered a leading candidate to replace traditional DRAM in embedded systems due to its promising characteristics such as low leakage power, low cost, non-volatility, and high scalability. One of the constraints that undermines PCM's credentials as main memory is its limited write endurance. In this paper, we develop wear-leveling techniques purely on...
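To make the wear-leveling goal concrete, here is a minimal table-based sketch: track per-line write counts and periodically swap the logical lines mapped to the hottest and coldest physical lines. This is a generic illustration of the technique, not the scheme the paper develops.

```python
class WearLeveler:
    """Toy remapping-table wear leveler for n memory lines."""

    def __init__(self, n):
        self.map = list(range(n))   # logical line -> physical line
        self.writes = [0] * n       # per-physical-line write count

    def write(self, logical):
        """Record a write and return the physical line it lands on."""
        phys = self.map[logical]
        self.writes[phys] += 1
        return phys

    def rebalance(self):
        """Swap the logical lines behind the most- and least-worn
        physical lines, spreading future writes across the device."""
        hot = max(range(len(self.writes)), key=self.writes.__getitem__)
        cold = min(range(len(self.writes)), key=self.writes.__getitem__)
        lh = self.map.index(hot)
        lc = self.map.index(cold)
        self.map[lh], self.map[lc] = self.map[lc], self.map[lh]
```

Real PCM wear levelers avoid a full indirection table (e.g. with algebraic remappings) precisely because the table itself costs space and writes, but the swap-hot-for-cold intent is the same.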
In this paper, we propose a new fast parallel sparse matrix-vector multiplication (SpMV) algorithm for GPU platforms. The new algorithm, called segSpMV, is based on the compressed sparse row (CSR) format and can be applied to a wide range of computational applications with both structured and unstructured matrices. The SpMV operation has a very low compute-to-communication ratio and is bandwidth-limited. The...
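For reference, the baseline CSR SpMV that segSpMV builds on looks like this (a scalar sketch, one row per loop iteration; on a GPU each row, or row segment, would be handled by a thread or warp):

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR format.

    values:  nonzero entries, row by row
    col_idx: column index of each nonzero
    row_ptr: row_ptr[i]..row_ptr[i+1] delimits row i's nonzeros
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]   # gather from x
        y[i] = acc
    return y
```

The irregular inner-loop length per row is what makes plain CSR hard to load-balance on a GPU; segmenting rows into fixed-size pieces, as the name segSpMV suggests, is one way to regularize the memory traffic.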
Exact values of spatial and temporal consumption are needed when judging the space and time complexities of an algorithm, but few researchers have paid attention to whether their measurement methods were valid. In this paper, we discuss some key concepts involved in monitoring a process's spatial and temporal consumption, and then explain and distinguish those concepts. Further,...
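As one concrete example of such monitoring, the sketch below measures a function's wall-clock time and peak Python-heap allocation. It is purely illustrative of the measurement problem; the paper's discussion concerns process-level metrics, where the choice of counter (heap vs. RSS vs. virtual size, wall vs. CPU time) is exactly what determines whether the measurement is valid.

```python
import time
import tracemalloc

def measure(fn, *args):
    """Run fn(*args), returning (result, elapsed_seconds, peak_bytes).

    elapsed is wall-clock time via a monotonic timer; peak_bytes is
    the peak memory allocated on the Python heap during the call.
    """
    tracemalloc.start()
    t0 = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak
```

Note the subtlety the abstract hints at: this reports heap allocation of the interpreter, not the operating system's view of the process, and the two can differ substantially.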
Embedded systems are constantly becoming more complex, as they are increasingly equipped with more functionality. Networking capability is one of the most desired features even for embedded systems, hence network applications, typically used in desktop systems, are required to become available in the embedded system domain. Rewriting these applications to fit into embedded root file systems takes...
Commodity graphics processing units (GPUs) have rapidly evolved into high-performance accelerators for data-parallel computing, through a large array of processing cores and the CUDA programming model with a C-like interface. However, optimizing an application for maximum performance based on the GPU architecture is not a trivial task, given the tremendous change from conventional multi-core to the...
In this paper, we present a novel lattice-based memory model called the max-plus projection autoassociative morphological memory (max-plus PAMM). The max-plus PAMM yields the largest max-plus combination of the stored patterns that is less than or equal to the input. Like the original autoassociative morphological memories (AMMs), it is idempotent and gives perfect recall of undistorted patterns...