Search results

Items 1 to 20 of 303 results

Chapter

GScheduler: Optimizing resource provision by using GPU usage pattern extraction in cloud environments

Zhuqing Xu, Fang Dong, Jiahui Jin, Junzhou Luo, et al.

2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 3225-3230

GPU-based clusters are widely chosen for accelerating a variety of scientific applications in high-end cloud environments. With their growing popularity, there is a necessity for improving the system throughput and decreasing the turnaround time for co-executing applications on the same GPU device. However, resource contention among multiple applications on a multi-tasked GPU leads to the performance...

Chapter

General-purpose computing on GPU: Pixel processing

Milos Ockay

2017 Communication and Information Technologies (KIT), pp. 1-4

Presented paper explains general purpose approach to the parallel pixel processing on GPU. It presents essential dataset structuring, correct type assignment and kernel configuration for CUDA application interface. Paper also explains data movement and optimal computation saturation. Transfers are also analyzed in correlation with the computation especially for the embarrassingly parallel problem...

Chapter

Autotuning GPU Kernels via Static and Predictive Analysis

Robert Lim, Boyana Norris, Allen Malony

2017 46th International Conference on Parallel Processing (ICPP), pp. 523-532

Optimizing the performance of GPU kernels is challenging for both human programmers and code generators. For example, CUDA programmers must set thread and block parameters for a kernel, but might not have the intuition to make a good choice. Similarly, compilers can generate working code, but may miss tuning opportunities by not targeting GPU models or performing code transformations. Although empirical...

Chapter

Overlapping Data Transfers with Computation on GPU with Tiles

Burak Bastem, Didem Unat, Weiqun Zhang, Ann Almgren, et al.

2017 46th International Conference on Parallel Processing (ICPP), pp. 171-180

GPUs are employed to accelerate scientific applications; however, they require much more programming effort from the programmers, particularly because of the disjoint address spaces between the host and the device. OpenACC and OpenMP 4.0 provide directive-based programming solutions to alleviate the programming burden; however, synchronous data movement can create a performance bottleneck in fully taking...

Chapter

A CUDA-based parallel adaptive dynamic programming algorithm

Lu Li, Xin Chen, Wei Wang

2017 36th Chinese Control Conference (CCC), pp. 3510-3515

Adaptive Dynamic Programming (ADP) with critic-actor architecture is a useful way to achieve online learning control. The algorithm Gaussian-Kernel Adaptive Dynamic Programming (GK-ADP) that has been developed before has a kind of two-phase iteration, which not only approximates value function, but also optimizes hyper-parameters simultaneously. However, just like most iteration algorithms are applied...

Chapter

GPU-based coevolutionary particle swarm optimization

Zhao Liang, Zhu Yanxing, Zhang Jianyu, Ye Zhencheng

2017 36th Chinese Control Conference (CCC), pp. 9883-9887

Coevolutionary particle swarm optimization (CPSO) algorithm has been investigated and applied in the real world widely. When tackling the large-scale and complex real time optimization problems, the running time of CPSO algorithm is a barrier. In this paper, Graphics Processing Unit (GPU) is introduced to provide speedup in order to meet the real time requirements. The CPSO algorithm has been implemented...

Chapter

GPU accelerated foreground segmentation using CodeBook model and shadow removal using CUDA

Praveen Gudivaka, Nayaneesh Mishra, Anupam Agrawal

2017 International Conference on Computing, Communication and Automation (ICCCA), pp. 765-770

Background Subtraction is the major important step in many image processing applications which can be applied in much of video surveillances. The major result of this method is accuracy as well as processing time. So we mainly focused on these two challenges. We parallelized the Two Layered CodeBook Model on Graphical Processing Unit (GPU) for increasing the processing speed and the accuracy of the...

Chapter

Performance improvement of CUDA applications by reducing CPU-GPU data transfer overhead

N. V. Sunitha, K. Raju, Niranjan N. Chiplunkar

2017 International Conference on Inventive Communication and Computational Technologies (ICICCT), pp. 211-215

In a CPU-GPU based heterogeneous computing system, the input data to be processed by the kernel resides in the host memory. The host and the device memory address spaces are different. Therefore, the device can not directly access the host memory. In CUDA programming model, the data is moved between the host memory and the device memory. This data transfer is a time consuming task. The communication...

Chapter

GPU implementation of all pairs shortest path algorithm for graphs using triangular matrix method

S. Umamaheswari, G. Abisheik

2016 Eighth International Conference on Advanced Computing (ICoAC), pp. 218-223

In various applications where the problem domain can be modeled into graphs, the shortest path computation in the graph is an indispensable challenge. In applications like online social networks and shortest route computation problems, the size of the graph is so large; the number of nodes have become close to hundreds of billions. Shortest path graph algorithms like SSSP (Single Source Shortest Path)...

Chapter

To use or not to use: CPUs' cache optimization techniques on GPGPUs

D.R.V.L.B. Thambawita, Roshan G. Ragel, Dhammike Elkaduwe

2016 IEEE International Conference on Information and Automation for Sustainability (ICIAfS), pp. 1-6

General Purpose Graphic Processing Unit (GPGPU) is used widely for achieving high performance or high throughput in parallel programming. This capability of GPGPUs is very famous in the new era and mostly used for scientific computing which requires more processing power than normal personal computers. Therefore, most of the programmers, researchers and industry use this new concept for their work...

Chapter

GPU-Accelerated Solution of Activated Sludge Model's System of ODEs with a High Degree of Stiffness

Jamal Alikhani, Arash Massoudieh, Ujjal K. Bhowmik

2016 International Conference on Computational Science and Computational Intelligence (CSCI), pp. 555-560

Simulation of activated sludge model (ASM) including detailed biokinetic reaction network often requires the solution of a large system of ordinary differential equations (ODEs) at each time frame, which requires long computing times. In this study, an adaptive time step backward differentiation formula (BDF) is proposed to solve the ASM's system of ODEs that mainly contains a high degree of stiffness...

Chapter

Histogram optimization with CUDA

Keh Kok Yong, Sheera Shaheera Othman Talib

2016 IEEE Industrial Electronics and Applications Conference (IEACon), pp. 312-318

Histogram is a popular analytic graphical representation of data distribution resulting from processing a given numerical input data. Although the sequential histogram computation may be simple, it is no longer suitable in processing high volume of data. With recent advancement of high performance computing (HPC), aided by the accelerating growth of General Purpose Graphic Processing Unit (GPGPU),...

Chapter

GPU implementation of multi-scale Retinex image enhancement algorithm

Hui Li, Weihao Xie, Xingang Wang, Shousheng Liu, et al.

2016 IEEE/ACS 13th International Conference of Computer Systems and Applications (AICCSA), pp. 1-5

Multi-scale Retinex algorithm is an image enhancement algorithm that aims at image reconstruction. The algorithm maintains the high fidelity and the dynamic range compression of the image, so the enhancement effect is obvious. The algorithm exploits a large number of convolution operations to achieve dynamic range compression and color/brightness rendition, and the calculation time increased significantly...

Chapter

MetaMorph: A Library Framework for Interoperable Kernels on Multi- and Many-Core Clusters

Ahmed E. Helal, Virginia Tech, Paul Sathre, Wu-chun Feng

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 119-129

To attain scalable performance efficiently, the HPC community expects future exascale systems to consist of multiple nodes, each with different types of hardware accelerators. In addition to GPUs and Intel MICs, additional candidate accelerators include embedded multiprocessors and FPGAs. End users need appropriate tools to efficiently use the available compute resources in such systems, both within...

Chapter

Understanding Error Propagation in GPGPU Applications

Guanpeng Li, Karthik Pattabiraman, Chen-Yong Cher, Pradip Bose

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 240-251

GPUs have emerged as general-purpose accelerators in high-performance computing (HPC) and scientific applications. However, the reliability characteristics of GPU applications have not been investigated in depth. While error propagation has been extensively investigated for non-GPU applications, GPU applications have a very different programming model which can have a significant effect on error propagation...

Chapter

Satellite image processing on parallel computing: A technical review

Snehal B. Buche, Shweta A. Dhondse, Anand N. Khobragade

2016 Online International Conference on Green Engineering and Technologies (IC-GET), pp. 1-9

Image classification is one of the important processing steps performed on satellite images. Many algorithms have been proposed for such classification, of which Support Vector Machine (SVM) is the most used. Many variants and approaches of SVM have been proposed, of which GA based classifiers show better prospects. But the increasing size, spectrum and multiple dimensions of remote sensing data have made the image processing problem more...

Chapter

Parallelization of GST algorithm for source code similarity detection

Marko J. Misic, Dusan V. Nikolov, Jelica Z. Protic, Milo V. Tomasevic

2016 24th Telecommunications Forum (TELFOR), pp. 1-4

Source code is a frequent target for plagiarism in massive computing courses. Plagiarism detection requires a significant effort from the teaching staff, thus software tools have been used to detect similar source codes. This paper examines parallelization of source code similarity detection based on Greedy-String-Tiling and Karp-Rabin algorithms. CPU implementation is parallelized using Pthreads,...

Chapter

How to Speed Up CUDA-WSat-PcL by 5x

Heng Liu, Arrvindh Shriraman, Evgenia Ternovska

2016 Fourth International Symposium on Computing and Networking (CANDAR), pp. 462-468

The Propositional Satisfiability Problem (SAT) is one of the most fundamental NP-complete problems, and is central to many domains of computer science. Utilizing a massively parallel architecture on a Graphics Processing Unit (GPU) together with a conventional CPU on NVIDIA's Compute Unified Device Architecture (CUDA) platform, this work proposes an efficient scheme to implement one parallel Stochastic...

Chapter

A Gb/s parallel block-based Viterbi decoder for convolutional codes on GPU

Hao Peng, Rongke Liu, Yi Hou, Ling Zhao

2016 8th International Conference on Wireless Communications & Signal Processing (WCSP), pp. 1-6

In this paper, we propose a parallel block-based Viterbi decoder (PBVD) on the graphic processing unit (GPU) platform for the decoding of convolutional codes. The decoding procedure is simplified and parallelized, and the characteristic of the trellis is exploited to reduce the metric computation. Based on the compute unified device architecture (CUDA), two kernels with different parallelism are designed...

Chapter

Research of parallel dehazing using temporal coherence algorithm based on CUDA

Yanwen Gu, Xiaogang Zhang

2016 IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), pp. 56-61

It makes the haze removal in real-time by CUDA based on the atmospheric scattering model and temporal coherence algorithm. Firstly, a hierarchical search method based on four fork tree subdivision replaced the original algorithm to obtain the atmospheric light, and put the number of pixels as the number of parallel threads, which processes the required calculation of pixels, the intermediate results...

Search keywords: KERNEL, CUDA


