Search results

Items from 1 to 20 out of 69 results

chapter

Implementation and Performance of a GPU-Based Monte-Carlo Framework for Determining Design Ice Load

Sara Ayubian, Shadi Alawneh, Martin Richard, Jan Thijssen

2017 International Conference on High Performance Computing & Simulation (HPCS) > 109 - 116

2017 International Conference on High Performance Computing & Simulation (HPCS)

Modern Graphics Processing Units (GPUs) with massive number of threads and many-core architecture support both graphics and general purpose computing. NVIDIA's compute unified device architecture (CUDA) takes advantage of parallel computing and utilizes the tremendous power of GPUs. The present study demonstrates a high performance computing (HPC) framework for a Monte-Carlo simulation to determine...

chapter

A roadmap of parallel sorting algorithms using GPU computing

Neetu Faujdar, Shipra Saraswat

2017 International Conference on Computing, Communication and Automation (ICCCA) > 736 - 741

2017 International Conference on Computing, Communication and Automation (ICCCA)

In today's world, sorting is a basic need and appropriate method starts with searching. Several sorting algorithms has been developed on CPU (Central Processing Unit). But according to current scenario, CPU is not so efficient in sorting. To get the more speedup of sorting algorithms parallelization should be done. There are many ways of Parallelizing sorting methods which can be performed by using...

chapter

Performance improvement of CUDA applications by reducing CPU-GPU data transfer overhead

N. V. Sunitha, K. Raju, Niranjan N. Chiplunkar

2017 International Conference on Inventive Communication and Computational Technologies (ICICCT) > 211 - 215

2017 International Conference on Inventive Communication and Computational Technologies (ICICCT)

In a CPU-GPU based heterogeneous computing system, the input data to be processed by the kernel resides in the host memory. The host and the device memory address spaces are different. Therefore, the device can not directly access the host memory. In CUDA programming model, the data is moved between the host memory and the device memory. This data transfer is a time consuming task. The communication...

chapter

Performance analysis of basic image processing algorithms on GPU

Vigneswaran Saahithyan, Somaskandan Suthakar

2017 International Conference on Inventive Systems and Control (ICISC) > 1 - 6

2017 International Conference on Inventive Systems and Control (ICISC)

Image processing could be done in CPU or in Graphical Processing Unit (GPU), using sequential programming or parallel programming respectively. Sequential and parallel programming are good in their own paradigm. This paper analyses the performances of various basic image processing algorithms on GPU as well as CPU. Various images with a range of dimensions have been used for the testing purpose. The...

chapter

Parallel implementation of Sobel filter using CUDA

Hana Ben Fredj, Mouna Ltaif, Anis Ammar, Chokri Souani

2017 International Conference on Control, Automation and Diagnosis (ICCAD) > 209 - 212

2017 International Conference on Control, Automation and Diagnosis (ICCAD)

Efficient solutions must be considered, in order to solve the problem of intensive computing of the image processing applications and to achieve high real-time performance. The graphics processing unit (GPU) is an effective and the most recent method used for accelerating extensive calculation algorithms to reduce the execution time by exploiting the power of parallel programming techniques and to...

chapter

GPU implementation of all pairs shortest path algorithm for graphs using triangular matrix method

S. Umamaheswari, G. Abisheik

2016 Eighth International Conference on Advanced Computing (ICoAC) > 218 - 223

2016 Eighth International Conference on Advanced Computing (ICoAC)

In various applications where the problem domain can be modeled into graphs, the shortest path computation in the graph is an indispensable challenge. In applications like online social networks and shortest route computation problems, the size of the graph is so large; the number of nodes have become close to hundreds of billions. Shortest path graph algorithms like SSSP (Single Source Shortest Path)...

chapter

Pattern classification using updated fuzzy hyper-line segment neural network and it's GPU parallel implementation for large datasets using CUDA

Priyadarshan Dhabe, Prashant Vyas, Devrat Ganeriwal, Aditya Pathak

2016 International Conference on Computing, Analytics and Security Trends (CAST) > 24 - 29

2016 International Conference on Computing, Analytics and Security Trends (CAST)

Fuzzy hyper-line segment neural network (FHLSNN) is a hybrid system of fuzzy logic and neural network and is used for pattern classification. It learns patterns in terms of n-dimensional hyper line segment (HLS). Modified fuzzy hyperline segment neural network (MFHLSNN) is a modified version of FHLSNN that improves the quality of reasoning and recall time per pattern using modified fuzzy membership...

chapter

Accelerated Processing Unit (APU) potential: N-body simulation case study

Hassan Youness, Mohamed Moness, Omar Shaaban, Aziza I. Hussein

2016 11th International Conference on Computer Engineering & Systems (ICCES) > 110 - 115

2016 11th International Conference on Computer Engineering & Systems (ICCES)

This paper investigates and studies the acceleration of irregular/regular algorithms via Integrate Graphic Processing Unit (Integrated GPU) known as Accelerated Processing Unit (APU) that is fused on the same die with the CPU, and Discrete Graphic Processing Unit (GPU), while answering the question of How potential is the APU for applications with irregular data structures such as trees knowing that...

chapter

Understanding Error Propagation in GPGPU Applications

Guanpeng Li, Karthik Pattabiraman, Chen-Yang Cher, Pradip Bose

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 240 - 251

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis

GPUs have emerged as general-purpose accelerators in high-performance computing (HPC) and scientific applications. However, the reliability characteristics of GPU applications have not been investigated in depth. While error propagation has been extensively investigated for non-GPU applications, GPU applications have a very different programming model which can have a significant effect on error propagation...

chapter

Efficient implementation of sobel filter based on GPUs cards

Mouna Afif, Yahia Said, Haythem Bahri, Mohamed Atri

2016 International Image Processing, Applications and Systems (IPAS) > 1 - 4

2016 International Image Processing, Applications and Systems (IPAS)

The Graphics processors or GPUs have become in a few years powerful tools for applications that require a massively parallel computing. Currently include the applications in multimedia processing, the engineering science and image processing in real time. They offer many advantages such as acceleration of treatment and down energy consumption from an equivalent CPU power. In this paper, we will show...

article

Multilayer Packet Classification With Graphics Processing Units

Matteo Varvello, Rafael Laufer, Feixiong Zhang, T. V. Lakshman

IEEE/ACM Transactions on Networking > 2016 > 24 > 5 > 2728 - 2741

The rapid growth of server virtualization has ignited a wide adoption of software-based virtual switches, with significant interest in speeding up their performance. In a similar trend, software-defined networking (SDN), with its strong reliance on rule-based flow classification, has also created renewed interest in multi-dimensional packet classification. However, despite these recent advances, the...

chapter

CUDA implementation of an optimal online Gaussian-Signal-in-Gaussian-Noise detector

Nir Nossenson, Ariel J. Jaffe

2016 IEEE High Performance Extreme Computing Conference (HPEC) > 1 - 7

2016 IEEE High Performance Extreme Computing Conference (HPEC)

We address the computationally demanding task of real time optimal detection of a Gaussian Signal in Gaussian Noise. The mathematical principles of such a detector were formulated in 1965, but a full real-time implementation of these principles was not possible for decades mainly due to technological barriers. We present a CUDA based implementation of such an optimal detector and study its decision...

chapter

Generalized and hybrid fast-ICA implementation using GPU

Titus Nanda Kumara, Hasindu Gamaarachchi, Geesara Prathap, Roshan Ragel

2016 Sixteenth International Conference on Advances in ICT for Emerging Regions (ICTer) > 13 - 20

2016 Sixteenth International Conference on Advances in ICT for Emerging Regions (ICTer)

Independent Component Analysis is proposed as a solution to the Blind Source Separation problem. Among many of its realizations such as Infomax-ICA, Fast-ICA, and EASI-ICA, the Fast-ICA algorithm is the most famous and considered to be computationally the most efficient. Although the most capable, Fast-ICA still consumes a considerable amount of time on CPUs in real world implementations. Therefore,...

chapter

CUDA Acceleration for AVS2 Loop Filtering

Songyi Li, Ronggang Wang, Kaili Yao

2016 IEEE Second International Conference on Multimedia Big Data (BigMM) > 246 - 250

2016 IEEE Second International Conference on Multimedia Big Data (BigMM)

Parallel computing platforms integrating CPU cores and mass of GPU accelerators have established in several application domains, obtaining remarkable time saving. In this way, video decoders can exploit a broader design space, to take full advantages of the hybrid GPU and CPU computing framework. Several novel contributions that aim at the exploitation of the maximum parallelism level in an AVS2 filtering...

chapter

Real-time CPU-GPU demodulator for the LTE physical layer

Ouajdi Brini, Mounir Boukadoum

2016 IEEE 7th Latin American Symposium on Circuits & Systems (LASCAS) > 155 - 158

2016 IEEE 7th Latin American Symposium on Circuits & Systems (LASCAS)

Since the emergence of large public networks in the 80's, wireless communication protocols have been evolving constantly, forcing frequent changes to the hardware of base stations. This has triggered a lot of research about implementing network functions in software, especially those of the physical layer, in order to allow the use of generic processors in the base stations. However, achieving this...

article

Energy Efficient Iris Recognition With Graphics Processing Units

Ryan Rakvic, Randy Broussard, Hau Ngo

IEEE Access > 2016 > 4 > 2831 - 2839

For the past 40 years, Moore’s law has predicted the rapid growth of the computer industry. In the past few years, however, this growth has slowed for central processing units (CPUs). Instead, there has been a shift to multicore computing, specifically with the general purpose graphic processing units (GPUs). Conventional CPUs have between two and eight cores, but the GPUs can have hundreds, even...

chapter

A General Accelerated R Package Using GPU

Jie Huang, Bojin Zhuang, Fei Su

2015 IEEE International Conference on Smart City/SocialCom/SustainCom (SmartCity) > 605 - 608

2015 IEEE International Conference on Smart City/SocialCom/SustainCom (SmartCity)

In this paper, we present cuLib, a R package that provides an easy-to-access interface for utilizing the computing power of NVIDIA GPU. The cuLib package aims to make GPU-based parallel programming easier, flexible and high-performance. It allows the use of GPU computing in R without further knowledge because the syntax for definition and manipulation of GPU data is similar to formal R language. cuLib...

chapter

GPU acceleration of real time Viola-Jones face detection

Adrian Wong Yoong Wai, Shahirina Mohd Tahir, Yoong Choon Chang

2015 IEEE International Conference on Control System, Computing and Engineering (ICCSCE) > 183 - 188

2015 IEEE International Conference on Control System, Computing and Engineering (ICCSCE)

Face detection is a stepping stone to all facial processing systems such as face recognition with the task of determining face region from the input frame for applications like surveillance and law enforcement. However, face detection is a computational expensive process and thus, with acceleration it can influence the performance of the system. The latest Graphics Processing Unit (GPU) technology...

chapter

Join algorithms on GPUs: A revisit after seven years

Ran Rui, Hao Li, Yi-Cheng Tu

2015 IEEE International Conference on Big Data (Big Data) > 2541 - 2550

2015 IEEE International Conference on Big Data (Big Data)

Implementing database operations on parallel platforms has gain a lot of momentum in the past decade. A number of studies have shown the potential of using GPUs to speed up database operations. In this paper, we present empirical evaluations of a state-of-the-art work published in SIGMOD'08 on GPU-based join processing. In particular, this work presents four major join algorithms and a number of join-related...

chapter

CPU+GPU Programming of Stencil Computations for Resource-Efficient Use of GPU Clusters

Mohammed Sourouri, Johannes Langguth, Filippo Spiga, Scott B. Baden

2015 IEEE 18th International Conference on Computational Science and Engineering > 17 - 26

2015 IEEE 18th International Conference on Computational Science and Engineering (CSE)

On modern GPU clusters, the role of the CPUs is often restricted to controlling the GPUs and handling MPI communication. The unused computing power of the CPUs, however, can be considerable for computations whose performance is bounded by memory traffic. This paper investigates the challenges of simultaneous usage of CPUs and GPUs for computation. Our emphasis is on deriving a heterogeneous CPU+GPU...

Keywords:
CENTRAL PROCESSING UNIT

Publication type

book (63)
article (6)

Keywords

GRAPHICS PROCESSING UNITS (38)
GPU (33)
GRAPHICS PROCESSING UNIT (28)
INSTRUCTION SETS (21)
KERNEL (19)
COMPUTER ARCHITECTURE (18)
PARALLEL PROCESSING (16)
GPGPU (15)
COPROCESSORS (14)
ACCELERATION (10)
COMPUTER GRAPHIC EQUIPMENT (10)
ALGORITHM DESIGN AND ANALYSIS (9)
COMPUTATIONAL MODELING (9)
HARDWARE (9)
CPU (6)
GRAPHICS (6)
LIBRARIES (6)
PARALLEL COMPUTING (6)
PROGRAMMING (6)
YARN (6)
COMPUTER GRAPHICS (5)
CONCURRENT COMPUTING (5)
GPU COMPUTING (5)
MATHEMATICAL MODEL (5)
COMPUTE UNIFIED DEVICE ARCHITECTURE (4)
CRYPTOGRAPHY (4)
DATA MINING (4)
DATABASES (4)
OPENMP (4)
PARALLEL ARCHITECTURES (4)
PERFORMANCE EVALUATION (4)
RANDOM ACCESS MEMORY (4)
DECODING (3)
DETECTORS (3)
GRAPHICAL PROCESSING UNIT (3)
GRAPHICS PROCESSING UNIT (GPU) (3)
IMAGE RECONSTRUCTION (3)
MULTICORE PROCESSING (3)
OPENCL (3)
PARALLEL ALGORITHM (3)
PARALLEL ALGORITHMS (3)
SIGNAL PROCESSING (3)
THROUGHPUT (3)
APPLICATION SOFTWARE (2)
BENCHMARK TESTING (2)
BIOINFORMATICS (2)
CLASSIFICATION ALGORITHMS (2)
CLUSTERING ALGORITHMS (2)
COMPACTION (2)
COMPUTATIONAL COMPLEXITY (2)
COMPUTER GAMES (2)
COMPUTER LANGUAGES (2)
CORRELATION (2)
CORRELATION METHODS (2)
DATA STRUCTURES (2)
DIGITAL SIGNAL PROCESSING (2)
EQUATIONS (2)
FINITE DIFFERENCE METHODS (2)
FINITE DIFFERENCE TIME-DOMAIN ANALYSIS (2)
GAMES (2)
GRAPHIC PROCESSING UNIT (2)
IMAGE CODING (2)
IMAGE PROCESSING (2)
MEMORY MANAGEMENT (2)
MULTI-THREADING (2)
MULTICORE CPU (2)
PARALLEL (2)
PARALLEL COMPUTATION (2)
PARALLEL PROGRAMMING (2)
PARTICLE SWARM OPTIMIZATION (2)
PATTERN CLASSIFICATION (2)
PEARSON CORRELATION COEFFICIENT (2)
PIPELINES (2)
POSITRON EMISSION TOMOGRAPHY (2)
POWER DEMAND (2)
PROGRAM PROCESSORS (2)
REAL TIME SYSTEMS (2)
REAL-TIME (2)
REAL-TIME SYSTEMS (2)
RENDERING (COMPUTER GRAPHICS) (2)
ROBOTS (2)
SOFTWARE ALGORITHMS (2)
SPARSE MATRICES (2)
TRANSFORMS (2)
VECTORS (2)
3-D GAMING INDUSTRY (1)
3D RECONSTRUCTION (1)
3D VIDEO CONFERENCE SYSTEMS (1)
4G (1)
5G (1)
ACCELERATED FAST-ICA (1)
ACCELERATION TECHNIQUE (1)
ACCURACY (1)
ACTIVE MEMBRANE SYSTEM (1)
ADVANCED ENCRYPTION STANDARD (1)
AES (1)
AGENT-BASED (1)
AHO-CORASICK ALGORITHM (1)

INFONA - science communication portal
