Acceleration of cryptographic applications on Graphics Processing Unit (GPU) platforms is a research topic of practical interest, because these platforms provide huge computational power for this type of application. In this paper, we propose a parallel algorithm for Elliptic Curve (EC) point multiplication in order to compute EC cryptography on GPUs. The proposed approach relies on using the...
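The abstract above is truncated before the algorithm itself, but the core operation it parallelizes — EC point multiplication — can be sketched sequentially. The sketch below is a minimal reference version on a toy curve; the curve parameters, function names, and the double-and-add strategy are illustrative assumptions, not details taken from the paper:

```python
# Double-and-add scalar multiplication on a toy short-Weierstrass curve
# y^2 = x^3 + a*x + b over GF(p). The parameters are for demonstration
# only (far too small for real cryptography); None is the point at infinity.
P_MOD, A, B = 17, 2, 2

def ec_add(p1, p2):
    """Add two affine points on the curve (sequential reference version)."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:  # point doubling: tangent-line slope
        s = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:         # point addition: chord slope
        s = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def scalar_mult(k, point):
    """Compute k*point by scanning the bits of k (double-and-add)."""
    result, addend = None, point
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result
```

For example, with the generator (5, 1) on this textbook curve, `scalar_mult(2, (5, 1))` yields (6, 3); a GPU version would parallelize the underlying field arithmetic or process many multiplications concurrently.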
The Compute Unified Device Architecture (CUDA) is a new programming platform making use of the unified shader design of the most current Graphics Processing Units (GPUs) from NVIDIA. In this paper, we apply this revolutionary new technology to implement the automatic time gain compensation (ATGC) for medical ultrasound imaging. The parallel box filtering method and general matrix computation algorithms...
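The box filtering mentioned above parallelizes well because, with a prefix-sum (summed-area) table, every output element depends on only two table entries and can be computed independently. A minimal 1D sketch of that idea, with assumed function names (the paper's actual 2D CUDA kernel is not shown here):

```python
def box_filter(signal, radius):
    """1D box (moving-average) filter via a prefix-sum table.
    After the prefix pass, each output needs only two lookups,
    which is what makes the method easy to map onto GPU threads."""
    n = len(signal)
    prefix = [0.0] * (n + 1)
    for i, v in enumerate(signal):
        prefix[i + 1] = prefix[i] + v  # running sum of the input
    out = []
    for i in range(n):  # each iteration is independent of the others
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        out.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return out
```

For instance, `box_filter([1, 2, 3, 4, 5], 1)` averages each sample with its neighbours, shrinking the window at the edges.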
Ultrasound B-mode imaging is the basic image mode which can offer anatomic information of organs for clinical diagnosis. Because of the massive computation involved in baseband processing from focused radio-frequency (RF) signals followed by envelope detection, compression and scan conversion required for high quality B-mode imaging, existing medical systems always rely on complicated hardware in real...
Optimal heuristic searches such as A* search are commonly used for low-dimensional planning such as 2D path finding. These algorithms, however, typically do not scale well to high-dimensional planning problems such as motion planning for robotic arms, computing motion trajectories for non-holonomic robotic vehicles, and motion synthesis for humanoid characters. A recently developed randomized version...
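For reference, the low-dimensional baseline the abstract contrasts against — A* on a 2D grid — can be sketched compactly. This is a generic textbook version with an admissible Manhattan heuristic, not the paper's randomized variant:

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected grid; grid[r][c] == 1 marks an obstacle.
    Returns the length of a shortest path, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):  # Manhattan distance: admissible for unit-cost moves
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries: (f = g + h, g, cell)
    best_g = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g
        if g > best_g.get(cell, float("inf")):
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None
```

The priority queue and per-state bookkeeping here are exactly the structures that become hard to maintain efficiently as the state space grows high-dimensional.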
Exploiting the graphics processing unit (GPU) is useful for obtaining higher performance with a smaller number of host machines in grid systems. One problem in GPU-accelerated grid systems is the lack of efficient multitasking mechanisms. In this paper, we propose a cooperative multitasking method capable of simultaneous execution of a graphics application and a CUDA-based scientific application on a single...
Finite difference methods continue to provide an important and parallelisable approach to many numerical simulation problems. Iterative multigrid and multilevel algorithms can converge faster than ordinary finite difference methods but can be more difficult to parallelise. Data parallel paradigms tend to lend themselves particularly well to solving regular mesh PDEs whereby low latency communications...
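The data-parallel structure the abstract refers to is visible in the simplest relaxation scheme used inside multigrid: a Jacobi sweep, where every interior point updates independently from the previous iterate. A minimal 1D Poisson sketch, with assumed names and a unit grid spacing:

```python
def jacobi_step(u, f, h):
    """One Jacobi sweep for the 1D Poisson problem -u'' = f with fixed
    boundary values. Each interior update reads only the old iterate,
    so all points can be computed in parallel (one thread per point)."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return new

def solve(f, h, sweeps):
    """Repeated Jacobi sweeps from a zero initial guess."""
    u = [0.0] * len(f)
    for _ in range(sweeps):
        u = jacobi_step(u, f, h)
    return u
```

Multigrid accelerates exactly this kind of smoother by restricting the residual to coarser grids; the coarse-grid traffic is what complicates parallelisation.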
To exploit the benefits of throughput-optimized processors such as GPUs, applications need to be redesigned to achieve performance and efficiency. In this work, we present techniques to speed up statistical timing analysis on throughput processors. We draw upon advancements in improving the efficiency of Monte Carlo based statistical static timing analysis (MC SSTA) using techniques to reduce the...
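Since the abstract is cut off before the techniques themselves, here is a minimal sketch of what baseline Monte Carlo SSTA computes: draw one delay sample per gate, take the maximum over path sums, and average over samples. Every sample is independent, which is why the workload suits throughput processors. The data layout and function names are assumptions for illustration:

```python
import random

def mc_ssta_mean(paths, n_samples, seed=0):
    """Toy Monte Carlo statistical timing. Each path is a list of
    (gate_name, mean_delay, sigma) triples; within one sample, a gate
    shared by several paths gets a single delay draw. The circuit delay
    of a sample is the max over path sums; returns the sample mean."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        draw = {}  # one Gaussian delay draw per gate, per sample
        circuit_delay = 0.0
        for path in paths:
            d = 0.0
            for gate, mean, sigma in path:
                if gate not in draw:
                    draw[gate] = rng.gauss(mean, sigma)
                d += draw[gate]
            circuit_delay = max(circuit_delay, d)
        total += circuit_delay
    return total / n_samples
```

The efficiency techniques the paper draws on would reduce how many such samples (or how much per-sample work) are needed for a given accuracy.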
This paper deals with solving large instances of the Linear Sum Assignment Problem (LSAP) under real-time constraints, using Graphics Processing Units (GPUs). The motivating scenario is an industrial application for P2P live streaming that is moderated by a central tracker that is periodically solving LSAP instances to optimize the connectivity of thousands of peers. However, our findings are generic...
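For orientation, the objective an LSAP solver optimizes can be stated by exhaustive enumeration — a one-to-one assignment of rows to columns with minimum total cost. This brute-force sketch is only viable for tiny instances and is not the paper's GPU method; function and variable names are illustrative:

```python
from itertools import permutations

def lsap_bruteforce(cost):
    """Exact LSAP on an n x n cost matrix by trying every permutation
    (O(n!) -- reference/check code only). Returns (min_cost, assignment)
    where assignment[i] is the column matched to row i."""
    n = len(cost)
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best_cost:
            best_cost, best_perm = c, perm
    return best_cost, best_perm
```

Practical solvers (e.g. auction or Hungarian-style algorithms, which parallelize differently) reach the same optimum in polynomial time; this version is useful only to validate them on small inputs.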
The release of general purpose GPU programming environments has granted universal access to computing performance that was once available only to supercomputers. The availability of such computational power has fostered the creation and re-deployment of algorithms, new and old, creating entirely new classes of applications. In this paper, a GPU implementation of the Center-Surround Distribution...
Throughput and programmability have always been the central, but generally conflicting concerns for modern IP router designs. Current high performance routers depend on proprietary hardware solutions, which make it difficult to adapt to ever-changing network protocols. On the other hand, software routers offer the best flexibility and programmability, but could only achieve a throughput one order...
This paper explores the ability to use graphics processing units (GPUs) as co-processors to harness the inherent parallelism of batch operations in systems that require high performance. To this end we have chosen bloom filters (space-efficient data structures that support the probabilistic representation of set membership) as the queries these data structures support are often performed in batches...
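A Bloom filter's query path — k hash probes into a bit array, all independent — is what makes batches of membership tests attractive for a GPU co-processor. A minimal sketch, with assumed sizes and salted SHA-256 standing in for the k hash functions (the paper's parameters are not shown here):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k salted SHA-256 hashes set/probe bits in a
    fixed-size bit array. Queries can return false positives but never
    false negatives, and each probe is independent of the others."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # integer used as a bit array

    def _positions(self, item):
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all(self.bits >> pos & 1 for pos in self._positions(item))
```

A batched GPU version would hash many queries at once and test all their bit positions in parallel, which is precisely the access pattern the abstract targets.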
The strategy of using CUDA-compatible GPUs as a parallel computation solution to improve program performance has been increasingly widely adopted in the two years since the CUDA platform was released. Its benefit extends from the graphics domain to many other computationally intensive domains. Tiling, as the most general and important technique, is widely used for optimization in...
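The tiling idea can be shown even without a GPU: operate on small sub-blocks so that each block of data is reused while it is still close to the processor (in CUDA, staged into shared memory; on a CPU, kept in cache). A pure-Python blocked matrix multiply as a sketch of the loop structure only — names and the tile size are illustrative:

```python
def matmul_tiled(a, b, tile=2):
    """Blocked (tiled) matrix multiply C = A*B. The three outer loops
    walk tile-sized blocks; the inner loops reuse each a[i][k] across a
    whole tile row of B, the locality pattern tiling exists to create."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, m, tile):
            for jj in range(0, p, tile):
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, m)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + tile, p)):
                            c[i][j] += aik * b[k][j]
    return c
```

In a CUDA kernel the same blocking appears as `__shared__` staging of A and B tiles with a synchronization barrier between tiles; the arithmetic is unchanged.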
In recent years, with the development of GPUs, general-purpose computation on graphics processors has become a new field. Focusing on GPU processing, this paper provides a formal description of the data-parallel mode, a detailed description of the CUDA programming model, and the principles of optimization. Comparative experiments show that CUDA has a strong ability...
Scene recognition has become a notable field in the image processing area, and many methods have been proposed in recent years, among which the idea of extracting the scene gist from global features has been shown to achieve higher retrieval accuracy than many other methods. However, the process of extracting the gist is heavily time-consuming and not suitable for real-time applications. In this paper,...
Deep reactive ion etching (DRIE) technique is a new and powerful tool in Micro-Electro-Mechanical Systems (MEMS) fabrication. A 3D DRIE simulation can help researchers understand the time-evolution of the Bosch process used in DRIE. Due to the high complexity of the algorithm used in the simulation, it is necessary to develop an algorithm that can accelerate the simulation. This paper presents a parallel...
Driven by the insatiable demand for real-time graphics, especially from the computer games market, the graphics processing unit (GPU) has become a major source of computing horsepower in recent years, as GPU performance has surpassed that of the contemporary CPU. This paper presents our study on how to efficiently recover the passwords for encrypted RAR files. Our research focus is on the AES key...
This paper presents the essence of high performance computing (HPC) in the field of computational nanotechnology and the problems encountered in applying HPC arrangements to nano-enabled calculations. A proposal to optimize computations in an HPC setup has been formulated to make nanotechnology computations more effective and realistic on a CUDA-based framework. Results and findings in...
Sparse matrix-vector multiplication (SpMV) is a common operation in numerical linear algebra and is the computational kernel of many scientific applications. It is one of the original and perhaps most studied targets for FPGA acceleration. Despite this, GPUs, which have only recently gained both general-purpose programmability and native support for double precision floating-point arithmetic, are...
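The SpMV kernel at issue is compact enough to state directly. A pure-Python sketch of y = A·x with A in the common compressed sparse row (CSR) format — the one-row-per-thread mapping in the comment is the usual first GPU strategy, not necessarily the one this paper evaluates:

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product y = A*x, A stored in CSR:
    values[k] is a nonzero, col_idx[k] its column, and
    row_ptr[r]..row_ptr[r+1] bounds row r's nonzeros. Each row's dot
    product is independent -- the natural one-thread-per-row GPU mapping."""
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y
```

Irregular row lengths make this memory-bound and load-imbalanced, which is why SpMV remains a stress test for both FPGA and GPU accelerators.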
Graphics processing units (GPUs) have been widely used to accelerate algorithms that exhibit massive data parallelism or task parallelism. When such parallelism is not inherent in an algorithm, computational scientists resort to simply replicating the algorithm on every multiprocessor of a NVIDIA GPU, for example, to create such parallelism, resulting in embarrassingly parallel ensemble runs that...
Modern graphics processing units (GPUs) are characterized by programmability, a high price/performance ratio, and high speed, and are well suited to parallel calculation. Based on this, the article studies general methods of GPU computing and uses the Compute Unified Device Architecture (CUDA) to design new parallel algorithms that accelerate matrix inversion and a binarization algorithm...
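Of the two accelerated kernels, binarization is the simplest to sketch: every pixel is thresholded independently, so the loop maps one-to-one onto GPU threads. A minimal global-threshold version with assumed names (the article's exact scheme is truncated above):

```python
def binarize(image, threshold=None):
    """Global-threshold binarization of a grayscale image given as a
    list of rows. If no threshold is supplied, the mean intensity is
    used. Each output pixel depends only on its input pixel, so the
    comprehension below is trivially data-parallel."""
    flat = [p for row in image for p in row]
    if threshold is None:
        threshold = sum(flat) / len(flat)  # simple adaptive choice
    return [[1 if p >= threshold else 0 for p in row] for row in image]
```

Matrix inversion, the other kernel, is much less embarrassingly parallel (its eliminations carry data dependences between steps), which is presumably where the article's algorithm design effort lies.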