Modern Graphics Processing Units (GPUs), with a massive number of threads and a many-core architecture, support both graphics and general-purpose computing. NVIDIA's Compute Unified Device Architecture (CUDA) takes advantage of parallel computing and utilizes the tremendous power of GPUs. The present study demonstrates a high performance computing (HPC) framework for a Monte-Carlo simulation to determine...
In today's world, sorting is a basic need, and an appropriate method starts with searching. Several sorting algorithms have been developed for the CPU (Central Processing Unit). In the current scenario, however, the CPU is not very efficient at sorting. To obtain greater speedup, sorting algorithms should be parallelized. There are many ways of parallelizing sorting methods, which can be performed by using...
In a CPU-GPU based heterogeneous computing system, the input data to be processed by the kernel resides in the host memory. The host and the device memory address spaces are different. Therefore, the device cannot directly access the host memory. In the CUDA programming model, the data is moved between the host memory and the device memory. This data transfer is a time-consuming task. The communication...
Image processing can be done on the CPU or on the Graphics Processing Unit (GPU), using sequential programming or parallel programming respectively. Sequential and parallel programming are each good in their own paradigm. This paper analyses the performance of various basic image processing algorithms on the GPU as well as the CPU. Various images with a range of dimensions have been used for testing purposes. The...
Efficient solutions are needed to handle the intensive computation of image processing applications and to achieve high real-time performance. The graphics processing unit (GPU) is an effective and recent means of accelerating computationally intensive algorithms to reduce the execution time by exploiting the power of parallel programming techniques and to...
In various applications where the problem domain can be modeled as a graph, shortest path computation in the graph is an indispensable challenge. In applications like online social networks and shortest route computation problems, the graph is very large; the number of nodes has approached hundreds of billions. Shortest path graph algorithms like SSSP (Single Source Shortest Path)...
The fuzzy hyperline segment neural network (FHLSNN) is a hybrid system of fuzzy logic and neural networks and is used for pattern classification. It learns patterns in terms of n-dimensional hyperline segments (HLS). The modified fuzzy hyperline segment neural network (MFHLSNN) is a modified version of FHLSNN that improves the quality of reasoning and recall time per pattern using modified fuzzy membership...
This paper investigates the acceleration of irregular and regular algorithms via the Integrated Graphics Processing Unit (Integrated GPU), known as the Accelerated Processing Unit (APU), which is fused on the same die with the CPU, and via the Discrete Graphics Processing Unit (GPU), while answering the question of how much potential the APU has for applications with irregular data structures such as trees, knowing that...
GPUs have emerged as general-purpose accelerators in high-performance computing (HPC) and scientific applications. However, the reliability characteristics of GPU applications have not been investigated in depth. While error propagation has been extensively investigated for non-GPU applications, GPU applications have a very different programming model which can have a significant effect on error propagation...
Within a few years, graphics processors (GPUs) have become powerful tools for applications that require massively parallel computing. Current applications include multimedia processing, engineering science, and real-time image processing. GPUs offer many advantages, such as faster processing and lower energy consumption than a CPU of equivalent power. In this paper, we will show...
The rapid growth of server virtualization has ignited a wide adoption of software-based virtual switches, with significant interest in speeding up their performance. In a similar trend, software-defined networking (SDN), with its strong reliance on rule-based flow classification, has also created renewed interest in multi-dimensional packet classification. However, despite these recent advances, the...
We address the computationally demanding task of real-time optimal detection of a Gaussian signal in Gaussian noise. The mathematical principles of such a detector were formulated in 1965, but a full real-time implementation of these principles was not possible for decades, mainly due to technological barriers. We present a CUDA-based implementation of such an optimal detector and study its decision...
Independent Component Analysis is proposed as a solution to the Blind Source Separation problem. Among its many realizations, such as Infomax-ICA, Fast-ICA, and EASI-ICA, the Fast-ICA algorithm is the most famous and is considered to be computationally the most efficient. Although the most capable, Fast-ICA still consumes a considerable amount of time on CPUs in real-world implementations. Therefore,...
Parallel computing platforms that integrate CPU cores with large numbers of GPU accelerators have become established in several application domains, achieving remarkable time savings. In this way, video decoders can exploit a broader design space to take full advantage of the hybrid GPU and CPU computing framework. Several novel contributions that aim at the exploitation of the maximum parallelism level in an AVS2 filtering...
Since the emergence of large public networks in the 1980s, wireless communication protocols have been evolving constantly, forcing frequent changes to the hardware of base stations. This has triggered a lot of research on implementing network functions in software, especially those of the physical layer, in order to allow the use of generic processors in the base stations. However, achieving this...
For the past 40 years, Moore’s law has predicted the rapid growth of the computer industry. In the past few years, however, this growth has slowed for central processing units (CPUs). Instead, there has been a shift to multicore computing, specifically with general-purpose graphics processing units (GPUs). Conventional CPUs have between two and eight cores, but GPUs can have hundreds, even...
In this paper, we present cuLib, an R package that provides an easy-to-access interface for utilizing the computing power of NVIDIA GPUs. The cuLib package aims to make GPU-based parallel programming easier, more flexible, and high-performance. It allows the use of GPU computing in R without further knowledge, because the syntax for defining and manipulating GPU data is similar to that of the formal R language. cuLib...
Face detection is a stepping stone to all facial processing systems, such as face recognition, with the task of determining the face region in the input frame for applications like surveillance and law enforcement. However, face detection is a computationally expensive process, and thus accelerating it can influence the performance of the whole system. The latest Graphics Processing Unit (GPU) technology...
Implementing database operations on parallel platforms has gained a lot of momentum in the past decade. A number of studies have shown the potential of using GPUs to speed up database operations. In this paper, we present empirical evaluations of a state-of-the-art work published in SIGMOD'08 on GPU-based join processing. In particular, this work presents four major join algorithms and a number of join-related...
On modern GPU clusters, the role of the CPUs is often restricted to controlling the GPUs and handling MPI communication. The unused computing power of the CPUs, however, can be considerable for computations whose performance is bounded by memory traffic. This paper investigates the challenges of simultaneous usage of CPUs and GPUs for computation. Our emphasis is on deriving a heterogeneous CPU+GPU...