Various aspects of improving the efficiency of graphical tests used to explore the results of stochastic transformations are discussed in the article. A method to improve performance of stochastic data processing using hybrid computing technologies and comparative analysis of the original tests and the tests with the proposed improvements are presented. Potential areas for improvement of the graphical...
In this paper, we propose and evaluate CUDASankoff, a solution to the RNA structural alignment problem based on the Sankoff algorithm on Graphics Processing Units (GPUs). To our knowledge, this is the first time the Sankoff algorithm has been implemented on a GPU. In our solution, we show how to linearize the Sankoff 4-dimensional dynamic programming (4D DP) matrix and we propose a two-level wavefront approach...
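The idea of linearizing a 4D DP matrix can be illustrated with a minimal sketch. The abstract does not give the layout CUDASankoff actually uses, so the plain row-major mapping below, and the names `linearize`, `delinearize`, and `dims`, are assumptions made purely for illustration.

```python
# Hypothetical illustration: row-major mapping of a 4D index (i, j, k, l)
# into a flat 1D buffer. This is NOT taken from the CUDASankoff paper.

def linearize(i, j, k, l, dims):
    """Map a 4D index to a flat offset; dims = (n1, n2, n3, n4)."""
    _, n2, n3, n4 = dims
    return ((i * n2 + j) * n3 + k) * n4 + l


def delinearize(idx, dims):
    """Inverse of linearize: recover (i, j, k, l) from a flat offset."""
    _, n2, n3, n4 = dims
    idx, l = divmod(idx, n4)
    idx, k = divmod(idx, n3)
    i, j = divmod(idx, n2)
    return i, j, k, l


if __name__ == "__main__":
    dims = (2, 3, 4, 5)
    flat = linearize(1, 2, 3, 4, dims)
    print(flat, delinearize(flat, dims))
```

A flat buffer of this kind is a common way to store a multidimensional table in GPU global memory; a real implementation would likely also exploit index constraints to avoid allocating unused cells.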
The article considers one of the most demonstrative graphical methods for evaluating the quality of pseudorandom number generators. Existing approaches to improving this method are described. A method that increases the amount of useful information obtained through testing is presented. The gains in test performance achieved with hybrid computing technologies are considered.
The recent development and popularity of the Graphics Processing Unit (GPU) has attracted researchers to use it for error correction codes such as Reed-Solomon (RS). In this paper, we propose an efficient implementation of the RS decoder based on frequency-domain analysis. This decoder employs the Finite-field Fast Fourier Transform (FFFT) to convert the received code into the frequency domain; the...
GMRCube is a MapReduce-based data cube construction model that uses the GPU to reduce its compute time. The model is designed for optimum utilization of the combined GPU-CPU compute capabilities. The paper presents the dataflow of the model and its algorithm, along with a detailed explanation. The model was tested on multidimensional data ranging from 3 to 7 dimensions, and tuples...
Word alignment is a basic task in natural language processing and it usually serves as the starting point when building a modern statistical machine translation system. However, the state-of-the-art parallel algorithm for word alignment is still time-consuming. In this work, we explore a parallel implementation of a word alignment algorithm on the Graphics Processing Unit (GPU), which has been widely available...
Accelerator-based heterogeneous computing is of paramount importance to High Performance Computing. The increasing complexity of the cluster architectures requires more generic, high-level programming models. OpenACC is a directive-based parallel programming model, which provides performance on and portability across a wide variety of platforms, including GPU, multicore CPU, and many-core processors...
The General-Purpose Graphics Processing Unit (GPGPU) is widely used to achieve high performance or high throughput in parallel programming. This capability of GPGPUs is now well known and is mostly used for scientific computing, which requires more processing power than ordinary personal computers provide. Therefore, most programmers, researchers, and industry practitioners use this new concept for their work...
Simulation of an activated sludge model (ASM) that includes a detailed biokinetic reaction network often requires the solution of a large system of ordinary differential equations (ODEs) at each time frame, which leads to long computing times. In this study, an adaptive time step backward differentiation formula (BDF) is proposed to solve the ASM's system of ODEs, which mainly exhibits a high degree of stiffness...
The fuzzy hyper-line segment neural network (FHLSNN) is a hybrid system of fuzzy logic and neural networks and is used for pattern classification. It learns patterns in terms of n-dimensional hyper-line segments (HLSs). The modified fuzzy hyper-line segment neural network (MFHLSNN) is a modified version of FHLSNN that improves the quality of reasoning and recall time per pattern using modified fuzzy membership...
Recent enhancements in Boolean Satisfiability solving have made SAT solvers a core engine for many real-world applications, especially for Automatic Test Pattern Generation (ATPG) in digital circuits. The majority of solving time is spent on iteratively propagating variable assignments that are inferred by decisions, so unit propagation (UP) is the most significant part in the Satisfiability...
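Unit propagation in general can be sketched as follows. This is a minimal sequential textbook version, not the method of the paper summarized above; the function name `unit_propagate` and the DIMACS-style clause encoding (signed integers for literals) are assumptions chosen for illustration.

```python
# Hypothetical illustration of unit propagation on a CNF formula.
# Clauses are lists of nonzero ints: 3 means x3, -3 means NOT x3.

def unit_propagate(clauses, assignment):
    """Repeatedly assign the last free literal of every unit clause.

    Returns (extended_assignment, conflict), where conflict is True
    if some clause has all of its literals assigned false.
    """
    assignment = dict(assignment)  # work on a copy, leave the input intact
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var = abs(lit)
                if var in assignment:
                    if assignment[var] == (lit > 0):
                        satisfied = True
                        break
                else:
                    unassigned.append(lit)
            if satisfied:
                continue
            if not unassigned:
                return assignment, True  # every literal is false: conflict
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0  # forced (implied) assignment
                changed = True
    return assignment, False


if __name__ == "__main__":
    # (NOT x1 OR x2) AND (NOT x2 OR x3), with the decision x1 = True
    print(unit_propagate([[-1, 2], [-2, 3]], {1: True}))
```

Production solvers typically avoid rescanning every clause by using watched-literal schemes; the full rescan here is kept only for readability.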
This paper presents GPU parallelization for a computational fluid dynamics solver which works on a mesh consisting of polyhedral cells, where each cell has an arbitrary number of faces and each face has an arbitrary number of vertices. The parallelization is achieved using NVIDIA's Compute Unified Device Architecture (CUDA). The developed code specifically targets performance improvement on NVIDIA...
Double-gyre ocean circulation is a typical phenomenon in the northern mid-latitude ocean basins. Its low-frequency variability significantly influences both ocean and climate. To enhance its predictability, it is important to find the optimal initial perturbation that can trigger the double-gyre variation. The CNOP method is adopted to calculate the optimal initial perturbation and this method has...
Hardware accelerators have forced a change in high performance computing. Their use has enabled an increase in the performance of data centers. For this reason, developers have decided to port many applications from diverse scientific fields, such as biology or chemistry, to hardware accelerators like GPUs (Graphics Processing Units). Nevertheless, not all applications have been able to...
High-end graphics processing units (GPUs), such as NVIDIA Fermi/Tesla series cards, have been widely applied in high performance computing over the past decade. NVIDIA has released the Tegra K1 board, called Jetson TK1, which contains 4 ARM Cortex-A15 CPUs and 192 CUDA cores (Kepler GPU). It is an embedded board that offers low cost, low power consumption, and high applicability for several specific applications. In...
Although GPUs are being widely adopted in order to noticeably reduce the execution time of many applications, their use presents several side effects such as an increased acquisition cost of the cluster nodes or an increased overall energy consumption. To address these concerns, GPU virtualization frameworks could be used. These frameworks allow accelerated applications to transparently use GPUs located...
A new scalable parallel math library, dMath, is presented that demonstrates leading scaling when using intranode, internode, and hybrid-parallelism for deep learning (DL). dMath provides easy-to-use distributed primitives and a variety of domain-specific algorithms. These include matrix multiplication, convolutions, and others allowing for rapid development of scalable applications, including Deep...
Apache Spark is a distributed processing framework for large-scale data sets, where intermediate data sets are represented as RDDs (Resilient Distributed Datasets) and stored in memory distributed over machines. To accelerate its various computation-intensive operations, such as reduction and sort, we focus on GPU devices. We modified the Spark framework to invoke CUDA kernels when computation-intensive...
With the introduction of the new NVIDIA Pascal GPU architecture, the need to evaluate its real performance in HPC environments arises. In this paper we briefly present some preliminary results. Compared to its predecessors, the new architecture clearly shows a great improvement.
This paper investigates the acceleration of irregular and regular algorithms on the Integrated Graphics Processing Unit (Integrated GPU), known as the Accelerated Processing Unit (APU), which is fused on the same die as the CPU, and on the Discrete Graphics Processing Unit (GPU), while answering the question of how much potential the APU has for applications with irregular data structures such as trees, knowing that...