Convolutional Neural Networks (CNNs) are multi-layer deep structures that have been very successful in visual recognition tasks. These networks basically consist of convolution, pooling, and nonlinearity layers, each of which operates on the representation produced by the preceding layer and generates a new representation. Convolution layers naturally compute some inner product between a plane...
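The inner-product view of convolution mentioned in this abstract can be illustrated with a minimal NumPy sketch (purely illustrative; the function name and toy data are not from the cited work):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Valid 2-D convolution: each output pixel is the inner product
    of the flipped kernel with the image patch beneath it."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    flipped = kernel[::-1, ::-1]          # true convolution flips the kernel
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * flipped)   # inner product per pixel
    return out

edge = np.array([[1.0, -1.0]])            # simple horizontal edge detector
img = np.array([[0.0, 0.0, 1.0, 1.0]])
print(conv2d_valid(img, edge))            # responds at the 0-to-1 transition
```

A CNN layer applies many such kernels in parallel, which is why the operation maps so well onto GPU hardware.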
Blood simulation is an important part of virtual surgery training systems. However, its huge computational cost and the demand for realism pose a great challenge to such systems. In this paper, a GPU-accelerated simulation method is used for blood simulation in a surgical training system. The grid method is used to divide the target area and create a space grid...
The fuzzy hyperline segment neural network (FHLSNN) is a hybrid system of fuzzy logic and neural networks used for pattern classification. It learns patterns in terms of n-dimensional hyperline segments (HLS). The modified fuzzy hyperline segment neural network (MFHLSNN) is a version of FHLSNN that improves the quality of reasoning and the recall time per pattern using a modified fuzzy membership...
In modern networks, different applications generate various types of network traffic. To improve the performance of network management, it is important to identify and classify internet traffic. Machine learning (ML) techniques based on per-flow statistics have been widely used in traffic classification. Different from traditional classification methods,...
Parallel computing is the simultaneous use of multiple compute resources, for example processors, to solve complex computational problems. It has been used in high-end computing areas such as pattern recognition, medical diagnosis, national defense, and web search engines. This paper focuses on the implementation of a pattern classification technique, the Support Vector Machine (SVM), using a vector processor...
Classification is a machine learning task that tries to assign the best class to a given unknown input vector based on past observations (training data). Most developed algorithms are very time-consuming for large datasets (Support Vector Machines, Deep Neural Networks, etc.). The Extreme Learning Machine (ELM) is a high-quality classification algorithm that has gained much popularity in recent...
General Purpose computing on Graphics Processor Units (GPGPU) brings massively parallel computing (hundreds of compute cores) to the desktop at a reasonable cost, but requires that algorithms be carefully designed to take advantage of this power. The present work explores the possibilities of CUDA (NVIDIA Compute Unified Device Architecture) using GPGPU for Quadratic Discriminant (QD) analysis. QD...
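For reference, the quadratic discriminant score that such an implementation parallelizes is the standard per-class function delta_k(x); a small CPU-side NumPy sketch (illustrative only, with hypothetical toy clusters):

```python
import numpy as np

def qd_fit(X, y):
    """Per-class mean, covariance, and prior for quadratic discriminant analysis."""
    params = {}
    for k in np.unique(y):
        Xk = X[y == k]
        params[k] = (Xk.mean(axis=0), np.cov(Xk, rowvar=False), len(Xk) / len(X))
    return params

def qd_score(x, mu, cov, prior):
    """delta_k(x) = -0.5*ln|Sigma_k| - 0.5*(x-mu_k)^T Sigma_k^{-1} (x-mu_k) + ln(pi_k)"""
    d = x - mu
    return (-0.5 * np.log(np.linalg.det(cov))
            - 0.5 * d @ np.linalg.inv(cov) @ d
            + np.log(prior))

def qd_predict(x, params):
    return max(params, key=lambda k: qd_score(x, *params[k]))

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
params = qd_fit(X, y)
print(qd_predict(np.array([3.0, 3.0]), params))  # near the class-1 mean
```

The score is independent per sample and per class, so a CUDA version can evaluate one thread per (sample, class) pair.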
A parallel Back-Propagation (BP) neural network training technique using the Compute Unified Device Architecture (CUDA) on multiple Graphics Processing Units (GPUs) is proposed. To exploit the maximum performance of GPUs, we propose to implement batch-mode BP training by building the input neurons, hidden neurons, and output neurons into matrix form. The implementation includes CUDA Basic Linear Algebra Subroutines...
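The matrix formulation this abstract refers to can be sketched in NumPy: with all samples of a batch stacked into rows, every forward and backward step becomes a matrix product, which maps directly onto GEMM calls in a BLAS library on the GPU. This is a minimal one-hidden-layer sketch with an assumed toy regression target, not the paper's implementation:

```python
import numpy as np

def bp_batch_step(X, T, W1, W2, lr):
    """One batch-mode back-propagation step with the neuron layers in
    matrix form; each line is a matrix product or elementwise operation."""
    H = np.tanh(X @ W1)              # hidden activations, one row per sample
    Y = H @ W2                       # linear output layer
    dY = (Y - T) / len(X)            # averaged output error
    dW2 = H.T @ dY                   # gradient for output weights
    dH = (dY @ W2.T) * (1 - H ** 2)  # back-propagate through tanh
    dW1 = X.T @ dH                   # gradient for hidden weights
    return W1 - lr * dW1, W2 - lr * dW2

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 3))
T = X[:, :1] ** 2                    # toy regression target
W1 = rng.standard_normal((3, 16)) * 0.1
W2 = rng.standard_normal((16, 1)) * 0.1
loss0 = np.mean((np.tanh(X @ W1) @ W2 - T) ** 2)
for _ in range(500):
    W1, W2 = bp_batch_step(X, T, W1, W2, 0.1)
loss1 = np.mean((np.tanh(X @ W1) @ W2 - T) ** 2)
```

Because the per-sample loops disappear into the matrix dimensions, a CUBLAS port replaces each `@` with a GEMM call.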
The support vector machine (SVM) is a popular classifier for small-scale datasets. It has outstanding performance compared to other classifiers. However, the execution time is extremely long when training on Big Data. The Graphics Processing Unit (GPU) is a massively parallel device which performs very well as a co-processor. NVIDIA proposed a programming platform, CUDA, in 2006, which makes it much...
Eigenface is one of the most common appearance-based approaches for face recognition. Eigenfaces are the principal components which represent the training faces. Using Principal Component Analysis, each face is represented by very few parameters called weight vectors or feature vectors. While this makes the testing process easy, it also involves the cumbersome process of generating the eigenspace and projecting...
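The eigenspace generation and projection steps mentioned above amount to a PCA on flattened images; a minimal NumPy sketch with random stand-in "images" (illustrative, not the cited system):

```python
import numpy as np

def eigenfaces(faces, k):
    """PCA on flattened face images: the top-k principal components of the
    training set are the eigenfaces; each face is described by k weights."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # SVD of the centered data gives the principal components directly
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    basis = Vt[:k]                     # k eigenfaces, one per row
    weights = centered @ basis.T       # weight (feature) vector per face
    return mean, basis, weights

def project(face, mean, basis):
    return (face - mean) @ basis.T     # few parameters represent the face

rng = np.random.default_rng(0)
faces = rng.random((10, 64))           # 10 toy "images", 64 pixels each
mean, basis, weights = eigenfaces(faces, k=4)
print(weights.shape)                   # each face reduced to 4 weights
```

The costly parts, the SVD and the batched projections, are dense linear algebra, which is what motivates moving them to the GPU.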
This paper proposes a real-time face recognition system based on the Compute Unified Device Architecture (CUDA) platform, which effectively completes the face detection and recognition tasks. In the face detection phase, with the Viola-Jones cascade classifier, we implemented and improved novel parallel methodologies for integral image calculation, scan window processing, and the amplification and correction...
This paper introduces a parallel computing method to improve the efficiency of prediction of membrane protein types by SVM. Early hardware limitations of the GPU (lack of synchronization primitives and limited memory caching mechanisms) can make GPU-based computation inefficient. We present this efficient method for prediction of membrane protein types on an Intel(R) Core(TM) i3-3110M quad-core and...
Parallel implementation of neural networks is among the major areas of research in computer science. The Self-Organizing Map (SOM) is a neural network that has been under the spotlight throughout the last decade for implementation on parallel architectures. A SOM trains itself through unsupervised learning by retrieving inherent topological features of the applied input data. In this paper, the design and implementation of...
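The unsupervised, topology-preserving training the abstract describes follows a simple loop: find the best-matching unit for a sample, then pull it and its grid neighbours toward that sample with a shrinking radius and learning rate. A minimal NumPy sketch with assumed schedule constants (illustrative only):

```python
import numpy as np

def som_train(data, grid_w, grid_h, iters, rng):
    """Minimal SOM: find the best-matching unit (BMU), then pull it and
    its grid neighbours toward the sample with decaying radius and rate."""
    weights = rng.random((grid_h, grid_w, data.shape[1]))
    coords = np.indices((grid_h, grid_w)).transpose(1, 2, 0)  # node positions
    for t in range(iters):
        x = data[rng.integers(len(data))]
        d = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(d.argmin(), d.shape)           # best-matching unit
        frac = t / iters
        sigma = max(grid_w, grid_h) / 2 * (1 - frac) + 0.5    # neighbourhood radius
        lr = 0.5 * (1 - frac) + 0.01                          # learning rate
        g = np.linalg.norm(coords - np.array(bmu), axis=2)    # grid distance to BMU
        h = np.exp(-(g ** 2) / (2 * sigma ** 2))              # neighbourhood kernel
        weights += lr * h[..., None] * (x - weights)
    return weights

rng = np.random.default_rng(0)
data = rng.random((500, 3))            # e.g. RGB colour vectors
weights = som_train(data, 8, 8, 2000, rng)
```

The distance computation and neighbourhood update touch every node independently, which is the part a parallel implementation distributes across threads.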
Facial recognition techniques are of interest for tracking and identification in densely populated areas where security is an important concern. Traditional recognition techniques have yielded acceptable results with high repeatability but require special conditions such as a voluntary and stationary subject, close proximity, and appropriate lighting. Because no single algorithm yields robust results...
Liquid chromatography-based tandem mass spectrometry (LC-MS) technique allows for identification and quantification of thousands of proteins in parallel. This technique coupled with a feed-forward artificial neural network provides a technique to analyze and select protein panels for use in multi-biomarker panel discovery applications. In this study, we enhance this technique by utilizing massively...
The training procedure of Hidden Markov Model (HMM) based Speech Recognition is often very time-consuming because of its high computational complexity. New parallel hardware such as the GPU provides multi-thread processing and very high floating-point capability. We take advantage of the GPU to accelerate a popular HMM-based Speech Recognition package, HTK. Based on the sequential code of HTK, we design...
Training Artificial Neural Networks (ANNs) is a time-consuming process in machine learning systems. In this work we provide an implementation of the back-propagation algorithm on CUDA, a parallel computing architecture developed by NVIDIA. Using CUBLAS, a CUDA implementation of the Basic Linear Algebra Subprograms (BLAS) library, the process is simplified; however, the use of kernels was...
The accuracy of Conditional Random Fields (CRFs) is achieved at the cost of a huge amount of computation to train the model. In this paper we designed a parallelized algorithm for Gradient Ascent-based CRF training methods for biological sequence alignment. Our contribution is mainly on two aspects: 1) we flexibly parallelized the different iterative computation patterns, and the corresponding optimization...
An algorithm for evolving recurrent neural networks via a genetic algorithm was implemented on CUDA, resulting in a system called CuParcone (CUDA-based Partially Connected Neural Evolutionary). Run on an Nvidia Tesla “GPU supercomputer,” CuParcone achieves a performance increase of 323 times in face gender recognition compared to the comparable Parcone algorithm on a state-of-the-art, commodity...
In this paper we describe the implementation of a complete ANN training procedure for speech recognition using the block mode back-propagation learning algorithm. We exploit the high performance SIMD architecture of GPU using CUDA and its C-like language interface. We also compare the speed-up obtained implementing the training procedure only taking advantage of the multi-thread capabilities of multi-core...