In speech recognition systems, an improved multi-base neural network speech recognition model has been proposed to address the long learning time and slow convergence rate of deep neural networks. However, the improved model introduces a large number of parameters during training, which makes the model over-fit on the test set, resulting in the deterioration of generalization ability and the...
In this paper we apply particle swarm optimization (PSO) feature selection to enhance Hidden Markov Model (HMM) states and parameters for face recognition systems. Feature selection for face images is based on the collaborative behavior of bird flocking and reduces the feature size and hence the recognition time complexity. The framework has been evaluated on 400 face pictures of the Olivetti...
This paper presents a review of a few notable speech recognition models reported in the last decade. First, the models are categorized into sparse models, learning models and domain-specific models. Subsequently, the characteristics of the models are examined in terms of speech constraints, algorithmic constraints and performance constraints. The performance of these models reported in...
This paper introduces a fully automated Arabic phone recognition system based on the Enhanced Wavelet Packets Best Tree Encoding (EWPBTE) 15-point speech feature. WPBTE is enhanced by adding an energy component; the enhancement is implemented in Matlab and improves recognizer accuracy by 65%, which is the main contribution of this paper. EWPBTE...
The paper describes an experimental study on emotion recognition using a collection of emotional recordings from the SRoL corpus. Its goal is to study and obtain a simple tool that can be used to validate recordings in the process of building large voice corpora. Such a tool can assist or even replace human validation. In this study we used two classifiers, k-NN (k-Nearest Neighbors) and SVM...
Spatial information describes the relative spatial position of an object in a video. Such information may aid several video analysis tasks such as object, scene, event and activity recognition. This paper studies the effect of spatial information on video activity recognition. It first performs activity recognition on KTH and Weizmann videos using Hidden Markov Model and k-Nearest Neighbour...
The demand for non-intrusive human identification has risen steadily in recent years. Several works have already addressed this problem using gait-cycle detection from human skeleton data, with Microsoft Kinect as the data capture sensor. In this paper we propose a novel method for automatic human identification in real time using the fusion of both supervised and unsupervised...
Traffic identification techniques are used to classify different network protocols and applications, and even to detect users' network activities. In this paper, we study some typical user network activities and present a traffic identification method that describes the features of users' behaviors. We convert users' network activity information into different sequences...
Cry segmentation is an essential preprocessing step in any infant crying diagnosis system. Besides crying sounds consisting of expiration phases followed by short inspiration episodes, each recording of newborn cries also includes silence sections as well as other sounds such as the speech of caregivers, noise, and the sound of medical equipment. This paper is devoted to a newly developed Empirical...
Automatic language identification is a natural language processing task that aims to determine the natural language of a given piece of content. In this paper we present a statistical method for automatic language identification of written text using dictionaries containing stop words and diacritics. We propose different approaches that combine the two dictionaries to accurately determine the language...
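As a rough illustration of the dictionary idea described in the abstract above, the following minimal Python sketch scores a text against per-language stop-word and diacritic sets. The tiny dictionaries, the function names `score_languages` and `identify_language`, and the rule of simply adding the two hit counts are all assumptions made for this example; the abstract does not specify how the paper combines the two dictionaries.

```python
import re

# Hypothetical, deliberately tiny dictionaries (illustration only).
STOP_WORDS = {
    "en": {"the", "and", "of", "is", "in", "to"},
    "fr": {"le", "la", "et", "est", "dans", "les"},
    "de": {"der", "und", "ist", "das", "nicht", "die"},
}
DIACRITICS = {
    "en": set(),
    "fr": set("éèêàçùô"),
    "de": set("äöüß"),
}


def score_languages(text):
    """Return a score per language: stop-word hits plus diacritic hits."""
    lowered = text.lower()
    tokens = re.findall(r"\w+", lowered)
    scores = {}
    for lang in STOP_WORDS:
        stop_hits = sum(1 for tok in tokens if tok in STOP_WORDS[lang])
        diacritic_hits = sum(1 for ch in lowered if ch in DIACRITICS[lang])
        # Assumed combination rule: an unweighted sum of both counts.
        scores[lang] = stop_hits + diacritic_hits
    return scores


def identify_language(text):
    """Pick the best-scoring language, or None if nothing matched."""
    scores = score_languages(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

For example, `identify_language("Le chat est dans la maison")` selects `"fr"` because four tokens appear in the French stop-word set and none in the others. Real dictionaries would be far larger, and a weighted combination may work better than a plain sum.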
In this paper we propose an improvement of a human action recognition method that uses a string-based representation and a string edit distance to compare the observed action with reference actions in the training set. In particular, the original improvement is based on a specific formulation of the string edit distance that is more suited to take into account the problems related to noise and to...
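To make the string-based comparison in the abstract above concrete, here is a minimal Python sketch of the classic Levenshtein edit distance together with a nearest-reference classifier. This is the textbook formulation with unit costs, not the noise-aware formulation the paper proposes (which the truncated abstract does not describe); the function names and the symbol strings in the usage note are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between sequences a and b (unit costs)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, sym_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if sym_a == sym_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def classify(observed, references):
    """Return the label of the reference string closest to `observed`.

    `references` is a list of (string, label) pairs.
    """
    closest = min(references, key=lambda pair: edit_distance(observed, pair[0]))
    return closest[1]
```

With hypothetical reference strings, `classify("aabbc", [("aabbcc", "wave"), ("xyzxyz", "jump")])` returns `"wave"`, since only one insertion separates the observed string from the first reference.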
Anomaly detection systems rely on machine learning techniques to model the normal behavior of the system. This model is used during operation to detect anomalies due to attacks or design faults. Ensemble methods have been used to improve the overall detection accuracy by combining the outputs of several accurate and diverse models. Existing Boolean combination techniques either require an exponential...
Virtualized cloud systems are prone to performance anomalies for various reasons, such as resource contention, software bugs, and hardware failures. It is a daunting task for system administrators to manually keep track of the execution status of a large number of virtual machines all the time. Anomaly prediction is an effective approach to enhancing availability and reliability of cloud infrastructures...
Human-computer interaction would be much smoother with rapid recognition, the aim of which is to recognize a hand gesture before it is completed. In this paper, a rapid recognition method for dynamic hand gestures using Leap Motion is proposed. The database contains the three-dimensional motion trajectories of the numbers and the alphabet (36 gestures in total), which are captured by...
This paper introduces a novel approach to predict human motion for the Non-binding Lower Extremity Exoskeleton (NBLEX). Most exoskeletons must be attached to the pilot, which introduces potential security problems. To solve these problems, the NBLEX is studied and designed to free pilots from the exoskeleton. Rather than applying Electromyography (EMG) and Ground Reaction Force (GRF)...
Many gestures have similar shapes, which causes errors in hand gesture recognition. In this paper, a new method that can distinguish similar gestures is proposed. The motion trajectory is captured by a Leap Motion sensor in three-dimensional space, and the orientation characteristics are quantified and coded as the feature. Then the Hidden Markov Model (HMM) algorithm...
Part-of-speech (POS) tagging is an important research topic in Natural Language Processing. Since it is one of the first steps of any natural language processing (NLP) technique such as machine translation, any error made during tagging propagates through the whole NLP process. So far, work has been done on POS tagging based on SVM, MBLP, HMM, and N-gram. All of these methods were not...
In this paper, we propose a new system for isolated sign language recognition (SLR) and continuous SLR. In isolated SLR, Histogram of Oriented Displacement is used to describe the trajectories, and multi-SVM is adopted for classification. In continuous SLR, we propose a Dynamic Programming method with warping templates obtained by the Dynamic Time Warping (DTW) algorithm. We evaluate our approach with...
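For readers unfamiliar with DTW, which the abstract above uses to obtain warping templates, the following minimal Python sketch computes the standard DTW distance between two one-dimensional sequences. It assumes scalar samples and an absolute-difference local cost; the paper works on sign trajectories, and its template construction and Dynamic Programming method are not reproduced here. The function name `dtw_distance` is hypothetical.

```python
import math


def dtw_distance(x, y):
    """Standard DTW distance between two 1-D sequences of numbers."""
    n, m = len(x), len(y)
    # cost[i][j] = minimal accumulated cost of aligning x[:i] with y[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # x advances, y repeats
                cost[i][j - 1],      # y advances, x repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because warping lets one sample align with several, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero even though the sequences differ in length, which is the property that makes DTW useful for gestures performed at different speeds.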
Sub-word units like morphemes are selected as the lexicon for highly inflectional languages, as they can provide better coverage and a smaller vocabulary size. However, short units shrink the context of statistical models, are prone to morpho-phonetic changes, and do not always outperform the word-based model. When sequences of units are merged or split, unit boundaries are phonetically harmonized in the...
In low-resource Automatic Speech Recognition (ASR), one usually resorts to the Statistical Machine Translation (SMT) technique to learn transform rules that refine the grapheme lexicon. Doing so poses two challenges. One is to generate grapheme sequences from the training data as the targets, which are paired with the original transcripts to train SMT models; the other is to effectively prune the learned...