Human-machine interaction is one of the fastest-growing areas of research in information technology. To date, the majority of research in this field has been conducted using unimodal and multimodal systems with asynchronous data. As a result, improper synchronization has become a common problem, because of which the system complexity increases and the system response time...
The goal of this paper is to identify the gender of blog authors. Features such as POS tags, unigrams (words and punctuation), bigrams, and word classes are considered. To synthesize and rank features, we use mutual information, chi-square, and information gain methods. The dataset is a collection of 3227 blogs originally derived from a blog set; of these, 1679 were written by males and 1548 were written...
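The chi-square ranking mentioned in the abstract above can be illustrated with a minimal sketch. It assumes binary features (a feature is either present in a document or not) and two classes. The function names `chi_square` and `rank_features`, the feature names, and the toy counts are hypothetical and are not taken from the paper.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 feature/class contingency table.

    n11: class-A documents containing the feature
    n10: class-B documents containing the feature
    n01: class-A documents without the feature
    n00: class-B documents without the feature
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        # A degenerate table (an empty row or column) carries no information.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def rank_features(doc_counts, n_a, n_b):
    """Rank features by chi-square score, highest first.

    doc_counts maps feature -> (documents in class A containing it,
                                documents in class B containing it).
    n_a and n_b are the total numbers of documents in each class.
    """
    scores = {
        feat: chi_square(a, b, n_a - a, n_b - b)
        for feat, (a, b) in doc_counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy, made-up counts for 10 documents per class (illustration only).
toy_counts = {"term_a": (9, 9), "term_b": (8, 2), "term_c": (3, 6)}
ranking = rank_features(toy_counts, n_a=10, n_b=10)
```

A feature that appears equally often in both classes, such as `term_a` here, scores zero and falls to the bottom of the ranking, which is the behaviour a feature-selection step relies on.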
This paper presents a novel automatic method to determine the appropriate age for video content in a video database geared to children. When combined with classical features, the system improves the accuracy rate by more than 0.13 for the same type of classifier in determining the age category of content for children between three and six years old. The main novelty of the system is that it...
We developed a speaker verification system that is efficient for short utterances. The i-vector-based speaker representation has helped realize highly accurate speaker verification systems; however, it may not be robust against short utterances because the reliability of the statistics required for extracting i-vectors is low. On the other hand, multiple kernel learning based on conditional entropy...
A phoneme recognition system based on Discrete Wavelet Transforms (DWT) and Support Vector Machines (SVMs) is designed for multi-speaker continuous speech environments. Phonemes are divided into frames, and DWTs are applied to obtain fixed-dimensional feature vectors. For the multiclass SVM, the One-against-one method with the RBF kernel was implemented. To further improve the accuracies obtained,...
The proposed identification system for mixed anuran vocalizations is intended to be easy for the public to consult online. The raw mixed anuran vocalization samples are first filtered by noise removal, high-frequency compensation, and discrete wavelet transform techniques, in that order. An adaptive end-point detection segmentation algorithm is proposed to effectively separate the individual syllables from the...
In this article, a text-independent speaker verification problem is considered. After feature extraction, each conversation side is represented as a vector in a fixed-dimensional space. To reduce the influence of utterance length and channel properties, various vector normalization techniques have been selected from the literature, modified, and tested. Additionally,...
Four multiclass Support Vector Machine (SVM) methods were designed for the task of speaker-independent phoneme recognition. These are the All-at-once, One-against-all, One-against-one, and Directed Acyclic Graph SVM (DAGSVM) methods. The power percentages of 8 Discrete Wavelet Transform (DWT) frequency bands are used for feature extraction. All tests were carried out on the TIMIT database. Comparable recognition...
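One way to read the band power percentage feature mentioned above is sketched here. The abstract does not name the wavelet or the frame length, so this sketch assumes an orthonormal Haar wavelet and a 7-level decomposition, which yields 7 detail bands plus 1 approximation band. The names `haar_step` and `band_power_percentages` are hypothetical.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return approx, detail


def band_power_percentages(frame, levels=7):
    """Share of signal power in each DWT band, as percentages.

    Returns levels + 1 values: detail bands from the finest (highest
    frequency) to the coarsest, followed by the final approximation band.
    """
    if len(frame) == 0 or len(frame) % (2 ** levels) != 0:
        raise ValueError("frame length must be a positive multiple of 2**levels")
    bands = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    powers = [sum(c * c for c in band) for band in bands]
    total = sum(powers)
    if total == 0.0:
        # A silent frame has no power to distribute.
        return [0.0] * len(powers)
    return [100.0 * p / total for p in powers]


# Example: a 256-sample frame containing a single sinusoid.
frame = [math.sin(2.0 * math.pi * 5.0 * n / 256.0) for n in range(256)]
features = band_power_percentages(frame)
```

Because the Haar transform used here is orthonormal, the band powers add up to the power of the frame, so the eight percentages sum to 100 and give a fixed-length feature vector regardless of signal amplitude.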
Variable bit-rate coding introduced for effective utilization of limited communication bandwidth requires accurate classification of input signals. This paper investigates implementation of a support vector machine (SVM)-based speech/music classifier in the selectable mode vocoder (SMV) framework, which is a standard codec adopted by the Third-Generation Partnership Project 2 (3GPP2). A support vector...
This paper shows that pattern classification based on machine learning is a powerful tool to analyze human brain activity data obtained by magnetoencephalography (MEG). We propose a new weighting method using a multiple kernel learning (MKL) algorithm to localize the brain area contributing to the accurate vowel discrimination. Our MKL simultaneously estimates both the classification boundary and...
An important task in Music Information Retrieval is content-based similarity retrieval, in which, given a query music track, a set of tracks that are similar in terms of musical content is retrieved. A variety of audio features that attempt to model different aspects of the music have been proposed. In most cases the resulting audio feature vector used to represent each music track is high-dimensional...
A classification system that accurately categorizes caller behavior within Interactive Voice Response systems would assist in developing good automated self-service applications. This paper details the implementation of such a classification system for a pay beneficiary application. Adaptive Neuro-Fuzzy Inference System (ANFIS), feed-forward Artificial Neural Network (ANN) and Support Vector Machine...
In this article we propose a quantitative approach to a relatively new problem: categorizing text as pragmatically correct or pragmatically incorrect (forcing the notion, coherent/incoherent). The typical text categorization criteria comprise categorization by topic, by style (genre classification, authorship identification), by expressed opinion (opinion mining, sentiment classification), etc....
SVM is a novel type of statistical learning method that has been successfully used in speaker recognition. However, training an SVM on all training examples requires long computing time and large storage space. This paper proposes an improved sparse least-squares support vector machine (LS-SVM) for speaker identification. First, KPCA is exploited to reduce the dimension of input vectors and to denoise...
The goal of this paper is to present experimental results for the automatic recognition of dysfluencies in stuttered speech. Mel Frequency Cepstral Coefficients reduce the dimensionality of data and models of acoustic waves of human speech. The acoustic model contains the feature vectors of speech used for further processing with a Support Vector Machine. An SVM classifier with kernel functions efficiently...
We address the problem of computing the level-crossings of an analog signal from samples measured on a uniform grid. Such a problem is important, for example, in multilevel analog-to-digital (A/D) converters. The first operation in such sampling modalities is a comparator, which gives rise to a bilevel waveform. Since bilevel signals are not bandlimited, measuring the level-crossing times exactly...
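For orientation, the simplest baseline for this problem is to interpolate linearly between neighbouring samples. The sketch below shows that baseline only; it is not the method of the paper, and the function name `level_crossings` and the sampling-period parameter `dt` are hypothetical.

```python
def level_crossings(samples, level, dt=1.0):
    """Estimate the times at which a signal crosses `level`.

    The signal is assumed to be sampled uniformly with period `dt`,
    starting at time 0, and to vary linearly between adjacent samples.
    A sample that lies exactly on the level is reported as a crossing,
    even if the signal only touches the level there.
    """
    times = []
    for k in range(len(samples) - 1):
        a = samples[k] - level
        b = samples[k + 1] - level
        if a == 0.0:
            times.append(k * dt)
        elif a * b < 0.0:
            # Sign change: the crossing lies a fraction a / (a - b)
            # of the way from sample k to sample k + 1.
            times.append((k + a / (a - b)) * dt)
    if samples and samples[-1] == level:
        times.append((len(samples) - 1) * dt)
    return times


# Example: a coarse triangle wave crossing the level 1.0 twice.
crossings = level_crossings([0.0, 2.0, 0.0, -2.0], level=1.0)
```

The linear estimate is only as accurate as the straight-line assumption between samples, which is why exact recovery of crossing times from uniform samples is a problem in its own right.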
To enhance the recognition accuracy of isolated-word identification with small samples in lipreading, SVM is introduced as the classifier for the first time in this paper. Because SVM is based on structural risk minimization, it solves the problem of pattern recognition with small samples; it also avoids the unreasonable assumptions of traditional classifiers. To meet the requirement of fixed input feature...
Many improvements have been developed for echo hiding systems, but there has been no intensive study of the hiding capacity of echo hiding. Based on speech signals with various sampling rates and a single echo hiding scheme, this work explores the regular pattern of recovery accuracy and fragment length, and shows that the hiding capacity of a speech clip is 55 bit/s, and the capacity is not related to the...
We present a novel approach to automatic speaker age classification, which combines regression and classification to achieve competitive classification accuracy on telephone speech. Support vector machine regression is used to generate finer age estimates, which are combined with the posterior probabilities of well-trained discriminative gender classifiers to predict both the age and gender of a speaker...
We propose a Gaussian process based regression scheme that provides a direct estimation of the height of unknown speakers and is applicable to real-world autonomous surveillance applications. This scheme relies on utterance-level speech parameterization followed by regression modelling, which estimates the height of the speaker and the uncertainty interval of that estimation. Experiments on the TIMIT...