This study proposes a fast universal background support imposter data selection method, integrated within a support vector machine (SVM) based speaker verification system. Selecting an informative background dataset is crucial for constructing a discriminative decision hyperplane between the enrollment and imposter speakers. Previous studies generally derive the optimal number of...
With the rapid development of the Internet, e-commerce websites now routinely have to work with log datasets of up to a few terabytes in size. Removing messy data promptly and at low cost, while still extracting useful information, is a problem we have to face. The mining process involves several steps, from pre-processing the raw data to establishing the final models. In this paper we describe our method to...
Most previous approaches to automatic audio event (AE) annotation are based on supervised learning, which relies on the availability of a labeled corpus to train classification models. However, instance annotation is often difficult, expensive, and time-consuming. In this paper, we apply semi-supervised learning with the transductive support vector machine (TSVM) algorithm to automatic AE annotation...
Automatic Dialect Classification (ADC) has recently gained substantial interest in the field of speech processing. Dialects of a language are normally reflected in their phoneme space, word pronunciation/selection, and prosodic traits. These traits are clearly visible in natural speaker-to-speaker spontaneous conversations. However, dialect cues in prompted/read speech are often neglected...
This paper proposes a new method based on multi-model weighting for maximum likelihood estimation (MLE). To relax the assumptions of maximum likelihood training, the model is generated by weighting multiple models, each trained on a separate partition of the training data. The weights are assigned according to the principle that the higher ratio of inter-variance to intra-variance...
Speech recognition systems are usually trained on a very large number of transcribed utterances, and preparing the training data is extremely time-consuming and costly. To reduce the number of training examples that must be labeled, active learning is used in acoustic modeling for speech recognition. This learning scheme iteratively inspects the unlabeled samples, selects the most informative samples corresponding...