This paper proposes an approach to accent adaptation that uses accent-dependent bottleneck (BN) features to improve the performance of a multi-accent Mandarin speech recognition system. The adaptation architecture uses two neural networks. First, a deep neural network (DNN) acoustic model acts as a feature extractor that produces accent-dependent BN (BN-DNN) features. The input...
This paper presents a novel speaker recognition framework that handles duration mismatch between registered and test utterances. The i-vectors extracted from short utterances exhibit high variance due to phoneme imbalance, which causes performance degradation in the duration mismatch condition. Most conventional methods attempt to decrease the variance by offsetting i-vectors or speaker similarity...
This paper presents a biometric personal-authentication method that exploits acoustic characteristics of human ears. It transmits a probe signal into the ear and receives its reflection, which contains personal identity information about the shape of the ear canal. Based on a study of effective and efficient acoustic feature representation and the use of audio equipment suitable for acquiring features...
Detecting pronunciation erroneous tendencies (PETs) can provide second-language learners with detailed, instructive feedback in computer-aided pronunciation training (CAPT) systems. Due to data sparseness, DNN-HMM achieved limited improvement over GMM-HMM in our previous work. Instead of directly employing DNN-HMM to detect PETs, this paper investigated how to further improve the performance...
A musical chord is usually described by its root note and the chord type. While a substantial amount of work has been done in the field of music information retrieval (MIR) to automate chord recognition, the role of root notes in this task has seldom received specific attention. In this paper, we present a new approach and empirical studies demonstrating improved accuracy in chord recognition by properly...
In this paper, we apply Locality Sensitive Discriminant Analysis (LSDA) to a speaker verification system for intersession variability compensation. As opposed to LDA, which fails to discover the local geometrical structure of the data manifold, LSDA finds a projection that maximizes the margin between i-vectors from different speakers in each local area. Since the number of samples varies in a wide...