This paper presents a multi-channel/multi-speaker 3D audio-visual corpus for Mandarin continuous speech recognition and other fields, such as speech visualization and speech synthesis. This corpus consists of 24 speakers with about 18k utterances, about 20 hours in total. For each utterance, the audio streams were recorded by two professional microphones in near-field and far-field respectively, while...
Mandarin speech involves a rich set of regional accents, so modeling the acoustic variability imposed by accents is a challenging task for Mandarin speech recognition. This work investigated using limited accented data to design a multi-accent decision tree, so as to improve the recognition accuracy of traditional GMM-HMM systems. Moreover, the deep neural networks with senone/monophone...
The present project involved the development of a novel interactive speech training system based on virtual reality articulation and examination of the efficacy of the system for hearing impaired (HI) children. Twenty meaningful Mandarin words were presented to the HI children via a 3-D talking head during articulation training. Electromagnetic Articulography (EMA) and graphic transform technology...
IELS (Interactive English Learning System) is a computer-assisted pronunciation training (CAPT) system for Chinese learners of English whose native language is Mandarin. The system provides instant feedback on mispronunciations at the phoneme, word, and lexical-stress levels, along with a score of the student's overall pronunciation quality. The system employs a client-server architecture, in which the...
Application of linguistic knowledge of language transfer to automatic speech recognition (ASR) technology can enhance mispronunciation detection performance in computer-aided pronunciation training (CAPT). This is achieved by pinpointing salient pronunciation errors made by second language learners. In this work, we propose to apply decision fusion for further improvement in mispronunciation detection...
This paper discusses a novel e-learning system based on automatic speech recognition (ASR). Transcriptions from ASR can be used for subtitle generation and synchronization, slide synchronization, outline navigation, and content-based video retrieval. The system can therefore handle multimedia e-learning resources automatically. As the core module in this system, the speech recognition system can reach...
This paper presents a mispronunciation detection system that uses automatic speech recognition to detect phone-level mispronunciations made by Cantonese learners of English. Our approach extends a target pronunciation lexicon with possible phonetic confusions that may lead to pronunciation errors to generate an extended pronunciation lexicon that contains both target pronunciations...
This paper presents a method using speech recognition with linguistic constraints to detect the mispronunciations made by Cantonese learners of English. The predicted pronunciation errors have been derived from cross-language phonological comparisons, which are used to generate the erroneous pronunciation variations in a lexicon. The acoustic models are trained with native speakers' speech and...
This work aims to derive salient mispronunciations made by Chinese (L1 being Cantonese) learners of English (L2 being American English) in order to support the design of pedagogical and remedial instructions. Our approach is grounded on the theory of language transfer and involves systematic phonological comparison between two languages to predict possible phonetic confusions that may lead to mispronunciations...
Emotion deficiency is a hot topic in current e-learning research. The author analyzed many negative effects of emotion deficiency and proposed a number of corresponding countermeasures. On this basis, affective computing was applied to the traditional e-learning system. A model of an e-learning system based on affective computing was constructed using speech emotion, which took speech features as input...