Effective articulatory modeling for pronunciation error detection of L2 learner without non-native training data

Richeng Duan; Tatsuya Kawahara; Masatake Dantsuji; Jinsong Zhang

doi:10.1109/ICASSP.2017.7953271

Source

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5815-5819

Abstract

To provide effective articulatory feedback in computer-assisted pronunciation training (CAPT) systems, we address articulatory modeling of second language (L2) learners' speech without using non-native training data, which is difficult to collect and annotate on a large scale. Context-dependent articulatory attributes (placement and manner of articulation) are modeled with a deep neural network (DNN). To train the non-native articulatory models efficiently, we exploit large speech corpora of the learners' native language and the target language to model inter-language phenomena. This multi-lingual learning is then combined with multi-task learning, which uses phone classification as a sub-task. These methods are applied to Mandarin Chinese pronunciation learning by native speakers of Japanese. Their effects are confirmed in attribute classification of native speech and in pronunciation error detection of non-native speech.
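
The combination of multi-lingual and multi-task learning described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `MultiTaskNet`, the single shared hidden layer, the layer sizes, the `phone_weight` value, the shared phone inventory, and the random toy features are all assumptions made for illustration. The sketch only computes the joint loss over pooled frames from two languages; it does not perform training.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def affine(x, weights, bias):
    """Apply y = Wx + b, where weights holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskNet:
    """Shared hidden layer with two output heads (illustrative only)."""

    def __init__(self, n_in, n_hidden, n_attr, n_phone, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(rng, n_hidden, n_in)          # shared by both tasks
        self.attr_head = init_layer(rng, n_attr, n_hidden)     # main task: attributes
        self.phone_head = init_layer(rng, n_phone, n_hidden)   # sub-task: phones

    def forward(self, x):
        h = [math.tanh(v) for v in affine(x, *self.shared)]
        return softmax(affine(h, *self.attr_head)), softmax(affine(h, *self.phone_head))


def multitask_loss(net, frames, attr_labels, phone_labels, phone_weight=0.3):
    """Mean cross-entropy of the attribute task plus a weighted phone-task term.

    phone_weight is an assumed hyperparameter, not a value from the paper.
    """
    total = 0.0
    for x, a, p in zip(frames, attr_labels, phone_labels):
        p_attr, p_phone = net.forward(x)
        total += -math.log(p_attr[a]) - phone_weight * math.log(p_phone[p])
    return total / len(frames)


# Multi-lingual pooling: frames from the learners' native language and from the
# target language are placed in one batch (random toy features, assumed 4-dim).
rng = random.Random(0)
native_frames = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]
target_frames = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]
frames = native_frames + target_frames
attr_labels = [0, 1, 2, 1, 0, 2]    # hypothetical attribute class indices
phone_labels = [0, 1, 2, 3, 4, 5]   # hypothetical indices into a shared phone set

net = MultiTaskNet(n_in=4, n_hidden=8, n_attr=3, n_phone=6)
loss = multitask_loss(net, frames, attr_labels, phone_labels)
print(f"joint loss on pooled batch: {loss:.4f}")
```

Setting `phone_weight=0.0` reduces the loss to the attribute task alone, which gives a simple way to compare single-task and multi-task configurations in this toy setting.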