2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)

Items from 1 to 20 out of 137 results

book

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)

IEEE

chapter

Effects of preceding vocabulary context on the perception of Mandarin vowels

Xunan Huang, Caicai Zhang, Fei Chen, Jonathan Sieg, et al.

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

This study compares the perceptual performance of Mandarin basic vowels “e” (/ɤ/) and “u” (/u/) in different contexts (independent & contextual). Results indicate that perception of the target vowel is influenced by the adjacent vowel context in a contrastive manner in both identification and discrimination tests. Moreover, in a context of higher F1 and F2, listeners found it more difficult to...

chapter

The correlation between signal distance and consonant pronunciation in Mandarin words

Huijun Ding, Chenxi Xie, Lei Zeng, Yang Xu, et al.

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

In spoken Mandarin, some consonant and vowel pairs are hard to distinguish and to pronounce clearly, even for some native speakers. This study investigates the signal distance between paired consonants from a signal-processing point of view to reveal the correlation between signal distance and consonant pronunciation. Some popular speech quality objective measures are innovatively...

chapter

A study on perceptual training of Japanese CSL learners to discriminate Mandarin lexical tones

Feiya Li, Yanlu Xie, Xiaomin Yu, Jinsong Zhang

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

In the process of learning Chinese as a second language (CSL), native Japanese speakers have difficulty with tone perception. Among the four Chinese lexical tones, the tone pairs Tone 1-Tone 2 and Tone 1-Tone 4 are problematic for Japanese CSL beginners. To help them develop the ability to discriminate these tone pairs efficiently, we designed a hybrid perceptual training scheme which combined adaptive...

chapter

Learning auxiliary categorical information for speech synthesis based on deep and recurrent neural networks

Zhengqi Wen, Kehuang Li, Zhen Huang, Jianhua Tao, et al.

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

We proposed an auxiliary categorization framework for training speech synthesis systems using deep neural networks (DNNs) and recurrent neural networks (RNNs). The adopted artificial neural networks (ANNs) are regression models comprising a few hidden layers and an affine-transform layer for transforming the contextual features into a set of speech synthesis parameters. In order to incorporate categorization...

chapter

Gender and prosodic entrainment in Mandarin conversations

Zhihua Xia, Qiu Wu Ma

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 4

This study aims to find out how gender affects prosodic entrainment in Mandarin conversation. Based on analyses of the Tongji Games Corpus, it is found that in Mandarin conversations, mixed-gender groups entrain on the greatest number of features and males entrain on the fewest; a cross-linguistic comparison between Mandarin Chinese and English finds striking similarities over the number of prosodic...

chapter

Mandarin neutral tone by native speakers and Cantonese L2 learners

Lei Liu, Nan Huang, Wentao Gu

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

This study compared neutral tone (T0) of Mandarin produced by native speakers and by Cantonese L2 learners, using both acoustic analysis and perceptual experiment. The T0 syllables after four different tones in three word contexts (i.e., isolated, non-focused, and on-focus) were investigated. The perceptual experiment showed that T0 in the L2 group obtained a lower rate of acceptance than in the L1...

chapter

The influence of syllable structure and prosodic strengthening on consonant production in Shanghai Chinese

Bijun Ling, Jie Liang

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

The present study investigated the effects of syllable structure and prosodic strengthening on consonant production in Shanghai Chinese (SHC), which has a three-way contrast among aspirated, unaspirated and breathy (voiced) stops. The two factors had different mechanisms: the glottal coda shortened the VOT while focus lengthened the VOT of aspirated and breathy stops, but both increased the intensity. While the...

chapter

HMM-based cue parameters estimation for speech enhancement

Feng Deng, Chang-chun Bao, Mao-shen Jia

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 4

In this paper, a hidden Markov model (HMM)-based cue parameter estimation method for single-channel speech enhancement is proposed, in which the cue parameters of binaural cue coding (BCC) are successfully applied to a single-channel speech enhancement system. First, the clean speech and noise signals are considered as the left and right channels of a stereo signal, respectively; and the noisy speech...

chapter

Towards automatic assessment of aphasia speech using automatic speech recognition techniques

Ying Qin, Tan Lee, Anthony Pak Hin Kong, Sam Po Law

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 4

Aphasia is a type of acquired language impairment caused by brain injury. This paper presents an automatic speech recognition (ASR) based approach to objective assessment of aphasia patients. A dedicated ASR system is developed to facilitate acoustical and linguistic analysis of Cantonese aphasia speech. The acoustic models and the language models are trained with domain- and style-matched speech...

chapter

A sparse representation of the excitation source characteristics of nonnormal speech sounds

Vinay Kumar Mittal, B. Yegnanarayana

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

The impulse-sequence representation of the excitation source information in normal speech signals has been explored for speech coding. Such a representation, if it can be developed for paralinguistic and emotional speech sounds, would help in their acoustic analyses. This paper proposes a sparse representation of the excitation source characteristics of nonnormal speech sound signals, in terms of a time-domain...

chapter

Cluster-based senone selection for the efficient calculation of deep neural network acoustic models

Jun-Hua Liu, Zhen-Hua Ling, Si Wei, Guo-Ping Hu, et al.

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

In this paper, we propose a cluster-based senone selection method to speed up the computation of deep neural networks (DNNs) at the decoding time of speech recognition. In DNN-based acoustic models, the large number of senones at the output layer is one of the main causes of the high computational complexity of DNNs. Inspired by the mixture selection method designed for the Gaussian mixture...

chapter

The perception of the English alveolar-velar nasal coda contrast by monolingual versus bilingual Chinese speakers

Minghui Wu, Marjoleine Sloos, Jeroen van de Weijer

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

Relatively little research has addressed the role of L1 in the perception of English speech contrasts by Chinese learners of English as L3. The present study investigates the role of L1 in the perception of the English alveolar-velar nasal coda contrast (/n/ vs. /ŋ/) after the vowels /i ʌ æ/ by bilingual Changsha Chinese speakers, whose L1 is Changsha Chinese and L2 is Standard Mandarin. Changsha...

chapter

Prosodic strength intrinsic to lexical items: A corpus study on tone reduction in Tone4+Tone4 words in Mandarin Chinese

Wei Lai, Mark Liberman, Jiahong Yuan, Xiaoying Xu

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

This study explored word-level prosodic strength in Mandarin Chinese reflected by tone reduction on the second syllables in Tone4+Tone4 words, by examining the slope difference between the two consecutive tones as an indicator for tonal reduction. It was found that firstly, the occurrence of tonal reduction is dependent on the internal structure of the word: words formed by apposition, (pseudo-)suffixation...

chapter

Robust front-end for speech recognition by human and machine in noisy reverberant environments: The effect of phase information

Yang Liu, Naushin Nower, Shota Morita, Masashi Unoki

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

This paper proposes a robust front-end for speech applications based on a restoration scheme for instantaneous amplitude and phase. Typical applications such as hearing aids and automatic speech recognition systems still face challenges with regard to robustness against noise and reverberation. The proposed front-end employed a combination of our previously proposed method for restoring instantaneous...

chapter

Unsatisfied customer call detection with deep learning

Pengyu Cong, Chaomin Wang, Zhijie Ren, Huixin Wang, et al.

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

In this paper, we describe our practical efforts to apply speech emotion recognition (SER) in customer care scenarios. We systematically analyze the challenges we observe in our data, which are very different from speech emotion databases recorded by actors. Our contributions are two-fold. First, we propose a 2-level framework to measure the customer's satisfaction score at the conversation level....

chapter

Investigating gated recurrent neural networks for acoustic modeling

Yuanyuan Zhao, Jie Li, Shuang Xu, Bo Xu

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

Recurrent neural networks (RNNs) with a gating mechanism, such as gated recurrent unit (GRU), long short-term memory (LSTM), and long short-term memory projected (LSTMP) networks, have been shown to give state-of-the-art performance in acoustic modeling. But little is known about why these gated RNNs work and what the differences are among these networks. Based on a series of experimental comparisons and analyses,...

chapter

Microphone array speech denoising modeled by tensor filtering

Jing Wang, Yahui Shan, Shequan Jiang, Xiang Xie

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

This paper proposes a novel speech denoising method based on tensor filtering, in which the microphone array speech signal is represented as tensor data and processed by a tensor filtering model. The multi-microphone signal is represented in a third-order tensor space over channel, time and frequency. Noise can be reduced by finding the lower-rank approximation of the third-order tensor with...

chapter

[Front matter]

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 95

Conference proceedings front matter may contain various advertisements, welcome messages, committee or program information, and other miscellaneous conference information. This may in some cases also include the cover art, table of contents, copyright statements, title-page or half title-pages, blank pages, venue maps or other general information relating to the conference that was part of the original...

chapter

Improving accented Mandarin speech recognition by using recurrent neural network based language model adaptation

Hao Ni, Jiangyan Yi, Zhengqi Wen, Bin Liu, et al.

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

In this paper, we propose to adapt the recurrent neural network (RNN) based language model to improve the performance of multi-accent Mandarin speech recognition. N-gram based language models have already been applied to speech recognition systems, but they struggle to describe long-span information in a sentence and suffer from serious data sparsity. Instead, RNN based language model can overcome...

Keywords

SPEECH (90)
TRAINING (56)
ACOUSTICS (43)
SPEECH RECOGNITION (34)
HIDDEN MARKOV MODELS (29)
FEATURE EXTRACTION (27)
CONTEXT (21)
NOISE MEASUREMENT (14)
PRODUCTION (14)
NEURAL NETWORKS (13)
ANALYSIS OF VARIANCE (12)
PRAGMATICS (12)
ADAPTATION MODELS (11)
STANDARDS (11)
COMPUTATIONAL MODELING (10)
LOGIC GATES (10)
DATA MODELS (9)
DATABASES (9)
DECODING (9)
DEEP NEURAL NETWORK (9)
INDEXES (9)
MATHEMATICAL MODEL (9)
TESTING (9)
ARTIFICIAL NEURAL NETWORKS (8)
SPEECH PROCESSING (8)
AUDITORY SYSTEM (7)
CORRELATION (7)
ENCODING (7)
LABELING (7)
NOISE REDUCTION (7)
PREDICTIVE MODELS (7)
SIGNAL TO NOISE RATIO (7)
SPEECH ENHANCEMENT (7)
SPEECH SYNTHESIS (7)
TONGUE (7)
LINEAR PROGRAMMING (6)
SHAPE (6)
SPEAKER RECOGNITION (6)
STRESS (6)
ACOUSTIC MEASUREMENTS (5)
CONTEXT MODELING (5)
DECISION SUPPORT SYSTEMS (5)
ERROR ANALYSIS (5)
ESTIMATION (5)
MANDARIN (5)
MICROPHONES (5)
RECURRENT NEURAL NETWORKS (5)
SEMANTICS (5)
AUTOMATIC SPEECH RECOGNITION (4)
LONG SHORT-TERM MEMORY (4)
MEASUREMENT (4)
MEL FREQUENCY CEPSTRAL COEFFICIENT (4)
RECURRENT NEURAL NETWORK (4)
SPECTROGRAM (4)
SPEECH PERCEPTION (4)
THREE-DIMENSIONAL DISPLAYS (4)
TRAINING DATA (4)
TRANSFORMS (4)
ALGORITHM DESIGN AND ANALYSIS (3)
BAYES METHODS (3)
CATEGORICAL PERCEPTION (3)
COMPUTER ARCHITECTURE (3)
CONNECTIONIST TEMPORAL CLASSIFICATION (3)
DEEP LEARNING (3)
DEEP NEURAL NETWORKS (3)
DICTIONARIES (3)
ELECTROENCEPHALOGRAPHY (3)
FOCUS (3)
I-VECTOR (3)
LANGUAGE MODEL (3)
LATTICES (3)
LIPS (3)
LOGISTICS (3)
MULTI-TASK LEARNING (3)
NIST (3)
REGISTERS (3)
ROBUSTNESS (3)
SILICON (3)
SOLID MODELING (3)
SPEAKER ADAPTATION (3)
SPEAKER VERIFICATION (3)
SPEECH CODING (3)
SPEECH PRODUCTION (3)
SYNTACTICS (3)
TRAJECTORY (3)
VISUALIZATION (3)
VOCABULARY (3)
VOWEL (3)
ACOUSTIC MODELING (2)
ADAPTATION (2)
ARTICULATORY FEATURES (2)
AUTOENCODER (2)
AUTOMOBILES (2)
AZIMUTH (2)
BIDIRECTIONAL CONTROL (2)
BLSTM-RNN (2)
CAMERAS (2)
CAVITY RESONATORS (2)
CHINESE READING TEXTS (2)
COCHLEAR IMPLANTS (2)