Search results for: X. Zou

Items from 1 to 6 out of 6 results

chapter

A deep convolutional encoder-decoder model for robust speech dereverberation

D. S. Wang, Y. X. Zou, W. Shi

2017 22nd International Conference on Digital Signal Processing (DSP) > 1 - 5

Research shows that speech dereverberation (SD) with Deep Neural Network (DNN) achieves the state-of-the-art results by learning spectral mapping, which, simultaneously, lacks the characterization of the local temporal spectral structures (LTSS) of speech signal and calls for a large storage space that is impractical in real applications. Contrarily, the Convolutional Neural Network (CNN) offers a...

chapter

An experimental study of speech emotion recognition based on deep convolutional neural networks

W. Q. Zheng, J. S. Yu, Y. X. Zou

2015 International Conference on Affective Computing and Intelligent Interaction (ACII) > 827 - 831

Speech emotion recognition (SER) is a challenging task since it is unclear what kind of features are able to reflect the characteristics of human emotion from speech. However, traditional feature extractions perform inconsistently for different emotion recognition tasks. Obviously, different spectrograms provide information reflecting different emotions. This paper proposes a systematical approach...

chapter

Forming ad-hoc microphone arrays through clustering of acoustic room impulse responses

S. Pasha, Y. X. Zou, C. Ritz

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 84 - 88

This paper investigates the formation of ad-hoc microphone arrays for the purpose of recording multiple sound sources by clustering microphones spatially distributed within a room. A novel codebook-based unsupervised method for cluster formation using features derived from the Room Impulse Responses (RIRs) corresponding to each microphone is proposed and compared with baseline clustering and classification...

chapter

Nonnegative matrix factorization based noise robust speaker verification

S. H. Liu, Y. X. Zou, H. K. Ning

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 35 - 39

The performance of speaker verification system (SVS) declines dramatically in noisy environments. To suppress the adverse impact of the noise on SVS, this paper investigates employing the nonnegative matrix factorization (NMF) technique to reconstruct the speech based on the pre-trained speech basis matrix (SBM) and noise basis matrix (NBM). The contribution of this research lies in utilizing the...

chapter

Multi-pronounciation dictionary construction for Mandarin-English bilingual phrase speech recognition system

C. Wang, W. Shi, Y. X. Zou

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 15 - 19

Generally, in multi-lingual communities, non-native speakers may produce speech sound which is either part of their own native language or established via merging characteristics of native pronunciation with non-native pronunciation. Recently, a Two-pass phone clustering based on Confusion Matrix (TCM) approach has been proposed to address the one-to-one phone mappings between Chinese syllables and...

chapter

Spectral mask estimation using deep neural networks for inter-sensor data ratio model based robust DOA estimation

W. Q. Zheng, Y. X. Zou, C. Ritz

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 325 - 329

Accurate DOA estimation based on clustering the inter-sensor data ratios (ISDRs) of a single acoustic vector sensor (AVS), referred as AVS-ISDR, relies on reliable extraction of time-frequency points with high local signal-to-noise ratio (HLSNR-TFPs) and its performance degrades in noisy environments. This paper investigates deep neural networks (DNNs) trained with noisy-clean speech pairs under different...

INFONA - science communication portal
