2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

chapter

Boosting DNN-based speech enhancement via explicit transformations

Qing Wang, Jun Du, Li-Rong Dai

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

In this study, we investigate the learning behaviors of DNNs under explicit feature transformations. As a demonstration, linear and logarithmic transformations, corresponding to amplitude spectra and log-power spectra, are compared using the same minimum mean squared error (MMSE) objective function for optimizing the DNN parameters. Based on the experimental analysis of the DNN learning behaviors, we...

chapter

A noise masking method with adaptive thresholds based on CASA

Feng Bao, Waleed H. Abdulla

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 5

In this paper, we propose a novel noise masking method based on Computational Auditory Scene Analysis (CASA) that uses an adaptive factor. Although CASA has succeeded to some extent in speech separation and speech enhancement, the fixed thresholds used for segregation and labeling heavily affect processing performance. Focusing on this issue, the proposed method utilizes the Normalized...

chapter

Speech enhancement method with geometric phase estimation by incorporating MIXMAX model

Xianyun Wang, Changchun Bao

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

In this paper, we propose a frequency-domain speech enhancement algorithm with phase estimation, in which speech is modeled by a Gaussian mixture model (GMM) in the log-spectral domain and two closed-form log-spectral amplitude estimators for speech and noise are derived directly using a Mixture-Maximum (MIXMAX) model. Because accurate estimation of the speech phase could help to reduce...

chapter

Improved ETSI advanced front-end for ASR based on robust complex speech analysis

Keita Higa, Keiichi Funaki

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

Automatic speech recognition (ASR) is commonly used these days. Current ASR systems perform well in ideal environments; however, they do not perform well in realistic noisy environments. For robust ASR, ETSI has standardized the Advanced Front-End (AFE), which adopts a two-stage iterative Wiener filter (IWF) to perform speech enhancement as the front-end of ASR. In the ETSI AFE, FFT is used to estimate...

chapter

Audio-visual speech enhancement using deep neural networks

Jen-Cheng Hou, Syu-Siang Wang, Ying-Hui Lai, Jen-Chun Lin, et al.

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 6

This paper proposes a novel framework that integrates audio and visual information for speech enhancement. Most speech enhancement approaches consider audio features only to design filters or transfer functions to convert noisy speech signals to clean ones. Visual data, which provide useful complementary information to audio data, have been integrated with audio data in many speech-related approaches...

chapter

Enhancing a glossectomy patient's speech via GMM-based voice conversion

Kei Tanaka, Sunao Hara, Masanobu Abe, Shogo Minagi

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

In this paper, we describe the use of a voice conversion algorithm for improving the intelligibility of speech by patients with articulation disorders caused by a wide glossectomy and/or segmental mandibulectomy. As a first trial, to demonstrate the difficulty of the task at hand, we implemented a conventional Gaussian mixture model (GMM)-based algorithm using a frame-by-frame approach. We compared...

chapter

Optimal automatic speech recognition system selection for noisy environments

Yuuki Tachioka, Tomohiro Narita

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 8

To improve the performance of automatic speech recognition (ASR) in noisy conditions, it is effective to prepare multiple ASR systems that can address the large variety of noise. However, the optimal ASR system differs for each environment, and mismatches between training and testing degrade ASR performance. In this situation, an overall combination of the multiple systems is effective; however, the computational...
