Recent progress in acoustic modeling with deep neural networks has significantly improved the performance of automatic speech recognition systems. However, it remains an open problem how to rapidly adapt these networks with limited unsupervised data. Most existing methods for adapting a neural network modify a large number of parameters, so rapid adaptation is not possible with these schemes...
In this paper we investigate the use of noise-robust features characterizing the speech excitation signal as complementary features to the usually considered vocal tract based features for Automatic Speech Recognition (ASR). The proposed Excitation-based Features (EBF) are tested in a state-of-the-art Deep Neural Network (DNN) based hybrid acoustic model for speech recognition. The suggested excitation...
Automatically generating expressive speech from plain text is an important research topic in speech synthesis. Given the same text, different speakers may interpret it and read it in very different ways. This implies that expression prediction from text is a speaker dependent task. Previous work presented an integrated method for expression prediction and speech synthesis which can be used to model...
This paper describes a novel approach for the speaker adaptation of statistical parametric speech synthesis systems based on the interpolation of a set of average voice models (AVM). Recent results have shown that the quality/naturalness of adapted voices depends on the distance from the average voice model used for speaker adaptation. This suggests the use of several AVMs trained on carefully chosen...
Statistical parametric synthesizers have typically relied on a simplified model of speech production. In this model, speech is generated using a minimum-phase filter, implemented from coefficients derived from spectral parameters, driven by a zero or random phase excitation signal. This excitation signal is usually constructed from fundamental frequencies and parameters used to control the balance...
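The source-filter model this abstract refers to can be sketched minimally: an impulse-train excitation at the fundamental frequency is passed through an all-pole (minimum-phase) filter. The F0 value, filter coefficients, and sample rate below are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.signal import lfilter

def synthesize(f0_hz, lpc_coeffs, sr=16000, duration_s=0.5):
    """Sketch of source-filter synthesis: a voiced pulse-train
    excitation driven through an all-pole (minimum-phase) filter."""
    n = int(sr * duration_s)
    excitation = np.zeros(n)
    period = int(sr / f0_hz)       # samples per pitch period
    excitation[::period] = 1.0     # impulse train at F0
    # All-pole filter: y[t] = x[t] - sum_k a_k * y[t-k]
    speech = lfilter([1.0], np.concatenate(([1.0], lpc_coeffs)), excitation)
    return speech

# Single hypothetical pole for illustration; a real system derives
# the coefficients from predicted spectral parameters per frame.
speech = synthesize(120.0, np.array([-0.9]))
```

A real parametric synthesizer would update the filter frame by frame and mix in an aperiodic (noise) component; this sketch only shows the minimum-phase filtering of a periodic excitation.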
Combining multiple intonation models at different linguistic levels is an effective way to improve the naturalness of the predicted F0. In many of these approaches, the intonation models for suprasegmental levels are based on a parametrization of the log-F0 contours over the units of that level. However, many of these parametrizations are not stable when applied to discontinuous signals. Therefore,...
Getting a text-to-speech synthesis (TTS) system to speak lively animated stories like a human is very difficult. To generate expressive speech, the system can be divided into two parts: predicting expressive information from text, and synthesizing the speech with a particular expression. Traditionally these blocks have been studied separately. This paper proposes an integrated approach, sharing the...
Most HMM-based TTS systems use a hard voiced/unvoiced classification to produce a discontinuous F0 signal which is used for the generation of the source-excitation. When a mixed source excitation is used, this decision can be based on two different sources of information: the state-specific MSD-prior of the F0 models, and/or the frame-specific features generated by the aperiodicity model. This paper...
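The hard voiced/unvoiced classification described above can be illustrated with a thresholded voicing decision: frames judged unvoiced get F0 set to zero, which is what makes the resulting F0 signal discontinuous. The probabilities, F0 values, and threshold here are invented for illustration.

```python
import numpy as np

# Hypothetical frame-level data: voicing probabilities and
# continuous F0 estimates (Hz) over eight frames.
voicing_prob = np.array([0.9, 0.8, 0.6, 0.3, 0.1, 0.2, 0.7, 0.95])
f0_cont = np.array([120, 118, 115, 110, 108, 109, 112, 119], dtype=float)

# Hard voiced/unvoiced decision: frames below the threshold are
# zeroed, producing the discontinuous F0 signal used to build
# the source excitation.
threshold = 0.5
f0_discont = np.where(voicing_prob >= threshold, f0_cont, 0.0)
```

In the mixed-excitation setting the paper discusses, this binary decision could instead be informed by the MSD prior of the F0 models or by the frame-level aperiodicity features, rather than a single fixed threshold.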