Items from 1 to 8 out of 8 results

chapter

Automatic optimization of data perturbation distributions for multi-style training in speech recognition

Mortaza Doulaty, Richard Rose, Olivier Siohan

2016 IEEE Spoken Language Technology Workshop (SLT) > 21 - 27

2016 IEEE Spoken Language Technology Workshop (SLT)

Speech recognition performance using deep neural network based acoustic models is known to degrade when the acoustic environment and the speaker population in the target utterances are significantly different from the conditions represented in the training data. To address these mismatched scenarios, multi-style training (MTR) has been used to perturb utterances in an existing uncorrupted and potentially...

chapter

Improving learning efficiency in multi-objective simulated annealing programming for sound environment classification

A. Cocana-Fernandez, L. Sanchez, J. Ranilla, R. Gil-Pita, et al.

2016 IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM) > 1 - 5

2016 IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM)

In this work, a classifier that jointly optimises the expected total classification cost and the energy consumption is presented. A numerical study is provided, where different alternatives are implemented on a hearing aid. Our proposal is capable of automatically classifying the acoustic environment that surrounds the user and choosing the parameters of the amplification that are best adapted to...

chapter

Initialization in speaker model training based on expectation maximization

Yihong Wang

2013 6th International Congress on Image and Signal Processing (CISP) > 3 > 1309 - 1313

2013 6th International Congress on Image and Signal Processing (CISP)

The optimized speaker model is trained by an iterative algorithm based on expectation maximization (EM) that runs for many iterations. In this process, the choice of the speaker model's initial value has a great influence on the final recognition result. At present, the most common algorithms used to choose the initial value are the K-means algorithm and the LBG algorithm, but the two algorithms belong to a sort of local...

chapter

A method of Chinese organization named entities recognition based on statistical word frequency, part of speech and length

Xiying Yao

2011 4th IEEE International Conference on Broadband Network and Multimedia Technology > 637 - 641

2011 4th IEEE International Conference on Broadband Network & Multimedia Technology (IC-BNMT 2011)

We propose a statistical recognition method based on an analysis of the grammatical and semantic characteristics of Chinese organization names. The method uses three elements: word frequency, part of speech, and word length. We use the data in a mature collection as training data and separately calculate a candidate organization name's word frequency, part of speech and word length of the contribution...

chapter

Point process models of spectro-temporal modulation events for speech recognition

Aren Jansen, Nima Mesgarani, Partha Niyogi

2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers > 104 - 108

2010 44th Asilomar Conference on Signals, Systems and Computers

Neurobiological research has uncovered the existence of cortical neurons in various animal species tuned to particular spectro-temporal modulations (STM) in the auditory stimulus. Other findings indicate that temporal statistics of the resulting neural spike trains may encode the underlying content of species-specific communication calls. With this motivation, we present an alternative approach to...

chapter

Romanian language statistics and resources for text-to-speech systems

Adriana Stan, Mircea Giurgiu

2010 9th International Symposium on Electronics and Telecommunications > 381 - 384

2010 9th International Symposium on Electronics and Telecommunications (ISETC 2010)

This paper introduces a series of results and experiments used in the development of a Romanian text-to-speech system, focusing on text statistics. We investigate the presence of several linguistic units used in text-to-speech systems, from phonemes to words. The text corpus we used, News-Romanian (News-RO), comprises 4500 newspaper articles. A subset of it, around 2500 sentences, represents the Romanian...

chapter

The automatic prediction of Chinese text's prosodic structure based on tree structure

Yili Qian

2010 International Conference on Computer Application and System Modeling (ICCASM 2010) > 13 > V13-99 - V13-103

2010 International Conference on Computer Application and System Modeling (ICCASM 2010)

The recognition of prosodic structure is an important research aspect in the field of Text-to-Speech. It is essential to improving the naturalness of machine-synthesized speech. This paper proposes an approach to predicting and assigning prosodic structure automatically for Chinese sentences based on their tree structures. It presents the modeling of a statistical language model based on the simply...

chapter

Cross-validation based decision tree clustering for HMM-based TTS

Yu Zhang, Zhi-Jie Yan, F. K. Soong

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4602 - 4605

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

In HMM-based speech synthesis, we usually use complex, context dependent models to characterize prosodically and linguistically rich speech units. It is therefore difficult to prepare training data which can cover all combinatorial possibilities of contexts. A common approach to cope with this insufficient training data problem is to build a clustered tree via the MDL criterion. However, an MDL-based...

