Search results for: Guoguo Chen

Items from 1 to 10 out of 10 results

chapter

Highway long short-term memory RNNs for distant speech recognition

Yu Zhang, Guoguo Chen, Dong Yu, Kaisheng Yao, more

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5755 - 5759

ICASSP 2016 - 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In this paper, we extend the deep long short-term memory (DLSTM) recurrent neural networks by introducing gated direct connections between memory cells in adjacent layers. These direct links, called highway connections, enable unimpeded information flow across different layers and thus alleviate the gradient vanishing problem when building deeper LSTMs. We further introduce the latency-controlled...

chapter

Deep beamforming networks for multi-channel speech recognition

Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Liang Lu, more

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5745 - 5749

ICASSP 2016 - 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Despite the significant progress in speech recognition enabled by deep neural networks, poor performance persists in some scenarios. In this work, we focus on far-field speech recognition which remains challenging due to high levels of noise and reverberation in the captured speech signals. We propose to represent the stages of acoustic processing including beamforming, feature extraction, and acoustic...

chapter

Acoustic data-driven pronunciation lexicon generation for logographic languages

Guoguo Chen, Daniel Povey, Sanjeev Khudanpur

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5350 - 5354

ICASSP 2016 - 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Handcrafted pronunciation lexicons are widely used in modern speech recognition systems. Designing a pronunciation lexicon, however, requires a tremendous amount of expert knowledge and effort, which is not practical when applying speech recognition techniques to low resource languages. In this paper, we are interested in developing speech recognition systems for logographic languages with only a small...

chapter

JHU ASpIRE system: Robust LVCSR with TDNNs, iVector adaptation and RNN-LMs

Vijayaditya Peddinti, Guoguo Chen, Vimal Manohar, Tom Ko, more

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) > 539 - 546

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)

Multi-style training, using data which emulates a variety of possible test scenarios, is a popular approach towards robust acoustic modeling. However, acoustic models capable of exploiting large amounts of training data in a comparatively short amount of training time are essential. In this paper we tackle the problem of reverberant speech recognition using 5500 hours of simulated reverberant data...

chapter

Librispeech: An ASR corpus based on public domain audio books

Vassil Panayotov, Guoguo Chen, Daniel Povey, Sanjeev Khudanpur

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5206 - 5210

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

This paper introduces a new corpus of read English speech, suitable for training and evaluating speech recognition systems. The LibriSpeech corpus is derived from audiobooks that are part of the LibriVox project, and contains 1000 hours of speech sampled at 16 kHz. We have made the corpus freely available for download, along with separately prepared language-model training data and pre-built language...

chapter

Query-by-example keyword spotting using long short-term memory networks

Guoguo Chen, Carolina Parada, Tara N. Sainath

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5236 - 5240

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

We present a novel approach to query-by-example keyword spotting (KWS) using a long short-term memory (LSTM) recurrent neural network-based feature extractor. In our approach, we represent each keyword using a fixed-length feature vector obtained by running the keyword audio through a word-based LSTM acoustic model. We use the activations prior to the softmax layer of the LSTM as our keyword-vector...

chapter

A keyword search system using open source software

Jan Trmal, Guoguo Chen, Dan Povey, Sanjeev Khudanpur, more

2014 IEEE Spoken Language Technology Workshop (SLT) > 530 - 535

2014 IEEE Spoken Language Technology Workshop (SLT)

Provides an overview of a speech-to-text (STT) and keyword search (KWS) system architecture built primarily on top of the Kaldi toolkit and expands on a few highlights. The system was developed as a part of the research efforts of the Radical team while participating in the IARPA Babel program. Our aim was to develop a general system pipeline which could be easily and rapidly deployed in any language,...

chapter

Small-footprint keyword spotting using deep neural networks

Guoguo Chen, Carolina Parada, Georg Heigold

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4087 - 4091

ICASSP 2014 - 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Our application requires a keyword spotting system with a small memory footprint, low computational cost, and high precision. To meet these requirements, we propose a simple approach based on deep neural networks. A deep neural network is trained to directly predict the keyword(s) or subword units of the keyword(s) followed by a posterior handling method producing a final confidence score. Keyword...

chapter

Using proxies for OOV keywords in the keyword search task

Guoguo Chen, Oguz Yilmaz, Jan Trmal, Daniel Povey, more

2013 IEEE Workshop on Automatic Speech Recognition and Understanding > 416 - 421

2013 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)

We propose a simple but effective weighted finite state transducer (WFST) based framework for handling out-of-vocabulary (OOV) keywords in a speech search task. State-of-the-art large vocabulary continuous speech recognition (LVCSR) and keyword search (KWS) systems are developed for conversational telephone speech in Tagalog. Word-based and phone-based indexes are created from word lattices, the latter...

chapter

Quantifying the value of pronunciation lexicons for keyword search in low-resource languages

Guoguo Chen, Sanjeev Khudanpur, Daniel Povey, Jan Trmal, more

2013 IEEE International Conference on Acoustics, Speech and Signal Processing > 8560 - 8564

ICASSP 2013 - 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

This paper quantifies the value of pronunciation lexicons in large vocabulary continuous speech recognition (LVCSR) systems that support keyword search (KWS) in low resource languages. State-of-the-art LVCSR and KWS systems are developed for conversational telephone speech in Tagalog, and the baseline lexicon is augmented via three different grapheme-to-phoneme models that yield increasing coverage...

Keywords

SPEECH RECOGNITION (6)
KEYWORD SEARCH (4)
DEEP NEURAL NETWORKS (2)
ACOUSTICS (1)
BIOINFORMATICS (1)
BLOGS (1)
CNTK (1)
COMPUTATIONAL MODELING (1)
CORPUS (1)
DEEP NEURAL NETWORK (1)
DIRECTION OF ARRIVAL (1)
ELECTRONIC PUBLISHING (1)
EMBEDDED SPEECH RECOGNITION (1)
FAR FIELD SPEECH RECOGNITION (1)
FEATURE EXTRACTION (1)
FILTER-AND-SUM BEAMFORMING (1)
GENOMICS (1)
HIDDEN MARKOV MODELS (1)
HIGHWAY LSTM (1)
IARPA BABEL (1)
INFORMATION RETRIEVAL (1)
INFORMATION SERVICES (1)
IVECTORS (1)
KALDI (1)
KEYWORD SPOTTING (1)
LIBRIVOX (1)
LOGOGRAPHIC LANGUAGE (1)
LOW RESOURCE LVCSR (1)
LSTM (1)
MICROPHONE ARRAYS (1)
MORPHOLOGY (1)
NOISE (1)
OOV KEYWORDS (1)
OPENKWS (1)
PITCH (1)
PRONUNCIATION LEXICON (1)
PROXY KEYWORDS (1)
RECURRENT NEURAL NETWORK LANGUAGE MODELS (1)
RESOURCE DESCRIPTION FRAMEWORK (1)
SEQUENCE TRAINING (1)
SPEECH (1)
SPEECH PROCESSING (1)
SPEECH SYNTHESIS (1)
SPOKEN TERM DETECTION (1)
TIME DELAY NEURAL NETWORKS (1)
INFONA - science communication portal
