Recently it has been shown that policy-gradient methods for reinforcement learning can be utilized to train deep end-to-end systems directly on non-differentiable metrics for the task at hand. In this paper we consider the problem of optimizing image captioning systems using reinforcement learning, and show that by carefully optimizing our systems using the test metrics of the MSCOCO task, significant...
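A minimal sketch of the policy-gradient (REINFORCE) idea described above, assuming a PyTorch-style captioning model; `model.sample`, `reward_fn` (e.g. a CIDEr or BLEU scorer), and the mean-reward baseline are illustrative assumptions, not any paper's actual API.

```python
import torch

def reinforce_step(model, optimizer, images, references, reward_fn):
    # Sample a caption per image and keep the log-probability of each token.
    sampled_ids, log_probs = model.sample(images)          # log_probs: (batch, seq_len)

    # Score the sampled captions with the non-differentiable task metric;
    # no gradient flows through the reward.
    rewards = torch.tensor(
        [reward_fn(s, r) for s, r in zip(sampled_ids, references)],
        dtype=log_probs.dtype, device=log_probs.device)    # (batch,)

    # Mean reward as a simple baseline to reduce gradient variance.
    advantage = rewards - rewards.mean()

    # Policy-gradient loss: -E[(R - b) * log p(caption)]
    loss = -(advantage.unsqueeze(1) * log_probs).sum(dim=1).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), rewards.mean().item()
```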
Dropout, the random dropping out of activations according to a specified rate, is a very simple but effective method to avoid over-fitting of deep neural networks to the training data.
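A minimal sketch of (inverted) dropout on a layer's activations, in NumPy; the scaling by 1/(1-rate) at training time is one common convention.

```python
import numpy as np

def dropout(activations, rate, training=True, rng=np.random.default_rng()):
    """Randomly zero activations with probability `rate` during training,
    scaling the survivors by 1/(1 - rate) so the expected value is unchanged."""
    if not training or rate == 0.0:
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob   # Bernoulli keep-mask
    return activations * mask / keep_prob
```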
This paper investigates data augmentation for deep neural network acoustic modeling based on label-preserving transformations to deal with data sparsity. Two data augmentation approaches, vocal tract length perturbation (VTLP) and stochastic feature mapping (SFM), are investigated for both deep neural networks (DNNs) and convolutional neural networks (CNNs). The approaches are focused on increasing...
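A rough sketch of VTLP-style frequency warping applied to a magnitude spectrogram (frames x frequency bins), in NumPy. The simple linear warp with a random factor in [0.9, 1.1] is an illustrative assumption; published VTLP recipes typically apply a piecewise-linear warp to the mel filterbank instead.

```python
import numpy as np

def vtlp_warp(spectrogram, rng=np.random.default_rng()):
    """Apply a random, label-preserving warp of the frequency axis."""
    n_frames, n_bins = spectrogram.shape
    alpha = rng.uniform(0.9, 1.1)                   # random warp factor per utterance
    src_bins = np.arange(n_bins)
    # Each output bin reads from a scaled position on the source frequency axis.
    warped_bins = np.clip(src_bins / alpha, 0, n_bins - 1)
    warped = np.empty_like(spectrogram)
    for t in range(n_frames):
        warped[t] = np.interp(warped_bins, src_bins, spectrogram[t])
    return warped
```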
A significant barrier to progress in automatic speech recognition (ASR) capability is the empirical reality that techniques rarely “scale”—the yield of many apparently fruitful techniques rapidly diminishes to zero as the training criterion or decoder is strengthened, or the size of the training set is increased. Recently we showed that annealed dropout—a regularization procedure which gradually reduces...
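A minimal sketch of an annealed-dropout schedule, in which the dropout rate decays toward zero over training; the linear schedule and the epoch granularity are illustrative assumptions.

```python
def annealed_dropout_rate(epoch, initial_rate=0.5, anneal_epochs=20):
    """Return the dropout rate to use at `epoch`, reaching 0 after `anneal_epochs`."""
    return max(0.0, initial_rate * (1.0 - epoch / anneal_epochs))

# Example: the rate starts at 0.5 and reaches 0.0 by epoch 20.
rates = [annealed_dropout_rate(e) for e in range(0, 25, 5)]  # [0.5, 0.375, 0.25, 0.125, 0.0]
```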
Deep Scattering Network features introduced for image processing have recently proved useful in speech recognition as an alternative to log-mel features for Deep Neural Network (DNN) acoustic models. Scattering features use wavelet decomposition, directly producing log-frequency spectrograms that are robust to local time warping and provide additional information within higher order coefficients....
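A rough sketch of first-order scattering coefficients for a 1-D signal: band-pass filter with a wavelet, take the modulus, then smooth with a low-pass averaging window. The Gaussian-windowed sinusoid filters and the fixed averaging length are illustrative assumptions, not the exact construction used for Deep Scattering Network features.

```python
import numpy as np

def morlet_like(center_freq, width, length=512):
    """A simple Gaussian-windowed complex sinusoid used as a band-pass filter."""
    t = np.arange(length) - length // 2
    return np.exp(-0.5 * (t / width) ** 2) * np.exp(2j * np.pi * center_freq * t)

def first_order_scattering(x, center_freqs, width=64, avg_len=1024):
    """With geometrically spaced center_freqs (e.g. np.geomspace(0.01, 0.4, 40)),
    the stacked coefficients approximate a log-frequency spectrogram."""
    window = np.ones(avg_len) / avg_len              # simple low-pass averaging
    coeffs = []
    for f in center_freqs:
        band = np.abs(np.convolve(x, morlet_like(f, width), mode="same"))
        coeffs.append(np.convolve(band, window, mode="same"))
    return np.stack(coeffs)                          # (num_freqs, len(x))
```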
In this paper, we present methods in deep multimodal learning for fusing speech and visual modalities for Audio-Visual Automatic Speech Recognition (AV-ASR). First, we study an approach where uni-modal deep networks are trained separately and their final hidden layers fused to obtain a joint feature space in which another deep network is built. While the audio network alone achieves a phone error...
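A minimal sketch of the late-fusion idea described above: two uni-modal networks are trained separately, their final hidden layers are concatenated, and a joint network is built on the fused feature space. The layer sizes, the feed-forward joint stack, and the PyTorch framing are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FusionASRNet(nn.Module):
    def __init__(self, audio_net, visual_net, audio_dim, visual_dim,
                 hidden_dim=1024, num_targets=2000):
        super().__init__()
        self.audio_net = audio_net        # pre-trained, returns its final hidden layer
        self.visual_net = visual_net      # pre-trained, returns its final hidden layer
        self.joint = nn.Sequential(
            nn.Linear(audio_dim + visual_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_targets))      # e.g. context-dependent phone states

    def forward(self, audio_feats, visual_feats):
        a = self.audio_net(audio_feats)
        v = self.visual_net(visual_feats)
        return self.joint(torch.cat([a, v], dim=-1)) # joint audio-visual feature space
```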
We propose a framework for transfer learning in the unsupervised setting, and show its usefulness in addressing mismatch in a test-time dialog state decision classifier, posed here as a binary hypothesis problem: the ASR output is either accepted or rejected. The framework encompasses a two-step process; the first step culminates in the discriminative retraining...
Modern speech applications utilize acoustic models with billions of parameters, and serve millions of users. Storing an acoustic model for each user is costly. We show, through the use of sparse regularization, that it is possible to obtain competitive adaptation performance by changing only a small fraction of the parameters of an acoustic model. This allows for the compression of speaker-dependent...
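A minimal sketch of adapting an acoustic model to a speaker by learning a sparse additive update: an L1 penalty on the parameter deltas drives most of them toward zero, so only the non-zero deltas need to be stored per user. The penalty weight, the delta parameterization, and the thresholded storage scheme are assumptions for illustration.

```python
import torch

def adaptation_loss(model_output, targets, deltas, l1_weight=1e-4):
    """Cross-entropy on the adaptation data plus a sparsity-inducing L1 penalty
    on the per-speaker parameter deltas."""
    ce = torch.nn.functional.cross_entropy(model_output, targets)
    l1 = sum(d.abs().sum() for d in deltas)
    return ce + l1_weight * l1

def compress_deltas(deltas, threshold=1e-3):
    """Per-speaker storage: keep only indices and values of the non-trivial deltas."""
    compressed = []
    for d in deltas:
        mask = d.abs() > threshold
        compressed.append((mask.nonzero(as_tuple=False), d[mask]))
    return compressed
```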
Exemplar-based techniques, such as k-nearest neighbors (kNNs) and Sparse Representations (SRs), can be used to model a test sample from a few training points in a dictionary set. In past work, we have shown that using an SR approach for phonetic classification allows for a higher accuracy than other classification techniques. Phones are the basic units of speech to be recognized. Motivated by...
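A minimal sketch of sparse-representation classification over an exemplar dictionary: represent the test vector as a sparse combination of training exemplars (here via scikit-learn's Lasso, an assumed stand-in for whatever sparse solver is used), then assign the class whose exemplars give the smallest reconstruction residual.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sr_classify(x, dictionary, labels, alpha=0.01):
    """x: (n_features,); dictionary: (n_features, n_exemplars); labels: (n_exemplars,)."""
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(dictionary, x)                         # solve min ||x - D w||^2 + alpha ||w||_1
    coefs = coder.coef_
    residuals = {}
    for c in np.unique(labels):
        coefs_c = np.where(labels == c, coefs, 0.0)  # keep only class-c coefficients
        residuals[c] = np.linalg.norm(x - dictionary @ coefs_c)
    return min(residuals, key=residuals.get)         # class with smallest residual
```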
In this paper we revisit discriminative training of full covariance acoustic models for automatic speech recognition. One of the difficult aspects of discriminative training is how to set the constant D that appears in the parameter updates. For diagonal covariance models, this constant D is set based on knowing the smallest value of D, D*, for which the resulting covariances remain positive definite...
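A small sketch of locating the smallest constant D for which a discriminative (e.g. Extended Baum-Welch style) full-covariance update remains positive definite: test positive definiteness with a Cholesky factorization and bisect on D. `update_fn(D)`, which should return the updated covariance for a given D, is a hypothetical placeholder for the model's actual update rule, and the bisection assumes positive definiteness is monotone in D, as it is in the diagonal case.

```python
import numpy as np

def is_positive_definite(matrix):
    """Cholesky succeeds exactly when the (symmetric) matrix is positive definite."""
    try:
        np.linalg.cholesky(matrix)
        return True
    except np.linalg.LinAlgError:
        return False

def smallest_safe_D(update_fn, lo=1e-3, hi=1e6, iters=60):
    """Bisect for (approximately) the smallest D with update_fn(D) positive definite."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if is_positive_definite(update_fn(mid)):
            hi = mid
        else:
            lo = mid
    return hi
```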