Neural network joint modeling (NNJM) has produced substantial improvements in machine translation performance. As in standard neural network language modeling, a context-independent linear projection is applied to project a sparse input vector into a continuous representation at each word position. Because neighboring words are dependent on each other, context-independent projection may not be optimal. We...
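The context-independent projection described above is, in effect, an embedding-table lookup: multiplying a one-hot input vector by the projection matrix selects a single row, no matter what the neighboring words are. A minimal sketch (matrix sizes are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 5, 3                      # toy vocabulary size and embedding dimension
E = rng.standard_normal((V, d))  # shared linear projection matrix

def project(word_id):
    # one-hot vector times E equals a row lookup; context never enters
    onehot = np.zeros(V)
    onehot[word_id] = 1.0
    return onehot @ E
```

Because a word's row is identical in every context, neighboring words cannot influence the representation, which is the limitation the abstract points at.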
Recurrent Neural Networks (RNNs) have been successfully applied for improved speech recognition and statistical machine translation (SMT) for N-best list re-ranking. In SMT, we investigate using bilingual word-aligned sentences to train a bilingual recurrent neural network model. We employ a bag-of-words representation of a source sentence as additional input features in model training. Experimental...
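A bag-of-words feature of the kind mentioned above discards word order and keeps only counts. A minimal sketch, using a small hypothetical vocabulary (the word-to-index map is an assumption for illustration):

```python
def bag_of_words(sentence, vocab):
    """Count-based sentence vector; `vocab` maps word -> index (hypothetical)."""
    vec = [0] * len(vocab)
    for word in sentence.split():
        if word in vocab:          # out-of-vocabulary words are simply dropped
            vec[vocab[word]] += 1
    return vec

vocab = {"the": 0, "cat": 1, "sat": 2}
features = bag_of_words("the cat sat on the mat", vocab)  # -> [2, 1, 1]
```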
Though sparse features have produced significant gains over traditional dense features in statistical machine translation, careful feature selection and feature engineering are necessary to avoid over-fitting in optimizations. However, many sparse features are highly overlapping with each other; that is, they cover the same or similar information of translational equivalence from slightly different...
This paper assesses the role of robust acoustic features in spoken term detection (a.k.a. keyword spotting, KWS) under heavily degraded channel and noise-corrupted conditions. A number of noise-robust acoustic features were used, both in isolation and in combination, to train large vocabulary continuous speech recognition (LVCSR) systems, with the resulting word lattices used for spoken term detection...
Detecting automatic speech recognition (ASR) errors can play an important role in effective human-computer spoken dialogue systems, as recognition errors can hinder accurate system understanding of user intents. Our goal is to locate errors in an utterance so that the dialogue manager can pose appropriate clarification questions to the users. We propose two approaches to improve ASR error detection:...
We propose a new optimization algorithm called the Generalized Baum-Welch (GBW) algorithm for discriminative training of hidden Markov models (HMMs). GBW is based on Lagrange relaxation of a transformed optimization problem. We show that both the Baum-Welch (BW) algorithm for maximum-likelihood (ML) estimation of HMM parameters and the popular extended Baum-Welch (EBW) algorithm for discriminative training are special cases of GBW...
The major limitation of bilingual latent semantic analysis (bLSA) is its requirement of parallel training corpora. Motivated by semi-supervised learning, we propose a cluster-based bLSA training approach to incorporate monolingual corpora. Treating each parallel document pair as the centroid of a parallel document cluster, each monolingual document is associated with the closest centroid according to...
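The cluster-based association above can be sketched as a nearest-centroid assignment. Cosine similarity is an assumption here, since the abstract is cut off before naming the distance measure:

```python
import numpy as np

def assign_to_centroid(doc, centroids):
    # doc: 1-D document vector; centroids: (k, dim) matrix of cluster centres
    sims = centroids @ doc / (
        np.linalg.norm(centroids, axis=1) * np.linalg.norm(doc) + 1e-12
    )
    return int(np.argmax(sims))  # index of the closest parallel-pair centroid

centroids = np.array([[1.0, 0.0], [0.0, 1.0]])
assign_to_centroid(np.array([0.9, 0.1]), centroids)  # -> 0
```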
We propose a latent Dirichlet-tree allocation (LDTA) model - a correlated latent semantic model - for unsupervised language model adaptation. The LDTA model extends the latent Dirichlet allocation (LDA) model by replacing a Dirichlet prior with a Dirichlet-tree prior over the topic proportions. Latent topics under the same subtree are expected to be more correlated than topics under different subtrees...
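The Dirichlet-tree prior can be illustrated by sampling: draw a Dirichlet at each internal node of the tree and multiply the branch probabilities along the path to each leaf topic. Leaves under the same subtree share a branch factor, which is what makes them more correlated than leaves under different subtrees. A toy two-subtree sketch (tree shape and concentration parameters are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_dirichlet_tree():
    # root splits probability mass between two subtrees;
    # each subtree splits its share over two leaf topics
    root = rng.dirichlet([1.0, 1.0])   # P(subtree A), P(subtree B)
    left = rng.dirichlet([1.0, 1.0])   # leaf split within subtree A
    right = rng.dirichlet([1.0, 1.0])  # leaf split within subtree B
    # each leaf's topic proportion = product of branch probabilities on its path
    return np.concatenate([root[0] * left, root[1] * right])

theta = sample_dirichlet_tree()  # 4 topic proportions summing to 1
```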
We propose a novel approach to cross-lingual language model and translation lexicon adaptation for statistical machine translation (SMT) based on bilingual latent semantic analysis. Bilingual LSA enables latent topic distributions to be efficiently transferred across languages by enforcing a one-to-one topic correspondence during training. Using the proposed bilingual LSA framework, model adaptation...
Recently, Li et al. proposed a new auditory feature for robust speech recognition in noisy environments. The new feature was derived by closely mimicking the human auditory process. Several filters were used to model the outer ear, middle ear, and cochlea, and the initial filter parameters and shapes were obtained from crude psychoacoustic results, experience, or experiments. Although...
During minimum-classification-error (MCE) training, hypotheses competing against the correct one are commonly derived by the N-best algorithm. One problem with the N-best algorithm is that, in practice, some misclassified data can have very large misclassification distances from the N-best competitors and fall outside the steep, trainable region of the sigmoid function, and thus cannot be utilized effectively...
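The trainable region can be made concrete with the sigmoid loss commonly used in MCE: the gradient of the loss with respect to the misclassification measure d peaks at d = 0 and vanishes when d is far from zero, so tokens with very large misclassification distances contribute almost nothing to the parameter update. A sketch (gamma is an illustrative slope parameter, not a value from the paper):

```python
import math

def mce_loss(d, gamma=1.0):
    # smoothed 0-1 loss over the misclassification measure d
    return 1.0 / (1.0 + math.exp(-gamma * d))

def mce_loss_grad(d, gamma=1.0):
    # d(loss)/dd = gamma * l * (1 - l); largest at d = 0, decays for large |d|
    l = mce_loss(d, gamma)
    return gamma * l * (1.0 - l)
```

For example, `mce_loss_grad(0.0)` is 0.25, while `mce_loss_grad(50.0)` is effectively zero: a token that far from the decision boundary is outside the trainable region the abstract describes.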