Errors in open-domain ASR can be corrected by asking the speaker to rephrase targeted segments of utterances in which errors have been detected. The utterance merging problem consists of generating a better transcript from the utterance in which errors were detected and a clarification utterance. We introduce an alignment-decoding algorithm for jointly processing the two utterances and benefit from the...
Spoken language understanding (SLU) systems use various features to detect the domain, intent and semantic slots of a query. In addition to n-grams, features generated from entity dictionaries are often used in model training. Clean or properly weighted dictionaries are critical to improving a model's coverage and accuracy on entities unseen at test time. However, clean dictionaries are hard to obtain...
This paper presents a Bayesian approach to constructing the recurrent neural network language model (RNN-LM) for speech recognition. Our idea is to regularize the RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function in the Bayesian RNN (BRNN) is formed as the regularized cross-entropy error function. The regularized model...
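For the Bayesian RNN-LM abstract above, a zero-mean isotropic Gaussian prior on the parameters turns maximum a posteriori estimation into cross-entropy training with an L2 penalty. A sketch of the regularized objective, in notation chosen here for illustration (not necessarily the paper's own symbols):

```latex
% MAP objective under an assumed zero-mean isotropic Gaussian prior N(0, sigma^2 I)
E(\theta) = -\sum_{t} \log P(w_t \mid w_{<t}; \theta)
          + \frac{\lambda}{2}\,\lVert \theta \rVert^2,
\qquad \lambda = \frac{1}{\sigma^2}
```

Here the first term is the usual cross-entropy over the word sequence and the second is the penalty induced by the prior: the larger the prior variance σ², the weaker the regularization.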
Neural network based approaches have recently produced record-setting performances in natural language understanding tasks such as word labeling. In the word labeling task, a tagger is used to assign a label to each word in an input sequence. Specifically, simple recurrent neural networks (RNNs) and convolutional neural networks (CNNs) have been shown to significantly outperform the previous state-of-the-art...
In this study, we trained a deep autoencoder to build compact representations of short-term spectra of multiple speakers. Using this compact representation as mapping features, we then trained an artificial neural network to predict target voice features from source voice features. Finally, we constructed a deep neural network from the trained deep autoencoder and artificial neural network weights,...
We are interested in the problem of semantics-aware training of language models (LMs) for Automatic Speech Recognition (ASR). Traditional language modeling research has ignored semantic constraints and focused on limited-size word histories. Semantic structures may provide information to capture lexically realized long-range dependencies as well as the linguistic scene of a speech utterance....
Hierarchical phrase-based machine translation [1] (Hiero) is a prominent approach to Statistical Machine Translation, usually comparable to or better than conventional phrase-based systems. However, Hiero typically uses the CKY decoding algorithm, which requires the entire input sentence before decoding begins, as it produces the translation in a bottom-up fashion. Left-to-right (LR) decoding [2] is a promising...
This paper presents initial data collection and language understanding experiments conducted as part of a larger effort to create a nutrition dialogue system that automatically extracts food concepts from a user's spoken meal description. We first summarize the data collection and annotation of food descriptions performed via Amazon Mechanical Turk. We then present semantic labeling experiments using...
Statistical spoken dialogue systems based on Partially Observable Markov Decision Processes (POMDPs) have been shown to be more robust to speech recognition errors by maintaining a belief distribution over multiple dialogue states and making policy decisions based on the entire distribution rather than the single most likely hypothesis. To date most POMDP-based systems have used generative trackers...
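The belief tracking this abstract refers to follows the generic POMDP belief update, b'(s') ∝ P(o | s') Σ_s P(s' | s, a) b(s). A minimal sketch (function and state names here are illustrative, not taken from any cited system):

```python
def belief_update(belief, transition, observation_likelihood, action, observation):
    """One step of POMDP belief tracking.

    belief: dict mapping dialogue state -> probability
    transition(s, a, s_next): P(s_next | s, a)
    observation_likelihood(o, s_next): P(o | s_next)
    Returns the normalized posterior b'(s') over the same states.
    """
    new_belief = {}
    for s_next in belief:
        # Predict: marginalize the transition model over the prior belief.
        predicted = sum(transition(s, action, s_next) * p for s, p in belief.items())
        # Correct: weight by how well s_next explains the observation.
        new_belief[s_next] = observation_likelihood(observation, s_next) * predicted
    z = sum(new_belief.values())
    return {s: p / z for s, p in new_belief.items()}
```

With a uniform initial belief and an observation that favors one state, the update concentrates mass on that state while keeping the full distribution available to the policy, rather than committing to a single hypothesis.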
The decoder is a key component of any modern speech recognizer. Morphologically rich languages pose special challenges for the decoder design, as a very large recognition vocabulary is required to avoid high out-of-vocabulary (OOV) rates. To alleviate these issues, the n-gram models are often trained over subwords instead of words. A subword n-gram model is able to assign probabilities to unseen word...
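To illustrate how a subword n-gram model can assign a probability to an unseen word, the toy bigram below scores any segmentation built from known subword units, using add-one smoothing so that unseen subword transitions still get nonzero probability. The class and its training data are hypothetical, not from the paper:

```python
from collections import defaultdict

class SubwordBigram:
    """Toy subword bigram LM: assigns probability to any word that can be
    segmented into known subword units, even if the word itself was unseen."""

    def __init__(self, vocab_subwords):
        self.subwords = vocab_subwords
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, segmented_words):
        # Each training word is a list of subword units, e.g. ["talo", "ssa"].
        for units in segmented_words:
            seq = ["<w>"] + units + ["</w>"]
            for a, b in zip(seq, seq[1:]):
                self.counts[a][b] += 1

    def prob(self, units):
        # Product of smoothed bigram probabilities over the segmentation.
        seq = ["<w>"] + units + ["</w>"]
        p = 1.0
        for a, b in zip(seq, seq[1:]):
            total = sum(self.counts[a].values())
            p *= (self.counts[a][b] + 1) / (total + len(self.subwords) + 2)
        return p
```

A word whose subword sequence never occurred in training still receives a positive probability, which is exactly what keeps OOV rates manageable for morphologically rich languages.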
We aim to improve term detection performance by augmenting traditional N-gram language models with multiple levels of topic context. We demonstrate that incorporating complementary aspects of topicality leads to significant improvements in term detection accuracy. We represent broad topic context through document-specific latent topics inferred via a Bayesian topic model. We capture local topic context...
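One common way to combine an n-gram model with topic context, consistent with the augmentation idea in this abstract, is linear interpolation of the two probability estimates; the weight `lam` and the callables below are illustrative stand-ins, not the paper's actual formulation:

```python
def interpolated_prob(word, history, p_ngram, p_topic, lam=0.7):
    """Linear interpolation of an n-gram LM with a topic-conditioned model:
    P(w | h) = lam * P_ngram(w | h) + (1 - lam) * P_topic(w)."""
    return lam * p_ngram(word, history) + (1 - lam) * p_topic(word)
```

The topic component can boost on-topic terms that the n-gram history alone would score poorly, which is the mechanism by which topicality can improve term detection.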
Handwriting input methods are particularly useful for languages with a logographic writing system. This paper introduces a multimodal stroke-based predictive input method for the Chinese language. The proposed method requires users to write only the first few strokes of each character, and the system intelligently infers the intended characters by making use of contextual information. Specifically, a statistical...
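A minimal sketch of stroke-prefix prediction as described: filter characters whose stroke sequence begins with the strokes entered so far, then rank the survivors by a contextual language model score. All names and the single-letter stroke codes are invented for illustration:

```python
def rank_candidates(prefix_strokes, context, stroke_index, lm_prob):
    """Return characters whose stroke sequence starts with the entered prefix,
    ranked by a context LM probability (hypothetical helper, for illustration).

    stroke_index: dict mapping character -> full stroke sequence
    lm_prob(char, context): contextual probability of the character
    """
    matches = [c for c, strokes in stroke_index.items()
               if strokes[:len(prefix_strokes)] == prefix_strokes]
    return sorted(matches, key=lambda c: lm_prob(c, context), reverse=True)
```

As the user writes more strokes, the match set shrinks, and the contextual score disambiguates among the remaining candidates.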
Over the past decade, several speech-based electronic assistive technologies (EATs) have been developed that target users with dysarthric speech. These EATs include vocal command-and-control systems, but also voice-input voice-output communication aids (VIVOCAs). In these systems, the vocal interfaces are based on automatic speech recognition (ASR) systems, but this approach requires much training...
Recent works showed the trend of leveraging web-scaled structured semantic knowledge resources such as Freebase for open domain spoken language understanding (SLU). Knowledge graphs provide sufficient but ambiguous relations for the same entity, which can be used as statistical background knowledge to infer possible relations for interpretation of user utterances. This paper proposes an approach to...
This paper addresses the problem of detecting name errors in automatic speech recognition (ASR) output. The highly skewed label distributions (i.e. name errors are infrequent), sparse training data, and large number of potential lexical features pose significant challenges for training name error classification systems. Data-driven feature learning is needed for handling multiple languages but is...
Deficits in semantic and pragmatic expression are among the hallmark linguistic features of autism. Recent work in deriving computational correlates of clinical spoken language measures has demonstrated the utility of automated linguistic analysis for characterizing the language of children with autism. Most of this research, however, has focused either on young children still acquiring language or...
The automatic recognition of disordered speech is a domain that is characterised by limited amounts of training data for each speaker and large intra- and inter-speaker variations. This paper is concerned with how best to train acoustic models in these circumstances; in particular, we look at how to select data for a background model from a pool of speakers for a given target speaker. We show that...
Existing speech classification algorithms often perform well when evaluated on training and test data drawn from the same distribution. In practice, however, these distributions are not always the same. In these circumstances, the performance of trained models will likely decrease. In this paper, we discuss an underutilized divergence measure and derive an estimable upper bound on the test error rate...
We show that it is possible to learn an efficient acoustic model using only a small amount of easily available word-level similarity annotations. In contrast to the detailed phonetic labeling required by classical speech recognition technologies, the only information our method requires is pairs of speech excerpts which are known to be similar (same word) and pairs of speech excerpts which are known...
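Weakly supervised training from same/different word pairs is typically driven by a pairwise objective such as the contrastive loss below; this is a generic sketch, and the paper's actual objective may differ:

```python
def contrastive_loss(dist, same, margin=1.0):
    """Pairwise contrastive loss on the distance between two embeddings:
    pull same-word pairs together, push different-word pairs at least
    `margin` apart (no gradient beyond the margin)."""
    if same:
        return 0.5 * dist ** 2
    return 0.5 * max(0.0, margin - dist) ** 2
```

Minimizing this loss over many annotated pairs shapes an embedding space in which acoustic distance reflects word identity, without any phonetic labels.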