Search results

Items from 1 to 6 out of 6 results

article

Modelling Semantic Context of OOV Words in Large Vocabulary Continuous Speech Recognition

Imran Sheikh, Dominique Fohr, Irina Illina, Georges Linares

IEEE/ACM Transactions on Audio, Speech, and Language Processing > 2017 > 25 > 3 > 598 - 610

The diachronic nature of broadcast news data leads to the problem of out-of-vocabulary (OOV) words in large vocabulary continuous speech recognition (LVCSR) systems. Analysis of OOV words reveals that a majority of them are proper names (PNs). However, PNs are important for automatic indexing of audio–video content and for obtaining reliable automatic transcriptions. In this paper, we focus on the...

chapter

WEFEST: Word Embedding Feature Extension for Short Text Classification

Lei Sang, Fei Xie, Xiaojian Liu, Xindong Wu

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 677 - 683

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)

Short text classification is a crucial task for information retrieval, social medial text categorization, and many other applications. In reality, due to the inherent sparsity and the limited information available in the short texts, learning and classifying short texts is a significant challenge. In this paper, we propose a new framework, WEFEST, which expands short texts using word embedding for...

chapter

Pronunciation modeling of loanwords for Korean ASR using phonological knowledge and syllable-based segmentation

Hyuksu Ryu, Minsu Na, Minhwa Chung

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 430 - 435

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

This paper aims to improve the performance of automatic pronunciation generation of foreign loanwords in Korean by using phonological knowledge and syllable-based segmentation. The loanword text corpus used for our experiment consists of 16.6K words extracted from the frequently used words in set-top box, music, and POI domains. At first, pronunciations of loanwords in Korean are obtained by manual...

chapter

Open-Lexicon Language Modeling Combining Word and Character Levels

Michal Kozielski, Martin Matysiak, Patrick Doetsch, Ralf Schloter, more

2014 14th International Conference on Frontiers in Handwriting Recognition > 343 - 348

2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR)

In this paper we investigate different n-gram language models that are defined over an open lexicon. We introduce a character-level language model and combine it with a standard word-level language model in a back off fashion. The character-level language model is redefined and renormalized to assign zero probability to words from a fixed vocabulary. Furthermore we present a way to interpolate language...

chapter

Signature Matching Using Supervised Topic Models

Xianzhi Du, David Doermann, Wael Abd-Almageed

2014 22nd International Conference on Pattern Recognition > 327 - 332

2014 22nd International Conference on Pattern Recognition (ICPR)

In this paper, we present a novel signature matching method based on supervised topic models. Shape Context features are extracted from signature shape contours which capture the local variations in signature properties. We then use the concept of topic models to learn the shape context features which correspond to individual authors. The approach consists of three primary steps. First, K-means is...

chapter

Structured Output Layer neural network language model

Hai-Son Le, Ilya Oparin, Alexandre Allauzen, Jean-Luc Gauvain, more

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5524 - 5527

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

This paper introduces a new neural network language model (NNLM) based on word clustering to structure the output vocabulary: Structured Output Layer NNLM. This model is able to handle vocabularies of arbitrary size, hence dispensing with the design of short-lists that are commonly used in NNLMs. Several softmax layers replace the standard output layer in this model. The output structure depends on...

Filter options

Data set:
ieee
Keywords:
COMPUTATIONAL MODELING
CONTEXT
TRAINING
VOCABULARY

Publication date

Set your own date range

Publication type

book (5)
article (1)

Keywords

SPEECH RECOGNITION (3)
SEMANTICS (2)
ACCURACY (1)
ADAPTATION MODELS (1)
ARTIFICIAL NEURAL NETWORKS (1)
AUTOMATIC SPEECH RECOGNITION (1)
CONTEXT MODELING (1)
DICTIONARIES (1)
FEATURE EXTRACTION (1)
IMAGE RETRIEVAL (1)
INTERPOLATION (1)
LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION (1)
NAVIGATION (1)
NEURAL NETWORK LANGUAGE MODEL (1)
NEURAL NETWORKS (1)
OUT-OF-VOCABULARY (1)
PROPER NAMES (1)
SEARCH ENGINES (1)
SEMANTIC CONTEXT (1)
SHAPE (1)
SIGNATURE MATCHING (1)
SPEECH-TO-TEXT (1)
STANDARDS (1)
SUPERVISED TOPIC MODEL (1)
TV (1)
more

INFONA - science communication portal

Search results

Modelling Semantic Context of OOV Words in Large Vocabulary Continuous Speech Recognition

WEFEST: Word Embedding Feature Extension for Short Text Classification

Pronunciation modeling of loanwords for Korean ASR using phonological knowledge and syllable-based segmentation

Open-Lexicon Language Modeling Combining Word and Character Levels

Signature Matching Using Supervised Topic Models

Structured Output Layer neural network language model

Add recipient

Sending message cancelled

Are you sure you want to cancel sending this message?

Send message

Filter options

Publication date

Date range setting

Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.

Publication type

Keywords

Reporting an error / abuse

Sending the report failed

Accessibility options