Search results

Items from 1 to 8 out of 8 results

chapter

Representing word image using visual word embeddings and RNN for keyword spotting on historical document images

Hongxi Wei, Hui Zhang, Guanglai Gao

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1368 - 1373

2017 IEEE International Conference on Multimedia and Expo (ICME)

Visual words of Bag-of-Visual-Words (BoVW) framework are independent each other, which results in not only discarding spatial orders between visual words but also lacking semantic information. This study is inspired by word embeddings that a similar embedding procedure is applied to a large number of visual words. By this way, the corresponding embedding vectors of the visual words can be formulated...

chapter

End-to-end ASR-free keyword search from speech

Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, more

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4840 - 4844

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

sequence during training. This paper explores the design of an ASR-free end-to-end system for text query-based keyword search (KWS) from speech trained with minimal supervision. Our E2E KWS system consists of three sub-systems. The first sub-system is a recurrent neural network (RNN)-based acoustic auto-encoder trained to

article

End-to-End ASR-Free Keyword Search From Speech

Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, more

IEEE Journal of Selected Topics in Signal Processing > 2017 > 11 > 8 > 1351 - 1359

Conventional keyword search (KWS) systems for speech databases match the input text query to the set of word hypotheses generated by an automatic speech recognition (ASR) system from utterances in the database. Hence, such KWS systems attempt to solve the complex problem of ASR as a precursor. Training an ASR system

chapter

Efficient methods to train multilingual bottleneck feature extractors for low resource keyword search

Chongjia Ni, Cheung-Chi Leung, Lei Wang, Nancy F. Chen, more

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5650 - 5654

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Training a bottleneck feature (BNF) extractor with multilingual data has been common in low resource keyword search. In a low resource application, the amount of transcribed target language data is limited while there are usually plenty of multilingual data. In this paper, we investigated two methods to train

chapter

Robust discriminative keyword spotting for emotionally colored spontaneous speech using bidirectional LSTM networks

M. Wollmer, F. Eyben, J. Keshet, A. Graves, more

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 3949 - 3952

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper we propose a new technique for robust keyword spotting that uses bidirectional long short-term memory (BLSTM) recurrent neural nets to incorporate contextual information in speech decoding. Our approach overcomes the drawbacks of generative HMM modeling by applying a discriminative learning procedure

chapter

Spoken term detection with Connectionist Temporal Classification: A novel hybrid CTC-DBN decoder

Martin Wöllmer, Florian Eyben, Bjorn Schuller, Gerhard Rigoll

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5274 - 5277

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper proposes a novel system for robust keyword detection in continuous speech. Our decoder is composed of a bidirectional Long Short-Term Memory recurrent neural network using a Connectionist Temporal Classification (CTC) output layer, and a Dynamic Bayesian Network (DBN). The CTC network exploits bidirectional

chapter

An investigation into language model data augmentation for low-resourced STT and KWS

Guangpu Huang, Thiago Fraga da Silva, Lori Lamel, Jean-Luc Gauvain, more

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5790 - 5794

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

This paper reports on investigations using two techniques for language model text data augmentation for low-resourced automatic speech recognition and keyword search. Lowresourced languages are characterized by limited training materials, which typically results in high out-of-vocabulary (OOV) rates and poor language

chapter

Applications of Recurrent Neural Network Language Model in Offline Handwriting Recognition and Word Spotting

Nan Li, Jinying Chen, Huaigu Cao, Bing Zhang, more

2014 14th International Conference on Frontiers in Handwriting Recognition > 134 - 139

2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR)

The recurrent neural network language model (RNNLM) is a discriminative, non-Markovian model that can capture long-span word history in natural language. It has been proved to be successful in automatic speech recognition and machine translation. In this work, we applied RNNLM to the n-best rescoring stage of the state-of-the-art BBN Byblos OCR (optical character recognition) system for handwriting...

Filter options

Keywords:
TRAINING
RECURRENT NEURAL NETWORKS

Publication date

Set your own date range

Publication type

book (7)
article (1)

Keywords

HIDDEN MARKOV MODELS (5)
KEYWORD SEARCH (4)
SPEECH (4)
KEYWORD SPOTTING (3)
SPEECH RECOGNITION (3)
AUTOMATIC SPEECH RECOGNITION (2)
DECODING (2)
END-TO-END SYSTEMS (2)
NEURAL NETWORKS (2)
OPTICAL CHARACTER RECOGNITION SOFTWARE (2)
RECURRENT NEURAL NETWORK (2)
SPEECH CODING (2)
VOCABULARY (2)
ABSTRACT VECTOR SPACE (1)
ACOUSTICS (1)
ARBITRARY SPEECH (1)
BAG-OF-VISUAL-WORDS (1)
BAYES METHODS (1)
BIDIRECTIONAL CONTEXT INFORMATION (1)
BIDIRECTIONAL LONG SHORT-TERM MEMORY RECURRENT NEURAL NETS (1)
BIDIRECTIONAL LONG SHORT-TERM MEMORY RECURRENT NEURAL NETWORK (1)
BUILDINGS (1)
CHARACTER RECOGNITION (1)
CONNECTIONIST TEMPORAL CLASSIFICATION (1)
CONTEXT (1)
CONTINUOUS SPEECH (1)
CTC NETWORK (1)
CTC PHONEME OUTPUT STRING (1)
DATA MINING (1)
DATA MODELS (1)
DATABASES (1)
DYNAMIC BAYESIAN NETWORK (1)
DYNAMIC BAYESIAN NETWORKS (1)
ERROR ANALYSIS (1)
FEATURE EXTRACTION (1)
HANDWRITING RECOGNITION (1)
HIDDEN MARKOV MODEL (1)
HISTOGRAMS (1)
HISTORICAL DOCUMENT IMAGES (1)
HUMAINE SENSITIVE ARTIFICIAL LISTENER DATABASE (1)
HYBRID CTC-DBN DECODER (1)
IMAGE SEGMENTATION (1)
INFORMATION RETRIEVAL (1)
KEYWORD DETECTION (1)
LANGUAGE IDENTIFICATION (1)
LATTICES (1)
LOGIC GATES (1)
LOW RESOURCED LANGUAGES (1)
MULTILINGUAL (1)
MULTILINGUAL DATA SELECTION (1)
NIST (1)
OPTICAL CHARACTER RECOGNITION (1)
RECURRENT NEURAL NETS (1)
ROBUST DISCRIMINATIVE KEYWORD SPOTTING (1)
ROBUSTNESS (1)
SEMANTICS (1)
SIGNAL CLASSIFICATION (1)
SPEAKER RECOGNITION (1)
SPEECH DECODING (1)
SPEECH PROCESSING (1)
SPOKEN TERM DETECTION (1)
SYSTEM ARCHITECTURE (1)
TRAINING DATA (1)
VISUAL WORD EMBEDDINGS (1)
VISUALIZATION (1)
more

INFONA - science communication portal

Search results

Add recipient

Sending message cancelled

Are you sure you want to cancel sending this message?

Send message

Filter options

Publication date

Date range setting

Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.

Publication type

Keywords

Reporting an error / abuse

Sending the report failed

Accessibility options