Search results

Items from 141 to 160 out of 378 results

chapter

Efficient integration of translation and speech models in dictation based machine aided human translation

Luis Rodriguez, Aarthi Reddy, Richard Rose

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4949 - 4952

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper is concerned with combining models for decoding an optimum translation for a dictation based machine aided human translation (MAHT) task. Statistical language model (SLM) probabilities in automatic speech recognition (ASR) are updated using statistical machine translation (SMT) model probabilities. The effect of this procedure is evaluated for utterances from human translators dictating...

chapter

Improving language models for ASR using translated in-domain data

Stefan Kombrink, Tomas Mikolov, Martin Karafiat, Lukas Burget

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4405 - 4408

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

Acquisition of in-domain training data to build speech recognition systems for under-resourced languages can be a costly, time-demanding and tedious process. In this work, we propose the use of machine translation to translate English transcripts of telephone speech into Czech language in order to improve a Czech CTS speech recognition system. The translated transcripts are used as additional language...

chapter

Improving acoustic based keyword spotting using LVCSR lattices

Petr Motlicek, Fabio Valente, Igor Szoke

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4413 - 4416

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper investigates detection of English keywords in a conversational scenario using a combination of acoustic and LVCSR based keyword spotting systems. Acoustic KWS systems search predefined words in parameterized spoken data. Corresponding confidences are represented by likelihood ratios given the keyword models and a background model. First, due to the especially high number of false-alarms,...

chapter

Combining missing-data reconstruction and uncertainty decoding for robust speech recognition

Jose A. Gonzalez, Antonio M. Peinado, Angel M. Gomez, Ning Ma, more

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4693 - 4696

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper proposes a novel approach for noise-robust speech recognition which combines a missing-data (MD) derived spectral reconstruction technique and uncertainty decoding based on the weighted Viterbi algorithm (WVA). First, the noisy feature vectors are compensated by using a novel MD imputation technique based on the integration of truncated Gaussian pdfs. Although the proposed MD estimator...

chapter

A layered approach for dutch large vocabulary continuous speech recognition

Joris Pelemans, Kris Demuynck, Patrick Wambacq

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4421 - 4424

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper we investigate whether a layered architecture that has already proven its value for small tasks, works for a system with large lexica (400k words) and language models (5-grams) as well. The architecture was designed to decouple phone and word recognition which allows for the integration of more complex linguistic components, especially at the sub-word level. It was tested on the Dutch...

chapter

Two-dimensional frame-and-feature weighted Viterbi decoding for robust speech recognition

Yang Chang, Lin-shan Lee

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4689 - 4692

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper we propose a new approach of two-dimensional frame-and-feature weighted Viterbi decoding performed at the recognizer back-end for robust speech recognition. A new SVM-based frame weighting approach is proposed considering the energy distribution and harmonicity of the frame. The feature weighting is based on a previously proposed approach using an entropy measure considering confusion...

chapter

Silence is golden: Modeling non-speech events in WFST-based dynamic network decoders

David Rybach, Ralf Schluter, Hermann Ney

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4205 - 4208

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

Models for silence are a fundamental part of continuous speech recognition systems. Depending on application requirements, audio data segmentation, and availability of detailed training data annotations, it may be necessary or beneficial to differentiate between other non-speech events, for example breath and background noise. The integration of multiple non-speech models in a WFST-based dynamic network...

chapter

Extended search space pruning in LVCSR

David Nolden, Ralf Schluter, Hermann Ney

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4429 - 4432

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

We compare the most important pruning methods which are common in different LVCSR decoding architectures and lead them back to a theoretical motivation. Based on this motivation, we propose a new pruning method which fades the word end pruning over a large part of the search network. We analyze the methods regarding their relationship between search-space and word error rate, and regarding their mutual...

chapter

System combination for out-of-vocabulary word detection

Long Qin, Ming Sun, Alexander Rudnicky

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4817 - 4820

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper presents a method to improve the out-of-vocabulary (OOV) word detection performance by combining multiple speech recognition systems' outputs. Three different fragment-word hybrid systems, the phone, subword, and graphone systems, were built for detecting OOV words. Then outputs from each individual system were combined using ROVER. Two combination metrics were explored in ROVER, voting...

chapter

Transition probabilities are more important than we once thought

Guoli Ye, Dongpeng Chen, Brian Mak

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4809 - 4812

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

It is generally believed that the transition probabilities in a hidden Markov model (HMM) have a limited role in the speech decoding process. In this paper, through a series of recognition experiments on Wall Street Journal (WSJ) read speech and SVitchboard (SVB) conversational telephone speech, we find that the HMM transition probabilities may be more important than we once thought. The experiments...

chapter

Complexity-aware adaptive jitter buffer with time-scaling

Liyun Pang, Anisse Taleb, Jianfeng Xu, Laszlo Boszormenyi

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4473 - 4476

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

In VoIP applications, packet loss, delay and delay jitter are inevitable and have a large impact on the perceived speech quality. Jitter buffers are commonly deployed to compensate for jitter in order to play out the received packets continuously. For mobile devices, due to limited battery power, computational complexity has to be kept to a minimum. In this paper, we propose a jitter buffer management...

chapter

Optimizing the parameters of decoding graphs using new log-based MCE

Abdelaziz A. Abdelhamid, Waleed H. Abdulla

Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference > 1 - 5

2012 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

This paper proposes a new class loss function as an alternative to the standard sigmoid class loss function for optimizing the parameters of decoding graphs using discriminative training based on minimum classification error (MCE) criterion. The standard sigmoid based approach tends to ignore a significant number of training samples that have a large difference between the scores of the reference...

chapter

Implementation of G.729A on Embedded SIMD Processor

Xiaoqiong Tan, Ruimin Hu, Weiping Tu, Haojun Ai

2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming > 181 - 184

2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming (PAAP)

This paper addresses a real-time implementation of multi-channel, high quality G.729A speech codec based on an embedded SIMD processor, which is used in a SIP Video Phone. A series of strategies are designed for the special characteristics of the processor and the G.729A, including the memory management and SIMD decomposing. The profile shows that the dramatic improvement is achieved. Less than 20%...

chapter

Informed source separation: Underdetermined source signal recovery from an instantaneous stereo mixture

Stanislaw Gorlow, Sylvain Marchand

2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 309 - 312

2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)

The present paper exposes a new technique that aims at solving an ill-posed source separation problem encountered in stereo mixtures. The proposed method is realized in an encoder-decoder framework: On the encoder side, a set of spectral envelopes is extracted from the original tracks, which are known. These envelopes are passed on to the decoder in attachment to the stereo mixture, whereas the frequency...

chapter

Wideband speech coding using ADPCM and a new enhanced bandwidth extension method

Mohammad Ghaderi, Mohammad H. Savoji

2011 IEEE 7th International Symposium on Intelligent Signal Processing > 1 - 4

2011 IEEE 7th International Symposium on Intelligent Signal Processing - (WISP 2011)

An enhanced bandwidth extension scheme is introduced in this paper for wideband speech coding using ADPCM. The coded lower band signal plus small side information (some parameters) are transmitted instead of the whole band. In the decoder both frequency parts are reconstructed from the coded signal and the received parameters. In the proposed method, the high frequency part is derived from the excitation...

chapter

Performance optimization of the CVSD FH system based on DEP

Xiaogang Huang

2011 International Conference on Electronics, Communications and Control (ICECC) > 949 - 952

2011 International Conference on Electronics, Communications and Control (ICECC)

To solve the problem of the low speech signal quality on the FH channels with wideband rejective interference, the performance optimization of CVSD coding in the digital FH system was studied. A new CVSD demodulation arithmetic was designed based on the principle of traditional CVSD system. Then the rule for measuring the quality of speech signal was offered and presented the simulation method of...

chapter

Decoding semantic information from human electrocorticographic (ECoG) signals

Wei Wang, Alan D Degenhart, Gustavo P Sudre, Dean A Pomerleau, more

2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society > 6294 - 6298

2011 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society

This study examined the feasibility of decoding semantic information from human cortical activity. Four human subjects undergoing presurgical brain mapping and seizure foci localization participated in this study. Electrocorticographic (ECoG) signals were recorded while the subjects performed simple language tasks involving semantic information processing, such as a picture naming task where subjects...

chapter

Brain-machine interfaces for real-time speech synthesis

Frank H. Guenther, Jonathan S. Brumberg

2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society > 5360 - 5363

2011 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society

This paper reports on studies involving brain-machine interfaces (BMIs) that provide near-instantaneous audio feedback from a speech synthesizer to the BMI user. In one study, neural signals recorded by an intracranial electrode implanted in a speech-related region of the left precentral gyrus of a human volunteer suffering from locked-in syndrome were transmitted wirelessly across the scalp and used...

chapter

FEC-based packet loss recovery for AVS-M audio codec

Jianli Liu, Shenghui Zhao, Jing Wang, Jingming Kuang

2011 International Conference on Multimedia Technology > 3069 - 3072

2011 International Conference on Multimedia Technology (ICMT)

In this paper, we utilize sender-based Forward Error Correction (FEC) techniques to enhance the robustness of packet loss recovery for AVS Mobile speech and audio (AVS-M) codec. Two FEC schemes are proposed which take the advantage of the codec's structure characteristics and do not introduce extra delay. The objective and subjective listening tests results show that the two methods achieve higher...

chapter

Out of vocabulary detection in Indonesian speech recognition using word and syllable level decoding

Aswin Juari, Ayu Purwarianti

Proceedings of the 2011 International Conference on Electrical Engineering and Informatics > 1 - 8

2011 International Conference on Electrical Engineering and Informatics (ICEEI)

One of the problems in speech recognition is out of vocabulary words (OOV) because they can make some words error. Out of vocabulary words are the words that cannot be recognized by speech recognizer because there is no recognizing database. Alignment, language model, and POS Tag method is proposed in order to recognize word error because of OOV words. Word and syllable level decoding from speech...

Keywords: DECODING, SPEECH

INFONA - science communication portal
