Search results

Items from 1 to 6 out of 6 results

chapter

Does speech enhancement work with end-to-end ASR objectives?: Experimental analysis of multichannel end-to-end ASR

Tsubasa Ochiai, Shinji Watanabe, Shigeru Katagiri

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP) > 1 - 6

Recently we proposed a novel multichannel end-to-end speech recognition architecture that integrates the components of multichannel speech enhancement and speech recognition into a single neural-network-based architecture and demonstrated its fundamental utility for automatic speech recognition (ASR). However, the behavior of the proposed integrated system remains insufficiently clarified. An open...

chapter

Deep bottleneck features and sound-dependent i-vectors for simultaneous recognition of speech and environmental sounds

Sakriani Sakti, Seiji Kawanishi, Graham Neubig, Koichiro Yoshino, et al.

2016 IEEE Spoken Language Technology Workshop (SLT) > 35 - 42

In speech interfaces, it is often necessary to understand the overall auditory environment, not only recognizing what is being said, but also being aware of the location or actions surrounding the utterance. However, automatic speech recognition (ASR) becomes difficult when recognizing speech with environmental sounds. Standard solutions treat environmental sounds as noise, and remove them to improve...

chapter

Mapping Mel sub-band energies using Deep belief network for robust speech recognition

Mojtaba Gholamipour, Babak Nasersharif

2016 8th International Symposium on Telecommunications (IST) > 510 - 514

Sub-band speech processing is well known in robust speech recognition. In recent years, deep neural networks (DNNs) have also been widely used in speech recognition for acoustic modeling as well as feature extraction and transformation. In this paper, we propose to use a deep belief network (DBN) as a post-processing method for de-noising at the Mel sub-band level, where we enhance logarithm...

chapter

Noise-robust detection of whispering in telephone calls using deep neural networks

Aleksandr Diment, Mikko Parviainen, Tuomas Virtanen, Roman Zelov, et al.

2016 24th European Signal Processing Conference (EUSIPCO) > 2310 - 2314

Detection of whispered speech in the presence of high levels of background noise has applications in fraudulent behaviour recognition. For instance, it can serve as an indicator of possible insider trading. We propose a deep neural network (DNN)-based whispering detection system, which operates on both magnitude and phase features, including the group delay feature from all-pole models (APGD). We...

chapter

Weighted Combination of Naive Bayes and LVQ Classifier for Fongbe Phoneme Classification

Frejus A.A. Laleye, Eugene C. Ezin, Cina Motamed

2014 Tenth International Conference on Signal-Image Technology & Internet-Based Systems (SITIS) > 7 - 13

In speech recognition, phoneme classification has recently gained increased attention. The combination of classifiers has emerged as a reliable method and is used for decision-making by combining individual opinions to produce a final decision. In this study, we propose a novel classifier based on the combination of Naive Bayes and Learning Vector Quantization (LVQ) using weighted voting to recognize...

chapter

GIF-SP: GA-based informative feature for noisy speech recognition

Satoshi Tamura, Yoji Tagami, Satoru Hayamizu

Proceedings of the 2012 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) > 1 - 4

This paper proposes a novel discriminative feature extraction method. The method consists of two stages; in the first stage, a classifier is built for each class, which categorizes an input vector into a certain class or not. From all the parameters of the classifiers, a first transformation can be formed. In the second stage, another transformation that generates a feature vector is subsequently...

Filter options

Keywords: FEATURE EXTRACTION, SPEECH RECOGNITION, NOISE MEASUREMENT, TRAINING

INFONA - science communication portal
