Search results

Items 1 to 9 of 9 results

chapter

Random forest regression based acoustic event detection with bottleneck features

Xianjun Xia, Roberto Togneri, Ferdous Sohel, David Huang

2017 IEEE International Conference on Multimedia and Expo (ICME) > 157 - 162

This paper deals with random forest regression-based acoustic event detection (AED) that combines acoustic features with bottleneck (BN) features. Bottleneck features have a reputation for being inherently discriminative in acoustic signal processing. To deal with unstructured and complex real-world acoustic events, an acoustic event detection system is constructed using bottleneck features...

chapter

A novel pitch extraction based on jointly trained deep BLSTM Recurrent Neural Networks with bottleneck features

Bin Liu, Jianhua Tao, Dawei Zhang, Yibin Zheng

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 336 - 340

Pitch is an important characteristic of speech and is useful for many applications. However, it is still challenging to estimate pitch in strong noise. In this paper, we propose a joint training approach to determine pitch. First, a Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) is trained to map noisy speech features to clean speech features. Second, the pitch estimation is also...

chapter

Pairwise learning using multi-lingual bottleneck features for low-resource query-by-example spoken term detection

Yougen Yuan, Cheung-Chi Leung, Lei Xie, Hongjie Chen, more

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5645 - 5649

We propose to use a feature representation obtained by pairwise learning in a low-resource language for query-by-example spoken term detection (QbE-STD). We assume that word pairs identified by humans are available in the low-resource target language. The word pairs are parameterized by a multi-lingual bottleneck feature (BNF) extractor that is trained using transcribed data in high-resource languages...

chapter

Alternative networks for monolingual bottleneck features

William Hartmann, Roger Hsiao, Stavros Tsakalidis

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5290 - 5294

While recent advances in deep neural networks have led to significant improvements in speech recognition, they have been applied mainly to acoustic and language modeling. We instead apply the models to bottleneck feature extraction. Several DNN, CNN, and BLSTM-based bottleneck feature networks are compared using both DNN and BLSTM acoustic models. Multiple variations in network architecture and feature...

chapter

Approaches for language identification in mismatched environments

Shahan Nercessian, Pedro Torres-Carrasquillo, Gabriel Martinez-Montes

2016 IEEE Spoken Language Technology Workshop (SLT) > 335 - 340

In this paper, we consider the task of language identification in the context of mismatch conditions. Specifically, we address the issue of using unlabeled data in the domain of interest to improve the performance of a state-of-the-art system. The evaluation is performed on a 9-language set that includes data in both conversational telephone speech and narrowband broadcast speech. Multiple experiments...

chapter

Deep bottleneck features and sound-dependent i-vectors for simultaneous recognition of speech and environmental sounds

Sakriani Sakti, Seiji Kawanishi, Graham Neubig, Koichiro Yoshino, more

2016 IEEE Spoken Language Technology Workshop (SLT) > 35 - 42

In speech interfaces, it is often necessary to understand the overall auditory environment, not only recognizing what is being said, but also being aware of the location or actions surrounding the utterance. However, automatic speech recognition (ASR) becomes difficult when recognizing speech with environmental sounds. Standard solutions treat environmental sounds as noise, and remove them to improve...

chapter

Bottleneck features from SNR-adaptive denoising deep classifier for speaker identification

Zhili Tan, Man-Wai Mak

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1035 - 1040

In this paper, we explore the potential of using deep learning to extract speaker-dependent features for noise-robust speaker identification. More specifically, an SNR-adaptive denoising classifier is constructed by stacking two layers of restricted Boltzmann machines (RBMs) on top of a denoising deep autoencoder, where the top-RBM layer is connected to a soft-max output layer that outputs the...

article

Deep Neural Network Approaches to Speaker and Language Recognition

Fred Richardson, Douglas Reynolds, Najim Dehak

IEEE Signal Processing Letters > 2015 > 22 > 10 > 1671 - 1675

The impressive gains in performance obtained using deep neural networks (DNNs) for automatic speech recognition (ASR) have motivated the application of DNNs to other speech technologies such as speaker recognition (SR) and language recognition (LR). Prior work has shown performance gains for separate SR and LR tasks using DNNs for direct classification or for feature extraction. In this work we present...

chapter

Learning feature mapping using deep neural network bottleneck features for distant large vocabulary speech recognition

Ivan Himawan, Petr Motlicek, David Imseng, Blaise Potard, more

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4540 - 4544

Automatic speech recognition from distant microphones is a difficult task because recordings are affected by reverberation and background noise. First, the application of deep neural network (DNN)/hidden Markov model (HMM) hybrid acoustic models to the distant speech recognition task using the AMI meeting corpus is investigated. This paper then proposes a feature transformation for removing reverberation...

Filter options

Keywords:
FEATURE EXTRACTION
BOTTLENECK FEATURES
