In this work, a robust recursive procedure for identification of a non-stationary AR model of the speech production system is proposed, based on the weighted recursive least-squares (WRLS) algorithm with a variable forgetting factor (VFF) and a quadratic classifier with a sliding training data set. Experimental analysis is carried out on speech signals containing voiced and mixed-excitation segments. The presented experimental results justify that two...
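The abstract is truncated, but the core mechanism, recursive least squares whose forgetting factor shrinks when the prediction error grows so the filter adapts faster to non-stationarity, can be sketched as follows. This is a minimal illustration, not the paper's exact WRLS/VFF formulation; the error-driven forgetting rule and all parameter values here are assumptions.

```python
import numpy as np

def vff_rls_ar(signal, order=4, lam_min=0.90, lam_max=0.999, rho=0.99):
    """Track time-varying AR coefficients with RLS; the forgetting
    factor drops when the smoothed prediction-error power rises."""
    n = len(signal)
    w = np.zeros(order)               # current AR coefficient estimate
    P = 1e3 * np.eye(order)           # inverse correlation matrix
    err_pow = 0.0                     # smoothed squared prediction error
    coeffs = np.zeros((n, order))
    for t in range(order, n):
        x = signal[t - order:t][::-1]     # regressor [s[t-1], ..., s[t-order]]
        e = signal[t] - w @ x             # a priori prediction error
        err_pow = rho * err_pow + (1 - rho) * e * e
        # larger error -> smaller forgetting factor -> faster adaptation
        lam = np.clip(lam_max - err_pow, lam_min, lam_max)
        k = P @ x / (lam + x @ P @ x)     # gain vector
        w = w + k * e
        P = (P - np.outer(k, x @ P)) / lam
        coeffs[t] = w
    return coeffs
```

On a stationary AR(2) test signal the trajectory settles near the true coefficients; on non-stationary speech the VFF lets the estimate track segment changes.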
In this paper several preprocessing techniques used to improve speech recognition performance are compared over both PSN and GSM networks. Recognition experiments are conducted on a digit database in a speaker-independent isolated-word mode in order to evaluate the performances under within- and cross-network (PSN and GSM) conditions. Two classes of preprocessing techniques are distinguished depending...
In this paper, a new approach to robust speech recognition using Fuzzy Matrix Quantisation, Hidden Markov Models, and Neural Networks is presented and tested on speech corrupted by car noise. Two new robust isolated-word speech recognition (IWSR) systems, called FMQ/HMM and FMQ/MLP, are thus proposed and designed for optimal operation under a variety of input SNR conditions. The schemes and associated...
In recent years, extensive research has been conducted on hiding data in digital audio signals, exploiting the psychoacoustic masking phenomenon of the human auditory system (HAS). This paper presents a novel audio steganography method that integrates optimal steganographic embedding with two-level cryptography. Improved imperceptibility of the hidden data and an increased security level...
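The general pipeline, encrypt first, then embed below the perceptual floor, can be illustrated with a toy example. The XOR cipher and plain least-significant-bit embedding below stand in for the paper's (unspecified) optimal steganography and two-level cryptography; they are placeholders, not the proposed method.

```python
import numpy as np

def embed(samples, message, key=0x5A):
    """Toy pipeline: XOR-'encrypt' the message, then hide its bits in
    the least significant bits of 16-bit PCM samples."""
    cipher = bytes(b ^ key for b in message)
    bits = np.unpackbits(np.frombuffer(cipher, dtype=np.uint8))
    out = samples.copy()
    out[:bits.size] = (out[:bits.size] & ~1) | bits   # overwrite LSBs
    return out

def extract(samples, n_bytes, key=0x5A):
    """Read the LSBs back and undo the XOR stage."""
    bits = (samples[:8 * n_bytes] & 1).astype(np.uint8)
    return bytes(b ^ key for b in np.packbits(bits))
```

Each sample changes by at most one quantization step, which is what makes LSB embedding perceptually transparent at 16-bit resolution.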
In this paper, a method of adaptive noise suppression combining spatially robust fixed beamforming and the TRINICON blind source separation algorithm is presented. A multichannel sensor array is first processed using complementary fixed beamformers into maximum and minimum SINR channels. The channels form the inputs to a single 2×2 second-order statistics TRINICON-BSS system which adaptively compensates...
Speaker identification (SID) in cochannel speech, where two speakers are talking simultaneously over a single recording channel, is a challenging problem. Previous studies address this problem in the anechoic environment under the Gaussian mixture model (GMM) framework. On the other hand, cochannel SID in reverberant conditions has not been addressed. This paper studies cochannel SID in both anechoic...
In this paper, we consider the robust covariance estimation problem in the non-Gaussian set-up. In particular, Tyler's M-estimator is adopted for samples drawn from a heavy-tailed elliptical distribution. For some applications, the covariance matrix naturally possesses certain structure. Therefore, incorporating the prior structure information in the estimation procedure is beneficial to improving...
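Tyler's M-estimator itself has a simple fixed-point form; a plain, structure-free version for zero-mean samples is sketched below (the structured variants the paper studies add projection or shrinkage steps not shown here).

```python
import numpy as np

def tyler_m_estimator(X, n_iter=100, tol=1e-8):
    """Tyler fixed-point iteration for zero-mean samples X of shape
    (n, p); the scatter matrix is trace-normalized to p."""
    n, p = X.shape
    S = np.eye(p)
    for _ in range(n_iter):
        Sinv = np.linalg.inv(S)
        # weights 1 / (x_i^T S^{-1} x_i) down-weight heavy-tailed samples
        w = 1.0 / np.einsum('ij,jk,ik->i', X, Sinv, X)
        S_new = (p / n) * (X * w[:, None]).T @ X
        S_new *= p / np.trace(S_new)          # fix the scale ambiguity
        if np.linalg.norm(S_new - S, 'fro') < tol:
            return S_new
        S = S_new
    return S
```

Because the weights cancel the radial part of an elliptical distribution, the iteration recovers the covariance *shape* regardless of how heavy the tails are; only the overall scale is fixed by the trace normalization.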
Research on detecting depression from speech has advanced in recent years, but most work has focused on the analysis of one corpus at a time. Given that clinical corpora are typically small, it is important to explore approaches that generalize across corpora and that could ultimately be adapted to new data. We study a new corpus of patient-clinician interactions recorded when patients are admitted...
This paper addresses the problem of localising multiple competing speakers in the presence of room reverberation, where sound sources can be positioned at any azimuth on the horizontal plane. To reduce the number of front-back confusions which can occur due to the similarity of interaural time differences (ITDs) and interaural level differences (ILDs) in the front and rear hemifields, a machine hearing...
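ITD, one of the two binaural cues named above, can be estimated from the peak of a cross-correlation between the two ear signals. The sketch below uses plain cross-correlation restricted to physically plausible lags; it is a minimal illustration of the cue, not the paper's trained machine-hearing system.

```python
import numpy as np

def estimate_itd(left, right, fs, max_itd=0.0009):
    """Return the delay of `right` relative to `left` in seconds,
    searched over |lag| <= max_itd (~0.9 ms for a human head)."""
    n = len(left)
    corr = np.correlate(left, right, mode='full')
    lags = np.arange(-n + 1, n)              # lag axis for 'full' mode
    mask = np.abs(lags) <= int(max_itd * fs)
    peak_lag = lags[mask][np.argmax(corr[mask])]
    return -peak_lag / fs
```

Restricting the search range both speeds up the peak pick and discards spurious correlation maxima caused by reverberant reflections at implausible lags.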
Traditional sound event recognition methods based on informative front end features such as MFCC, with back end sequencing methods such as HMM, tend to perform poorly in the presence of interfering acoustic noise. Since noise corruption may be unavoidable in practical situations, it is important to develop more robust features and classifiers. Recent advances in this field use powerful machine learning...
In this paper we investigate the use of noise-robust features characterizing the speech excitation signal as complementary features to the usually considered vocal tract based features for Automatic Speech Recognition (ASR). The proposed Excitation-based Features (EBF) are tested in a state-of-the-art Deep Neural Network (DNN) based hybrid acoustic model for speech recognition. The suggested excitation...
In this paper, a broadband region-based near-field beamforming algorithm is proposed and demonstrated for acoustic applications. We use an eigenfilter structure with a minimum-energy cost function based on desired and undesired near-field regions. Robustness is thus achieved by focusing on signals generated from desired zones in space while rejecting signals from undesired zones. This construction...
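The eigenfilter idea, choosing weights that maximize energy captured from a desired near-field region relative to an undesired region, reduces to a generalized eigenvalue problem. Below is a minimal narrowband sketch; the microphone geometry, region sampling, and diagonal loading are illustrative assumptions, and the paper's broadband cost function is richer than this single-frequency version.

```python
import numpy as np

def steering(mics, src, freq, c=343.0):
    """Near-field (spherical-wave) steering vector for a point source."""
    d = np.linalg.norm(mics - src, axis=1)
    return np.exp(-2j * np.pi * freq * d / c) / d

def region_cov(mics, points, freq):
    """Average rank-one covariance over sample points of a region."""
    R = np.zeros((len(mics), len(mics)), dtype=complex)
    for p in points:
        a = steering(mics, p, freq)
        R += np.outer(a, a.conj())
    return R / len(points)

def region_eigenfilter(mics, desired, undesired, freq, load=1e-6):
    """Maximize w^H Rd w / w^H Ru w via a whitened eigendecomposition."""
    Rd = region_cov(mics, desired, freq)
    Ru = region_cov(mics, undesired, freq) + load * np.eye(len(mics))
    L = np.linalg.cholesky(Ru)                  # Ru = L L^H
    Linv = np.linalg.inv(L)
    vals, vecs = np.linalg.eigh(Linv @ Rd @ Linv.conj().T)
    w = Linv.conj().T @ vecs[:, -1]             # principal gen. eigenvector
    return w / np.linalg.norm(w)
```

The resulting weights pass energy from the desired zone while placing nulls toward the undesired zone, which is exactly the robustness mechanism the abstract describes.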
Speech applications in noisy and degraded channel conditions continue to be a challenging problem especially when there is a mismatch between the training and test conditions. In this paper, a robust speech feature extraction scheme is developed based on autoregressive moving average (ARMA) modeling that emphasizes high energy regions of the signal with a data driven modulation filter. The peak preserving...
Measures of sparsity are useful in many aspects of audio signal processing including speech enhancement, audio coding and singing voice enhancement, and a well-known method for these applications is non-negative matrix factorization (NMF), which decomposes a non-negative data matrix into two non-negative matrices. Although previous studies on NMF have focused on the sparsity of the two matrices,...
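The decomposition referred to above can be computed with the classic Lee-Seung multiplicative updates; the Euclidean-cost version is shown here as a baseline (the paper's sparsity analysis builds on top of such a factorization, and is not reproduced).

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9):
    """Euclidean-distance NMF via multiplicative updates:
    V (F x T, non-negative) ~= W (F x rank) @ H (rank x T)."""
    rng = np.random.default_rng(0)
    F, T = V.shape
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    for _ in range(n_iter):
        # element-wise updates keep W and H non-negative by construction
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

For a magnitude spectrogram V, the columns of W act as spectral templates and the rows of H as their activations over time, which is where sparsity measures are typically applied.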
While many recently proposed audio declipping algorithms are highly effective in their ability to restore clipped speech, the algorithms' computational complexities inhibit their use in many practical situations. Real-time or nearly real-time performance is impossible using a typical laptop computer, with some algorithms taking as long as 400 times the actual duration of the input to complete restoration...
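As a point of reference for what the expensive algorithms improve upon, a trivially cheap baseline simply detects saturated runs and linearly interpolates across them. This is a naive sketch for context, not one of the surveyed declipping algorithms.

```python
import numpy as np

def declip_linear(x, clip_level=0.99):
    """Naive declipping baseline: mark samples at or above the clip
    level and replace them by linear interpolation from the rest."""
    y = np.asarray(x, dtype=float).copy()
    clipped = np.abs(y) >= clip_level
    idx = np.arange(len(y))
    if clipped.any() and not clipped.all():
        y[clipped] = np.interp(idx[clipped], idx[~clipped], y[~clipped])
    return y
```

This runs in linear time but cannot recover the true peak shape; the gap between such a baseline and sparsity-based reconstruction is what motivates the more complex algorithms, at the computational cost the abstract describes.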
This paper presents the latest improvements on our Spectro system that detects transformed duplicate audio content. We propose a new binary image feature derived from a spectrogram matrix by using a threshold based on the average of the spectral values. We quantize this binary image by applying a tile of fixed size and computing the sum of each small square in the tile. Fingerprints of each binary...
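The binarize-then-tile step described above can be reproduced in a few lines; this is an illustrative reimplementation of the feature, not the Spectro code itself.

```python
import numpy as np

def spectro_fingerprint(spec, tile=8):
    """Binarize a spectrogram at its mean value, then sum the 1-bits
    inside each tile x tile square to form a compact feature map."""
    binary = (spec > spec.mean()).astype(np.uint8)
    F, T = binary.shape
    Fc, Tc = F - F % tile, T - T % tile       # crop to whole tiles
    blocks = binary[:Fc, :Tc].reshape(Fc // tile, tile, Tc // tile, tile)
    return blocks.sum(axis=(1, 3))
```

Thresholding at the average makes the feature invariant to overall gain changes, one of the transformations a duplicate detector must tolerate.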
Arousal is essential in understanding human behavior and decision-making. In this work, we present a multimodal arousal rating framework that incorporates a minimal set of vocal and non-verbal behavior descriptors. The rating framework and fusion techniques are unsupervised in nature, ensuring that the approach is readily applicable and interpretable. Our proposed multimodal framework improves correlation...
A reduced frequency range in vowel production is a well-documented speech characteristic of individuals with psychological and neurological disorders. Depression is known to influence motor control and, in particular, speech production. The assessment and documentation of reduced vowel space and the associated perceived hypoarticulation and reduced expressivity often rely on subjective assessments. Within...
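Vowel space is commonly quantified as the area of the polygon spanned by the corner vowels in the F1-F2 plane, which gives an objective counterpart to the subjective ratings mentioned above. A minimal shoelace-formula sketch follows; the formant values in the test are illustrative, not measurements from the study.

```python
import numpy as np

def vowel_space_area(formants):
    """Shoelace (polygon) area of corner-vowel points, given as a
    sequence of (F2, F1) pairs in Hz; result is in Hz^2."""
    pts = np.asarray(formants, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
```

A smaller area indicates vowel centralization, i.e. the reduced articulatory range associated with hypoarticulation.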
Most automatic speech recognition (ASR) systems incorporate a single source of information about their input, namely, features and transformations derived from the speech signal. However, in many applications, e.g., vehicle-based speech recognition, sensor data and environmental information are often available to complement audio information. In this paper, we show how these data can be used to improve...
We investigate sequence-discriminative training of long short-term memory recurrent neural networks using the maximum mutual information criterion. We show that although recurrent neural networks already make use of the whole observation sequence and are able to incorporate more contextual information than feed-forward networks, their performance can be improved with sequence-discriminative training...