Search results

Items from 101 to 120 out of 1,337 results

chapter

Similarity induced group sparsity for non-negative matrix factorisation

Antti Hurmalainen, Rahim Saeidi, Tuomas Virtanen

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4425 - 4429

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Non-negative matrix factorisations are used in several branches of signal processing and data analysis for separation and classification. Sparsity constraints are commonly set on the model to promote discovery of a small number of dominant patterns. In group sparse models, atoms considered to belong to a consistent group are permitted to activate together, while activations across groups are suppressed,...

chapter

A comparative study of spectral clustering for i-vector-based speaker clustering under noisy conditions

Naohiro Tawara, Tetsuji Ogawa, Tetsunori Kobayashi

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2041 - 2045

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The present paper dealt with speaker clustering for speech corrupted by noise. In general, the performance of speaker clustering significantly depends on how well the similarities between speech utterances can be measured. The recently proposed i-vector-based cosine similarity has yielded the state-of-the-art performance in speaker clustering systems. However, this similarity often fails to capture...

chapter

Combining sparse NMF with deep neural network: A new classification-based approach for speech enhancement

Hung-Wei Tseng, Mingyi Hong, Zhi-Quan Luo

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2145 - 2149

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In this work, we consider enhancing a target speech from a single-channel noisy observation corrupted by non-stationary noises at low signal-to-noise ratios (SNRs). We take a classification-based approach, where the objective is to estimate an Ideal Binary Mask (IBM) that classifies each time-frequency (T-F) unit of the noisy observation into one of the two categories: speech-dominant unit or noise-dominant...

chapter

Multi-channel speaker localization and separation using a model-based GSC and an inertial measurement unit

Mehdi Zohourian, Alan Archer-Boyd, Rainer Martin

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5615 - 5619

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In this paper we propose a novel multi-channel algorithm to separate simultaneous speakers in an environment where the microphone array is subject to movement. When the microphones are mounted to a person's head, for instance, the movements can lead to ambiguities with respect to the sources and to distortions in the processed signal. The proposed system estimates the direction-of-arrival of the speaker's...

chapter

Query-by-example keyword spotting using long short-term memory networks

Guoguo Chen, Carolina Parada, Tara N. Sainath

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5236 - 5240

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

We present a novel approach to query-by-example keyword spotting (KWS) using a long short-term memory (LSTM) recurrent neural network-based feature extractor. In our approach, we represent each keyword using a fixed-length feature vector obtained by running the keyword audio through a word-based LSTM acoustic model. We use the activations prior to the softmax layer of the LSTM as our keyword-vector...

chapter

Frequency-domain Comfort Noise Generation for Discontinuous Transmission in EVS

Anthony Lombard, Stephan Wilde, Emmanuel Ravelli, Stefan Dohla, et al.

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5893 - 5897

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Discontinuous Transmission (DTX) is an efficient way to drastically reduce the transmission rate of a communication codec in the absence of voice input. In this mode, most frames that are determined to consist of background noise only are dropped from transmission and replaced by some Comfort Noise Generation (CNG) in the decoder. In this paper, we propose a novel CNG approach combining information...

chapter

A hearing model to estimate mandarin speech intelligibility for the hearing impaired patients

Pei-Chun Tsai, Shih-Ting Lin, Wen-Chung Lee, Chung-Chien Hsu, et al.

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5848 - 5852

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

A hearing model, which is parameterized by hearing thresholds, degrees of loudness recruitment and reductions of frequency resolution of a hearing-impaired (HI) patient, is proposed in this paper. The model is developed in the filter-bank framework and is flexible for fitting hearing-loss conditions of HI patients. Psychoacoustic experiments were conducted under clean and noisy conditions to validate...

chapter

Perceptual effect of reverberation on multi-microphone noise reduction for cochlear implants

Adam A. Hersbach, David B. Grayden, James B. Fallon, Hugh J. McDermott

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5853 - 5857

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The combination of noise and reverberation makes listening conditions difficult for cochlear implant (CI) users. The perceptual effect of reverberation was evaluated via speech intelligibility tests with CI users. A fixed directional microphone, an adaptive directional microphone and a beamformer post-filter were evaluated. Reverberation was varied by changing the target and noise distance and by simulating...

chapter

Speech dereverberation using a learned speech model

Dawen Liang, Matthew D. Hoffman, Gautham J. Mysore

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 1871 - 1875

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

We present a general single-channel speech dereverberation method based on an explicit generative model of reverberant and noisy speech. To regularize the model, we use a pre-learned speech model of clean and dry speech as a prior and perform posterior inference over the latent clean speech. The reverberation kernel and additive noise are estimated under the maximum-likelihood framework. Our model...

chapter

Noise robust integration for blind and non-blind reverberation time estimation

Christian Schuldt, Peter Handel

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 56 - 60

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The estimation of the decay rate of a signal section is an integral component of both blind and non-blind reverberation time estimation methods. Several decay rate estimators have previously been proposed, based on, e.g., linear regression and maximum-likelihood estimation. Unfortunately, most approaches are sensitive to background noise, and/or are fairly demanding in terms of computational complexity...

chapter

A novel static parameter calculation method for model compensation

Suliang Bu, Yunxin Zhao, Yanmin Qian, Kai Yu

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4510 - 4514

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The Vector Taylor Series (VTS) based model compensation approach has been successfully applied to various robust speech recognition tasks. In this paper, we propose a novel method of variable transformation to calculate the static statistics. In addition, we provide a detailed explanation of VTS and random variable transformations adopted in some recent papers. Experiments on Aurora 4 showed that the...

chapter

Dereverberation sweet spot dilation with combined channel equalization and beamforming

M. R. P. Thomas, H. Gamper, I. J. Tashev

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 748 - 752

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Beamforming and channel equalizers can be formulated as optimal multichannel filter-and-sum operations with different objective criteria. It has been shown in previous studies that the combination of both concepts under a common framework can yield results that combine both the spatial robustness of beamforming and the dereverberation performance of channel equalization. This paper introduces an additional...

chapter

Micbots: Collecting large realistic datasets for speech and audio research using mobile robots

Jonathan Le Roux, Emmanuel Vincent, John R. Hershey, Daniel P.W. Ellis

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5635 - 5639

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Speech and audio signal processing research is a tale of data collection efforts and evaluation campaigns. Large benchmark datasets for automatic speech recognition (ASR) have been instrumental in the advancement of speech recognition technologies. However, when it comes to robust ASR, source separation, and localization, especially using microphone arrays, the perfect dataset is out of reach, and...

chapter

Weighted training for speech under Lombard Effect for speaker recognition

Muhammad Muneeb Saleem, Gang Liu, John H.L. Hansen

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4350 - 4354

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The presence of Lombard Effect in speech is proven to have severe effects on the performance of speech systems, especially speaker recognition. Varying kinds of Lombard speech are produced by speakers under influence of varying noise types [1]. This study proposes a high-accuracy classifier using deep neural networks for detecting various kinds of Lombard speech against neutral speech, independent...

chapter

Continuous visual speech recognition for audio speech enhancement

Eric Benhaim, Hichem Sahbi, Guillaume Vittey

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2244 - 2248

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

We introduce in this paper a novel non-blind speech enhancement procedure based on visual speech recognition (VSR). The latter is based on a generative process that analyzes sequences of talking faces and classifies them into visual speech units known as visemes. We use an effective graphical model able to segment and label a given sequence of talking faces into a sequence of visemes. Our model captures...

chapter

Detection and suppression of keyboard transient noise in audio streams with auxiliary keybed microphone

Simon Godsill, Herbert Buchner, Jan Skoglund

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 379 - 383

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In this paper a problem in transient noise suppression for audio streams in laptop and netbook devices is addressed. One or more microphones record voice signals which are corrupted with ambient noise and also transient noise from keyboard and mouse clicks. In the current work, a synchronous reference microphone is embedded in the keyboard which allows for measurement of the key click noise, substantially...

chapter

Unsupervised feature learning for urban sound classification

Justin Salamon, Juan Pablo Bello

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 171 - 175

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Recent studies have demonstrated the potential of unsupervised feature learning for sound classification. In this paper we further explore the application of the spherical k-means algorithm for feature learning from audio signals, here in the domain of urban sound classification. Spherical k-means is a relatively simple technique that has recently been shown to be competitive with other more complex...

chapter

Nested generalized sidelobe canceller for joint dereverberation and noise reduction

Ofer Schwartz, Sharon Gannot, Emanuel A. P. Habets

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 106 - 110

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

A speech signal is often contaminated by both room reverberation and ambient noise. In this contribution, we propose a nested generalized sidelobe canceller (GSC) beamforming structure, comprising inner and outer GSC beamformers (BFs) that decouple the speech dereverberation and the noise reduction operations. The BFs are implemented in the short-time Fourier transform (STFT) domain. Two alternative...

chapter

Direct-to-Reverberant Ratio estimation using a null-steered beamformer

James Eaton, Alastair H. Moore, Patrick A. Naylor, Jan Skoglund

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 46 - 50

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Reverberation affects the quality and intelligibility of distant speech recorded in a room. Direct-to-Reverberant Ratio (DRR) is a useful measure for assessing the acoustic configuration and can be used to inform dereverberation algorithms. We describe a novel DRR estimation algorithm applicable where the signal was recorded with two or more microphones, such as mobile communications devices and laptops...

chapter

Deep NMF for speech separation

Jonathan Le Roux, John R. Hershey, Felix Weninger

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 66 - 70

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Non-negative matrix factorization (NMF) has been widely used for challenging single-channel audio source separation tasks. However, inference in NMF-based models relies on iterative inference methods, typically formulated as multiplicative updates. We propose “deep NMF”, a novel non-negative deep network architecture which results from unfolding the NMF iterations and untying its parameters. This...

Keywords: NOISE, SPEECH
