In hands-free mobile communication, speech quality is often degraded by surrounding noise. This paper introduces an improved version of the Minimum Mean Square Error (MMSE) noise estimator. Noise spectrum estimation is a crucial element of speech recognition systems. Our proposed noise estimation method is based on a popular search algorithm used in software engineering called...
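The abstract is truncated before the estimator itself is described, but the general idea of recursive noise-spectrum tracking can be sketched as follows (a generic illustration, not the paper's MMSE method; the update rule, `alpha`, and `threshold` are illustrative assumptions):

```python
import numpy as np

def update_noise_psd(noise_psd, noisy_psd, alpha=0.9, threshold=2.0):
    """Recursively update a noise power-spectral-density estimate.

    Bins where the noisy power greatly exceeds the current noise estimate
    are treated as speech-dominated and the estimate is frozen there;
    elsewhere the estimate is smoothed toward the observed power.
    """
    speech_present = noisy_psd > threshold * noise_psd
    updated = alpha * noise_psd + (1.0 - alpha) * noisy_psd
    return np.where(speech_present, noise_psd, updated)
```

Called once per STFT frame, this tracks slowly varying noise while ignoring speech bursts; real MMSE estimators replace the hard threshold with a soft speech-presence probability.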
This work proposes a technique for predicting pitch from Mel-frequency cepstral coefficient (MFCC) vectors. Previous pitch prediction methods are based on statistical models such as Gaussian mixture models and hidden Markov models. In this paper, we propose a three-step method to estimate pitch from MFCC vectors. First, the Mel-filterbank energies (MFBEs) are estimated from the MFCC vectors. Secondly,...
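The first step, recovering Mel-filterbank energies from MFCC vectors, amounts to inverting the truncated DCT and the log compression. A minimal sketch (function names and the `n_mels` default are illustrative assumptions; the inversion is only approximate when higher-order cepstral coefficients have been discarded):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k, column m = s_k*cos(pi*(m+0.5)*k/n)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    D = np.cos(np.pi * (m + 0.5) * k / n)
    D[0] *= np.sqrt(1.0 / n)
    D[1:] *= np.sqrt(2.0 / n)
    return D

def mfcc_to_mfbe(mfcc, n_mels=26):
    """Approximately invert MFCCs back to Mel-filterbank energies:
    zero-pad the truncated cepstrum, apply the inverse (transposed)
    orthonormal DCT, and exponentiate to undo the log."""
    c = np.zeros(n_mels)
    c[:len(mfcc)] = mfcc
    log_mfbe = dct_matrix(n_mels).T @ c   # inverse of orthonormal DCT
    return np.exp(log_mfbe)
```

When all cepstral coefficients are kept, the round trip is exact; with the usual 12–13 kept coefficients the recovered MFBEs are a smoothed version of the originals.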
In this demonstration, we aim to present our recent implementation results and to provide an evaluation testbed through which users can experiment with and compare the outputs of the distributed speech enhancement algorithms in [1–3]. The system allows a user to assess the merits of these algorithms in any acoustic setup. The multi-channel Wiener filter (MWF) is a well-known noise reduction algorithm for...
The demonstration presents a real-time mockup of a smartphone-based hearing aid with combined noise and acoustic feedback reduction. The designed reduction algorithm is based on a spectral weighting approach, which makes it very robust to rapid changes in the feedback path caused either by displacement of the speaker/microphone or by room acoustics. The aim of the demonstration is to show the potential of the implemented...
We demonstrate the feasibility of a real-time implementation of advanced binaural noise reduction algorithms on a single-board computer, the Raspberry Pi. The implementation of the considered algorithms is realized in Simulink, a graphical programming add-on to the integrated development environment Matlab. Using a complementary support package for Simulink, the Raspberry Pi is connected/hosted...
Speaker localization using microphone arrays is typically based on the expected phase and amplitude differences between microphones as a function of the wave arrival direction. However, in rooms with significant reverberation, the direct sound is contaminated by reflections and localization often fails. Recently, a reverberation-robust localization method was proposed, which uses only the direct-path...
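The conventional phase/time-difference localization the abstract refers to can be illustrated, for a two-microphone far-field setup, by estimating the time difference of arrival (TDOA) from a cross-correlation peak (a textbook sketch, not the reverberation-robust direct-path method the paper proposes; all parameter names are illustrative):

```python
import numpy as np

def estimate_doa(x1, x2, fs, mic_dist, c=343.0):
    """Two-microphone far-field direction of arrival, in degrees from
    broadside, estimated from the cross-correlation peak."""
    corr = np.correlate(x1, x2, mode='full')
    # Peak lag is negative when x2 is a delayed copy of x1.
    lag = np.argmax(corr) - (len(x2) - 1)
    tdoa = -lag / fs                      # positive: wavefront hits mic 1 first
    sin_theta = np.clip(tdoa * c / mic_dist, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))
```

In a reverberant room, reflections add spurious correlation peaks, which is exactly the failure mode motivating the direct-path-dominance approach the paper builds on.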
The fundamental frequency is one of the prosodic parameters, and many algorithms have been developed for estimating the fundamental frequency of speech signals. Most of them provide good results on good quality speech signals, but their performance degrades when dealing with noisy signals. Moreover, although some provide a probability for the voicing decision, none of them indicate how reliable the...
Forensic Voice Comparison (FVC) is increasingly using the likelihood ratio (LR) in order to indicate whether the evidence supports the prosecution (same-speaker) or the defense (different-speakers) hypothesis. Nevertheless, the LR is subject to some practical limitations, due both to its estimation process itself and to a lack of knowledge about the reliability of this (practical) estimation process. It is...
Pitch is an important characteristic of speech and is useful for many applications. However, it is still challenging to estimate pitch in strong noise. In this paper, we propose a joint training approach to pitch estimation. First, a Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) is trained to map noisy speech features to clean ones. Second, the pitch estimation is also...
Reverberation and noise are known to be the two most important culprits for poor performance in far-field speech applications, such as automatic speech recognition. Recent research has suggested that reverberation-aware speech enhancement (or speech technologies, in general) could be used to improve performance. However, recent results also show existing blind room acoustics characterization algorithms...
Classic approaches to multi-channel signal enhancement rely on model assumptions regarding speech source relative transfer functions and noise covariance matrix, or on estimates thereof obtained in, e.g., speech pauses. To alleviate these constraints, we here investigate an approach to adaptive estimation of the speech (target) source and noise related acoustic parameters based on localized speech...
Here we propose online adaptive beamforming for automatic speech recognition (ASR) in meetings in noisy, reverberant environments. The proposed method is based on recently developed mask-based beamforming, in which accurate mask estimation and diarization are paramount. Real-world experiments have shown that mask-based beamforming enables accurate ASR in meetings under low noise and reverberation with...
The Wiener filter is a well-known signal processing method for improving a noisy signal's quality. The Wiener filter requires either knowledge of or estimates of the power spectra of the signal-of-interest and of the undesired noise, leading to implementation challenges. In this paper, we show how a recently-developed second-order signal quantity termed the panorama can be employed to compute the...
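The classical Wiener gain the paper builds on is H(f) = S(f) / (S(f) + N(f)), applied per frequency bin. A minimal sketch of that baseline (the panorama-based estimation the paper introduces is not reproduced here; `floor` is an illustrative numerical safeguard):

```python
import numpy as np

def wiener_gain(signal_psd, noise_psd, floor=1e-10):
    """Per-bin Wiener gain H = S / (S + N)."""
    return signal_psd / np.maximum(signal_psd + noise_psd, floor)

def wiener_filter_frame(noisy_spectrum, signal_psd, noise_psd):
    """Apply the per-bin Wiener gain to one STFT frame of the noisy signal."""
    return wiener_gain(signal_psd, noise_psd) * noisy_spectrum
```

The gain tends toward 1 in bins dominated by the signal of interest and toward 0 in noise-dominated bins; the implementation challenge the abstract mentions is obtaining the two PSDs in the first place.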
Traditional speech separation systems enhance only the magnitude response of noisy speech. Recent studies, however, have shown that perceptual speech quality is significantly improved when both magnitude and phase are enhanced. These studies have not determined whether phase enhancement remains beneficial in environments that contain reverberation as well as noise. In this paper, we present an approach...
Speech recognition performance deteriorates in the face of unknown noise. Speech enhancement offers a solution by reducing the noise in speech at runtime. However, it also introduces artificial distortions to the speech signals. In this paper, we aim to reduce the artifacts that have adverse effects on speech recognition. With this motivation, we propose a modification scheme including smoothing adaptation...
Dysarthria is a motor speech impairment, often characterized by speech that is generally indiscernible by human listeners. Assessment of the severity level of dysarthria provides an understanding of the patient's progression in the underlying cause and is essential for planning therapy, as well as improving automatic dysarthric speech recognition. In this paper, we propose a non-linguistic manner...
Noise reduction technologies have been applied to enhance the intelligibility of voice communications. However, existing methods are vulnerable to complex non-stationary noisy conditions, which are commonly encountered in real-world hands-free scenarios. Additionally, the existing methods do not take full advantage of the deployment of multi-channel microphone arrays on the burgeoning high-end...
Systems based on i-vectors represent the current state-of-the-art in text-independent speaker recognition. In this work we introduce a new compact representation of a speech segment, similar to the speaker factors of Joint Factor Analysis (JFA) and to i-vectors, that we call “e-vector”. The e-vectors derive their name from the eigenvoice space of the JFA speaker modeling approach. Our working hypothesis...
In the era of deep learning, although beamforming-based multi-channel signal processing is still very helpful, it has been reported that single-channel robust front-ends usually do not benefit deep learning models, because the layer-by-layer structure of deep learning models provides a feature extraction strategy that automatically derives powerful noise-resistant features from primitive raw data for senone
Acoustic beamforming has played a key role in robust automatic speech recognition (ASR) applications. Accurate estimates of the speech and noise spatial covariance matrices (SCMs) are crucial for successfully applying minimum variance distortionless response (MVDR) beamforming. Reliable estimation of time-frequency (TF) masks can improve the estimation of the SCMs and significantly improve...
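The mask-to-SCM-to-MVDR pipeline described above can be sketched as follows (a generic illustration under the common convention of taking the steering vector as the principal eigenvector of the speech SCM; all names and shapes are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def masked_scm(stft, mask):
    """Mask-weighted spatial covariance matrices.

    stft: complex array (mics, freq, frames); mask: (freq, frames) in [0, 1].
    Returns one (mics, mics) SCM per frequency bin, shape (freq, mics, mics).
    """
    num = np.einsum('mft,nft,ft->fmn', stft, stft.conj(), mask)
    return num / np.maximum(mask.sum(axis=-1), 1e-10)[:, None, None]

def mvdr_weights(scm_speech, scm_noise):
    """Per-bin MVDR beamformer w = (Phi_n^-1 h) / (h^H Phi_n^-1 h),
    with steering vector h = principal eigenvector of the speech SCM."""
    F, M, _ = scm_speech.shape
    w = np.zeros((F, M), dtype=complex)
    for f in range(F):
        _, vecs = np.linalg.eigh(scm_speech[f])
        h = vecs[:, -1]                       # principal eigenvector
        num = np.linalg.solve(scm_noise[f], h)
        w[f] = num / (h.conj() @ num)
    return w
```

The normalization enforces the distortionless constraint w^H h = 1, so the target direction passes unchanged while noise-dominated directions are attenuated; the quality of the whole chain hinges on the TF masks used to weight the SCMs.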