Search results for: Michael I Mandel

Items from 1 to 9 out of 9 results

chapter

Combining spectral feature mapping and multi-channel model-based source separation for noise-robust automatic speech recognition

Deblin Bagchi, Michael I. Mandel, Zhongqiu Wang, Yanzhang He, et al.

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) > 496 - 503

Automatic Speech Recognition systems suffer from severe performance degradation in the presence of myriad complicating factors such as noise, reverberation, multiple speech sources, multiple recording devices, etc. Previous challenges have sparked much innovation when it comes to designing systems capable of handling these complications. In this spirit, the CHiME-3 challenge presents system builders...

chapter

Audio super-resolution using concatenative resynthesis

Michael I. Mandel, Young Suk Cho

2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 1 - 5

This paper utilizes a recently introduced non-linear dictionary-based denoising system in another voice mapping task, that of transforming low-bandwidth, low-bitrate speech into high-bandwidth, high-quality speech. The system uses a deep neural network as a learned nonlinear comparison function to drive unit selection in a concatenative synthesizer based on clean recordings. This neural network is...

chapter

Exciting estimated clean spectra for speech resynthesis

Sreyas Srimath Tirumala, Michael I. Mandel

2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 1 - 5

Spectral masking techniques are prevalent for noise suppression but they damage speech in regions of the spectrum where both noise and speech are present. This paper instead utilizes a recently introduced analysis-by-synthesis technique to estimate the spectral envelope of the speech at all frequencies, and adds to it a model of the speech excitation necessary to fully resynthesize a clean speech...

chapter

Enforcing consistency in spectral masks using Markov random fields

Michael I. Mandel, Nicoleta Roman

2015 23rd European Signal Processing Conference (EUSIPCO) > 2028 - 2032

Localization-based multichannel source separation algorithms typically operate by clustering or classifying individual time-frequency points based on their spatial characteristics, treating adjacent points as independent observations. The Model-based EM Source Separation and Localization (MESSL) algorithm is one such approach for binaural signals that achieves additional robustness by enforcing consistency...

chapter

Learning a concatenative resynthesis system for noise suppression

Michael I. Mandel, Young Suk Cho, Yuxuan Wang

2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP) > 582 - 586

This paper introduces a new approach to dictionary-based source separation employing a learned non-linear metric. In contrast to existing parametric source separation systems, this model is able to utilize a rich dictionary of speech signals. In contrast to previous dictionary-based source separation systems, the system can utilize perceptually relevant non-linear features of the noisy and clean audio...

chapter

Analysis-by-synthesis feature estimation for robust automatic speech recognition using spectral masks

Michael I. Mandel, Arun Narayanan

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2509 - 2513

Spectral masking is a promising method for noise suppression in which regions of the spectrogram that are dominated by noise are attenuated while regions dominated by speech are preserved. It is not clear, however, how best to combine spectral masking with the non-linear processing necessary to compute automatic speech recognition features. We propose an analysis-by-synthesis approach to automatic...
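
The spectral masking idea described in the first sentence of this abstract can be illustrated with a minimal sketch. It is not the paper's analysis-by-synthesis method; it is a plain ideal binary mask that assumes oracle access to separate speech and noise magnitudes. The function names, the `floor` attenuation factor, and the toy values are all hypothetical.

```python
def binary_mask(speech_mag, noise_mag, threshold_db=0.0):
    """Return 1 where speech exceeds noise by threshold_db, else 0."""
    ratio = 10.0 ** (threshold_db / 20.0)  # dB threshold as an amplitude ratio
    return [
        [1 if s > n * ratio else 0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_mag, noise_mag)
    ]


def apply_mask(mixture_mag, mask, floor=0.1):
    """Keep speech-dominated cells; scale noise-dominated cells by floor."""
    return [
        [x if m else x * floor for x, m in zip(x_row, m_row)]
        for x_row, m_row in zip(mixture_mag, mask)
    ]


# Toy 2x2 magnitude "spectrograms" (rows = frames, columns = frequency bins).
# These numbers are invented for illustration only.
speech = [[4.0, 0.5], [0.2, 3.0]]
noise = [[1.0, 2.0], [1.0, 1.0]]
mixture = [[5.0, 2.5], [1.2, 4.0]]

mask = binary_mask(speech, noise)
cleaned = apply_mask(mixture, mask)
```

In practice the mask would have to be estimated from the noisy signal alone, since the clean speech and noise are not available separately.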

chapter

Learning an intelligibility map of individual utterances

Michael I. Mandel

2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 1 - 4

Predicting the intelligibility of noisy recordings is difficult and most current algorithms only aim to be correct on average across many recordings. This paper describes a listening test paradigm and associated analysis technique that can predict the intelligibility of a specific recording of a word in the presence of a specific noise instance. The analysis learns a map of the importance of each...

chapter

Characterizing singing voice fundamental frequency trajectories

Johanna C. Devaney, Michael I. Mandel, Ichiro Fujinaga

2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 73 - 76

This paper evaluates the utility of the Discrete Cosine Transform (DCT) for characterizing singing voice fundamental frequency (F₀) trajectories. Specifically, it focuses on the use of the 1st and 2nd DCT coefficients as approximations of slope and curvature. It also considers the impact of vocal vibrato on the DCT calculations, including the influence of segmentation on the consistency of the reported...
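
As a rough illustration of why low-order DCT coefficients can stand in for slope and curvature, the sketch below applies an unnormalized DCT-II to two synthetic trajectories. It assumes that the "1st" and "2nd" coefficients are indices 1 and 2, with index 0 carrying the mean; the paper's exact DCT variant, normalization, and data are not given in this abstract, so everything here is hypothetical.

```python
import math


def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi * (n + 0.5) * k / N)."""
    n_samples = len(x)
    return [
        sum(v * math.cos(math.pi * (n + 0.5) * k / n_samples)
            for n, v in enumerate(x))
        for k in range(n_samples)
    ]


N = 32
center = (N - 1) / 2.0

# Synthetic stand-ins for F0 trajectories (not real singing data).
ramp = [float(n) for n in range(N)]           # pure slope, no curvature
arch = [(n - center) ** 2 for n in range(N)]  # pure curvature, no net slope

ramp_coeffs = dct2(ramp)
arch_coeffs = dct2(arch)
```

With this sign convention, the rising ramp yields a negative coefficient 1 and a coefficient 2 that is zero up to rounding, while the symmetric parabola yields a near-zero coefficient 1 and a positive coefficient 2.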

chapter

EM Localization and Separation using Interaural Level and Phase Cues

Michael I. Mandel, Daniel P. W. Ellis

2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics > 275 - 278

We describe a system for localizing and separating multiple sound sources from a reverberant two-channel recording. It consists of a probabilistic model of interaural level and phase differences and an EM algorithm for finding the maximum likelihood parameters of this model. By assigning points in the interaural spectrogram probabilistically to sources with the best-fitting parameters and then estimating...
