This paper is part of a larger effort to detect manipulations of video by searching for and combining the evidence of multiple types of inconsistencies between the audio and visual channels. Here, we focus on inconsistencies between the type of scenes detected in the audio and visual modalities (e.g., audio indoor, small room versus visual outdoor, urban), and inconsistencies in speaker identity tracking...
Speech recognition in varying background conditions is a challenging problem. Acoustic condition mismatch between training and evaluation data can significantly reduce recognition performance. For mismatched conditions, data-adaptation techniques are typically found to be useful, as they expose the acoustic model to the new data condition(s). Supervised adaptation techniques usually provide substantial...
Speech activity detection (SAD) is an essential component of most speech processing tasks and greatly influences the performance of the systems. Noise and channel distortions remain a challenge for SAD systems. In this paper, we focus on a dataset of highly degraded signals, developed under the DARPA Robust Automatic Transcription of Speech (RATS) program. On this challenging data, the best-performing...
Reverberation is a phenomenon observed in almost all enclosed environments. Human listeners rarely experience problems in comprehending speech in reverberant environments, but automatic speech recognition (ASR) systems often suffer increased error rates under such conditions. In this work, we explore the role of robust acoustic features motivated by human speech perception studies, for building ASR...
In this paper we propose softSAD: the direct integration of speech posteriors into a speaker recognition system as an alternative to using speech activity detection (SAD). Motivated by the need to use audio from short recordings more efficiently, softSAD removes the need to discard audio using speech/non-speech decisions based on a threshold as done with SAD. Instead, softSAD explicitly integrates...
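The core idea — replacing hard speech/non-speech decisions with posterior-weighted statistics — can be illustrated with a small sketch. This is an assumed, simplified illustration, not the paper's implementation; the function names (`soft_stats`, `hard_stats`) and the reduction to simple zeroth/first-order frame statistics are hypothetical.

```python
import numpy as np

def soft_stats(features, speech_posteriors):
    """softSAD-style statistics: weight every frame by its speech
    posterior instead of discarding frames below a threshold."""
    w = np.asarray(speech_posteriors)[:, None]   # (T, 1) per-frame weights
    n = float(w.sum())                           # soft frame count
    f = (w * features).sum(axis=0)               # posterior-weighted first-order stats
    return n, f

def hard_stats(features, speech_posteriors, threshold=0.5):
    """Conventional SAD: keep only frames whose posterior exceeds a
    threshold, discarding everything else."""
    mask = np.asarray(speech_posteriors) > threshold
    kept = np.asarray(features)[mask]
    return kept.shape[0], kept.sum(axis=0)
```

With short recordings, the soft variant retains partial evidence from uncertain frames (e.g., a posterior of 0.5 contributes half a frame) that the hard threshold would throw away entirely.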
Studies have shown that the performance of state-of-the-art automatic speech recognition (ASR) systems deteriorates significantly with increased noise levels and channel degradations, when compared to human speech recognition capability. Traditionally, noise-robust acoustic features are deployed to improve speech recognition performance under varying background conditions to compensate for the performance...
This paper assesses the role of robust acoustic features in spoken term detection (a.k.a. keyword spotting — KWS) under heavily degraded channel and noise-corrupted conditions. A number of noise-robust acoustic features were used, both in isolation and in combination, to train large vocabulary continuous speech recognition (LVCSR) systems, with the resulting word lattices used for spoken term detection...
This article describes our submission to the speaker identification (SID) evaluation for the first phase of the DARPA Robust Automatic Transcription of Speech (RATS) program. The evaluation focuses on speech data heavily degraded by channel effects. We show here how we designed a robust system using multiple streams of noise-robust features that were combined at a later stage in an i-vector framework...
Background noise and channel degradations seriously constrain the performance of state-of-the-art speech recognition systems. Studies comparing human speech recognition performance with automatic speech recognition systems indicate that the human auditory system is highly robust against background noise and channel variabilities compared to automated systems. A traditional way to add robustness to...
This work addresses the problem of speaker verification where additive noise is present in the enrollment and testing utterances. We show how the current state-of-the-art framework can be effectively used to mitigate this effect. We first look at the degradation a standard speaker verification system is subjected to when presented with noisy speech waveforms. We designed and generated a corpus with...
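Generating a noisy evaluation corpus of this kind typically means mixing noise into clean waveforms at controlled signal-to-noise ratios. The sketch below shows one standard way to do that; it is an assumed illustration (the function name `add_noise_at_snr` is hypothetical), not the corpus-generation procedure the authors actually used.

```python
import numpy as np

def add_noise_at_snr(speech, noise, snr_db):
    """Mix a noise waveform into a speech waveform at a target SNR (dB)."""
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Tile/trim the noise so it covers the whole speech signal
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Scale the noise so that 10*log10(p_speech / p_noise_scaled) == snr_db
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise
```

Running the same clean enrollment and test utterances through this mixer at several SNR levels yields matched noisy conditions for measuring the degradation of a speaker verification system.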
The goal of this work was to explore modeling techniques to improve bird species classification from audio samples. We first developed an unsupervised approach to obtain approximate note models from acoustic features. From these note models we created a bird species recognition system by leveraging a phone n-gram statistical model developed for speaker recognition applications. We found competitive...
The SRI speaker recognition system for the 2010 NIST speaker recognition evaluation (SRE) incorporates multiple subsystems with a variety of features and modeling techniques. We describe our strategy for this year's evaluation, from the use of speech recognition and speech segmentation to the individual system descriptions as well as the final combination. Our results show that under most conditions,...
The goal of this work was to explore the optimization of the feature extraction module (front-end) parameters to improve bird species recognition. We explored optimizing the spectral and temporal parameters of a Mel cepstrum feature-based front-end, starting from common parameter values used in speech processing experiments. These features were modeled using a Gaussian mixture model (GMM) system....
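A GMM system of the kind described — one model per species fit on frame-level cepstral features, with classification by average log-likelihood — can be sketched as follows. This is a minimal assumed illustration using scikit-learn; the function names, the diagonal-covariance choice, and the component count are hypothetical, and the actual feature extraction (the optimized Mel cepstrum front-end) is taken as given.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_species_models(features_by_species, n_components=8):
    """Fit one diagonal-covariance GMM per species on its pooled
    frame-level cepstral feature vectors (shape: frames x dims)."""
    models = {}
    for species, feats in features_by_species.items():
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag", random_state=0)
        gmm.fit(feats)
        models[species] = gmm
    return models

def classify(models, feats):
    """Score a test recording's frames against every species GMM and
    return the species with the highest average log-likelihood."""
    return max(models, key=lambda s: models[s].score(feats))
```

The per-species GMM acts as a generative model of that species' acoustic frames; at test time the recording is assigned to whichever model explains its frames best.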
Two important challenges for speaker recognition applications are noise robustness and portability to new languages. We present an approach that integrates multiple components and models for improved speaker identification in spontaneous Arabic speech in adverse acoustic conditions. We used two different acoustic speaker models: cepstral Gaussian mixture models (GMM) and maximum likelihood linear...