Search results

Items from 1 to 19 out of 19 results

chapter

Locality sensitive discriminant analysis for speaker verification

Danwei Cai, Weicheng Cai, Zhidong Ni, Ming Li

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 5

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

In this paper, we apply Locality Sensitive Discriminant Analysis (LSDA) to a speaker verification system for intersession variability compensation. As opposed to LDA, which fails to discover the local geometrical structure of the data manifold, LSDA finds a projection which maximizes the margin between i-vectors from different speakers in each local area. Since the number of samples varies in a wide...

chapter

Low complexity language recognition exploiting ensemble of random subspace

Om Prakash Singh, Rohit Sinha

2016 International Conference on Signal Processing and Communications (SPCOM) > 1 - 5

2016 International Conference on Signal Processing and Communications (SPCOM)

The current approaches for spoken language recognition (LR) are predominantly based on the GMM mean supervector as the representation of the utterances. It is assumed that the language information lies in a linear manifold of low-dimensional spaces. Exploiting this, low-dimensional projections of the GMM mean supervectors, known as i-vectors, are derived using a total variability matrix. The i-vector...

chapter

A spectrum smoothing method for speaker verification

Zhaofeng Zhang, Jing Deng, Longbiao Wang, Xiong Xiao

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1291 - 1295

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

In speech processing, the speech signal is usually processed frame by frame due to the non-stationary characteristic of speech. In this paper, a frequency-domain averaging based frame smoothing method is proposed. Besides the conventional frame shift, we introduce a short time shift to create several frames around the current frame. Then we take the average of the power spectra of these frames. The average...

chapter

Scalable I-vector concatenation for PLDA based language identification system

Saad Irtza, Haris Bavattichalil, Vidhyasaharan Sethu, Eliathamby Ambikairajah

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1182 - 1185

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

Language identification systems combining i-vectors estimated from different acoustic feature spaces have recently been shown to be superior to i-vector systems based on a single acoustic feature space. Specifically, i-vectors estimated using MFCC and PLP front-ends were concatenated prior to using LDA to obtain a combined i-vector. In this work, we investigate the scalability of this i-vector concatenation...

chapter

Analysis of linear prediction residual signal, its magnitude and phase for language identification on NIST LRE (2003) database

Arup Kumar Dutta, K. Sreenivasa Rao

2015 International Conference on Computer, Communication and Control (IC4) > 1 - 4

2015 International Conference on Computer, Communication and Control (IC4)

The present work investigates the importance of excitation source features for language identification (LID). The linear prediction residual (LPR) represents the excitation source signal. By processing the LPR at the sub-segmental, segmental and supra-segmental levels, we can get the language-specific information present within a glottal cycle, within a sequence of a few glottal cycles and at the prosody...

chapter

Improved speaker recognition using DCT coefficients as features

Mitchell McLaren, Yun Lei

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4430 - 4434

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

We recently proposed the use of coefficients extracted from the 2D discrete cosine transform (DCT) of log Mel filter bank energies to improve speaker recognition over the traditional Mel frequency cepstral coefficients (MFCC) with appended deltas and double deltas (MFCC/deltas). Selection of relevant coefficients was shown to be crucial, resulting in the proposal of a zig-zag parsing strategy. While...

chapter

Different aspects of source information for limited data speaker verification

Rohan Kumar Das, Debadatta Pati, S. R. Mahadeva Prasanna

2015 Twenty First National Conference on Communications (NCC) > 1 - 6

2015 Twenty First National Conference on Communications (NCC)

Limited data speaker verification has shown its significance in practical system-oriented applications. The paper shows the importance of different aspects of the voice source feature for the limited test data scenario. A baseline speaker verification system using the conventional mel frequency cepstral coefficients (MFCC) feature is developed, and its performance under the limited test data condition (≤10 s) is evaluated...

chapter

The 2013 speaker recognition evaluation in mobile environment

E. Khoury, B. Vesnicer, J. Franco-Pedroso, R. Violato, et al.

2013 International Conference on Biometrics (ICB) > 1 - 8

2013 International Conference on Biometrics (ICB)

This paper evaluates the performance of the twelve primary systems submitted to the evaluation on speaker verification in the context of a mobile environment using the MOBIO database. The mobile environment provides a challenging and realistic test-bed for current state-of-the-art speaker verification techniques. Results in terms of equal error rate (EER), half total error rate (HTER) and detection...

chapter

Spectro-temporal Gabor features for speaker recognition

Howard Lei, Bernd T. Meyer, Nikki Mirghafori

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4241 - 4244

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

In this work, we have investigated the performance of 2D Gabor features (known as spectro-temporal features) for speaker recognition. Gabor features have been used mainly for automatic speech recognition (ASR), where they have yielded improvements. We explored different Gabor feature implementations, along with different speaker recognition approaches, on ROSSI [1] and NIST SRE08 databases. Using...

chapter

Well-calibrated heavy tailed Bayesian speaker verification for microphone speech

Mohammed Senoussaoui, Patrick Kenny, Pierre Dumouchel, Fabio Castaldo

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4824 - 4827

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The work presented in this paper is an extension of our two previous works [1, 2]. In the first paper [1], we proposed a low-dimensional feature (i-vector) extractor which is suitable for both telephone and microphone data of the NIST speaker recognition evaluation dataset. The second paper [2] introduces the use of a Probabilistic Linear Discriminant Analysis (PLDA) framework with a heavy tailed distribution...

chapter

Speaker verification using sparse representation classification

Jia Min Karen Kua, Eliathamby Ambikairajah, Julien Epps, Roberto Togneri

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4548 - 4551

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Sparse representations of signals have received a great deal of attention in recent years, and the sparse representation classifier has very recently appeared in a speaker recognition system. This approach represents the (sparse) GMM mean supervector of an unknown speaker as a linear combination of an over-complete dictionary of GMM supervectors of many speaker models, and ℓ₁-norm minimization results...

chapter

Perceptual MVDR-based cepstral coefficients (PMCCs) for speaker recognition

Chunyan Liang, Xiang Zhang, Lin Yang, Jianping Zhang, et al.

IEEE 10th INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS > 1386 - 1389

2010 10th International Conference on Signal Processing (ICSP 2010)

Acoustic feature extraction from speech is a fundamental part of both automatic speech recognition and automatic speaker recognition. Mel-frequency cepstral coefficients (MFCCs) are widely used in both of these research directions. A new feature extraction technique named perceptual MVDR-based cepstral coefficients (PMCCs) has been demonstrated to deliver superior performance in automatic speech recognition...

chapter

Multistream speaker diarization beyond two acoustic feature streams

D Vijayasenan, F Valente, H Bourlard

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4950 - 4953

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

Speaker diarization systems for meeting data have recently been converging towards multistream systems. The most common complementary features used in combination with MFCC are Time Delay of Arrival (TDOA) features. Other features have also been proposed, although there are no reported improvements on top of MFCC+TDOA systems. In this work we investigate the combination of other feature sets along with MFCC+TDOA. We discuss...

chapter

Speaker information from subband energies of Linear Prediction residual

D. Pati, S.R.M. Prasanna

2010 National Conference On Communications (NCC) > 1 - 4

2010 National Conference on Communications (NCC 2010)

The objective of this work is to demonstrate the significant speaker information present in the subband energies of the Linear Prediction (LP) residual. The LP residual mostly contains the excitation source information. The subband energies extracted using the mel filterbank followed by cepstral analysis provide a compact representation. The resulting cepstral values are termed Residual-mel Frequency...

chapter

On separating glottal source and vocal tract information in telephony speaker verification

T. Kinnunen, P. Alku

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4545 - 4548

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

The popular mel-frequency cepstral coefficients (MFCCs) capture a mixture of speaker-related, phonemic and channel information. Speaker-related information could be further broken down according to articulatory criteria. How these underlying components are exactly mixed in the features is not well understood. To this end, in this paper we aim at separating the spectra of glottal source and vocal tract...

chapter

Fusing short term and long term features for improved speaker diarization

G. Friedland, O. Vinyals, Y. Huang, C. Müller

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4077 - 4080

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

The following article shows how a state-of-the-art speaker diarization system can be improved by combining traditional short-term features (MFCCs) with prosodic and other long-term features. First, we present a framework to study the speaker discriminability of 70 different long-term features. Then, we show how the top-ranked long-term features can be combined with short-term features to increase...

chapter

Evaluation of a fused FM and cepstral-based speaker recognition system on the NIST 2008 SRE

M. Nosratighods, T. Thiruvaran, J. Epps, E. Ambikairajah, et al.

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4233 - 4236

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper, the fusion of two speaker recognition subsystems, one based on frequency modulation (FM) and another on MFCC features, is reported. The motivation for their fusion was to improve the recognition accuracy across different types of channel variations, since the two features are believed to contain complementary information. It was found that the MFCC-based subsystem outperformed the FM-based...

chapter

Contour modeling of prosodic and acoustic features for speaker recognition

M. Kockmann, L. Burget

2008 IEEE Spoken Language Technology Workshop > 45 - 48

2008 IEEE Workshop on Spoken Language Technology. SLT 2008

In this paper we use acoustic and prosodic features jointly in a long-temporal lexical context for automatic speaker recognition from speech. The contours of pitch, energy and cepstral coefficients are continuously modeled over the time span of a syllable to capture the speaking style at the phonetic level. As these features are affected by session variability, established channel compensation techniques...

chapter

Auditory features with vocal track length normalization for language identification

Weiqiang Zhang, Jia Liu, Liang He

2008 International Conference on Audio, Language and Image Processing > 66 - 70

2008 International Conference on Audio, Language and Image Processing

This paper reports on a novel feature, the auditory cepstrum coefficient (ACC) with vocal tract length normalization (VTLN), for language identification (LID). The ACC feature is based on the auditory characteristics of the human ear, and the VTLN technology compensates for speaker variability. The detailed implementation of the ACC feature with VTLN in the frequency domain is given. Experimental results show that...

Filter options

Data set:
ieee
Keywords:
FEATURE EXTRACTION
MEL FREQUENCY CEPSTRAL COEFFICIENT
NIST

