The tradeoff between noise reduction and speech distortion is a key concern in designing noise reduction algorithms. We have proposed a regularization framework for noise reduction that explicitly accounts for this tradeoff. We regard speech estimation as a functional approximation problem in a reproducing kernel Hilbert space (RKHS). In the estimation, the objective function is formulated to...
Although automatic speech recognition (ASR) has been successfully used in several applications, it still lacks robustness and precision, especially in harsh environments where the input speech is of low quality. Robust error correction for ASR outputs thus becomes important in addition to improving recognition performance. In recent approaches to error correction, linguistic or domain information is...
This paper presents an automatic speaker verification system based on a hybrid GMM-SVM model operating in a real environment. An important step in speaker verification is extracting the features that best characterize the speaker. Mel-Frequency Cepstral Coefficients (MFCC) and their first and second derivatives are commonly used as acoustic features for speaker verification. To reduce the high dimensionality...
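The first and second derivatives mentioned in this abstract are usually computed with a regression formula over a short window of frames. The sketch below is a generic illustration of that standard delta/delta-delta computation, not the paper's implementation; the window half-width `N=2` and the 13-coefficient MFCC shape are assumptions for the example.

```python
import numpy as np

def delta(features, N=2):
    """First-derivative (delta) coefficients over time.

    features: (T, D) array of frame-level coefficients (e.g. MFCCs).
    Standard regression formula:
        d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)
    Frames past the edges are replicated (edge padding).
    """
    T = features.shape[0]
    denom = 2 * sum(n * n for n in range(1, N + 1))
    padded = np.pad(features, ((N, N), (0, 0)), mode="edge")
    deltas = np.zeros_like(features, dtype=float)
    for t in range(T):
        for n in range(1, N + 1):
            # padded index t + N corresponds to original frame t
            deltas[t] += n * (padded[t + N + n] - padded[t + N - n])
    return deltas / denom

# Second derivatives (delta-deltas) are the delta of the deltas.
mfcc = np.random.default_rng(0).normal(size=(100, 13))  # toy MFCC matrix
d1 = delta(mfcc)
d2 = delta(d1)
combined = np.hstack([mfcc, d1, d2])  # 39-dimensional feature vectors
```

Stacking the statics with both derivative orders is what yields the familiar 39-dimensional acoustic feature vector.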
In this article the text-independent speaker verification problem is considered. In the presented system each conversation side is represented as a vector lying on the unit hypersphere. These vectors are compared by an inner product which produces similarity scores. In this article classical score normalization methods (z-norm and t-norm) are analyzed and compared with the support vector machines...
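The comparison described above, inner products of vectors on the unit hypersphere, is cosine scoring, and z-norm rescales a raw score by impostor statistics. A minimal generic sketch of both ideas (not the article's system; the example vectors are made up):

```python
import numpy as np

def unit(v):
    """Project a vector onto the unit hypersphere."""
    return v / np.linalg.norm(v)

def score(enroll, test):
    """Inner product of length-normalized vectors = cosine similarity."""
    return float(np.dot(unit(enroll), unit(test)))

def z_norm(raw, impostor_scores):
    """Z-normalization: center and scale a raw score by the mean and
    standard deviation of impostor scores gathered for the enrolled
    model (t-norm does the analogous thing per test utterance,
    scoring it against a cohort of impostor models)."""
    mu, sigma = np.mean(impostor_scores), np.std(impostor_scores)
    return (raw - mu) / sigma

spk = np.array([0.2, -1.1, 0.5])
same = score(spk, 2.0 * spk)  # scale-invariant: close to 1.0
```

Length normalization makes the inner product insensitive to overall vector magnitude, which is exactly why the representation lives on the unit hypersphere.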
In this article a text-independent speaker verification problem is considered. After feature extraction, each conversation side is represented as a vector in a fixed-dimensional space. To reduce the influence of utterance length and of channel properties, various vector normalization techniques have been selected from the literature, modified, and tested. Additionally,...
In text-independent speaker recognition, Support Vector Machines (SVMs) equipped with sequence kernels have been widely used. In this paper, a generic structure for sequence kernels is formulated, and within this structure we make an analytical comparison between two widely used sequence kernel systems, the GMM Supervector Kernel (GSK) and the Generalized Linear Discriminant Sequence (GLDS) kernel, showing...
The standard support vector machine (SVM) is a common machine learning method, and its parameter selection directly affects learning performance. At present there is no uniform approach to choosing SVM parameters. To avoid this difficult selection problem, this paper uses a variant of the SVM, namely ν-SVM, and selects the parameters of ν-SVM by particle...
Combining multiple low-level visual features is a proven and effective strategy for a range of computer vision tasks. However, limited attention has been paid to combining such features with information from other modalities, such as audio and videotext, for large scale analysis of web videos. In our work, we rigorously analyze and combine a large set of low-level features that capture appearance,...
Four multiclass Support Vector Machine (SVM) methods were designed for the task of speaker-independent phoneme recognition: All-at-once, One-against-all, One-against-one, and the Directed Acyclic Graph SVM (DAGSVM). Power percentages of eight Discrete Wavelet Transform (DWT) frequency bands are used for feature extraction. All tests were carried out on the TIMIT database. Comparable recognition...
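Of the multiclass strategies listed above, One-against-one trains a binary classifier per class pair and combines their decisions by majority vote. The sketch below illustrates only that voting scheme with toy nearest-center "classifiers" standing in for trained SVMs; the three classes and their centers are invented for the example.

```python
def one_vs_one_predict(x, pairwise_classifiers, classes):
    """Majority vote over pairwise binary decisions (One-against-one).

    pairwise_classifiers: dict mapping a class pair (a, b) to a callable
    that returns either a or b for input x. Ties go to the first class
    encountered (a real system would break ties by decision values).
    """
    votes = {c: 0 for c in classes}
    for clf in pairwise_classifiers.values():
        votes[clf(x)] += 1
    return max(votes, key=votes.get)

# Toy stand-ins for trained binary SVMs: pick the nearer class center.
centers = {0: 0.0, 1: 5.0, 2: 10.0}

def make_pair(a, b):
    return lambda x: a if abs(x - centers[a]) < abs(x - centers[b]) else b

clfs = {(a, b): make_pair(a, b)
        for a in centers for b in centers if a < b}

print(one_vs_one_predict(4.2, clfs, list(centers)))  # prints 1
```

For K classes this scheme needs K(K-1)/2 binary classifiers, versus K for One-against-all; DAGSVM uses the same pairwise classifiers but evaluates only K-1 of them per test point.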
This paper proposes a novel method to assess Thai speech based on fractal analysis. The fractal algorithm selected, Higuchi's method, was used to evaluate the fractal dimension (FD) of segmented speech signals. To capture how the FD of the waveform changes over time, the time-dependent FD (TDFD) was proposed. The probability distribution of TDFDs, obtained by kernel density estimation, was used as an additional parameter...
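Higuchi's method, named in the abstract above, estimates a signal's fractal dimension from how average curve length shrinks as the signal is subsampled. A generic sketch of the estimator follows (not the paper's code; the maximum delay `kmax=8` is an assumed choice):

```python
import numpy as np

def higuchi_fd(x, kmax=8):
    """Higuchi's fractal dimension estimate for a 1-D signal.

    For each delay k, build k decimated sub-series, compute their
    normalized curve lengths, and average to get L(k). Since
    L(k) ~ k^(-FD), FD is the slope of log L(k) versus log(1/k).
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    lengths = []
    for k in range(1, kmax + 1):
        Lk = []
        for m in range(k):
            idx = np.arange(m, N, k)          # sub-series x[m], x[m+k], ...
            if len(idx) < 2:
                continue
            diff = np.abs(np.diff(x[idx])).sum()
            # normalization factor from Higuchi's definition
            Lk.append(diff * (N - 1) / ((len(idx) - 1) * k))
        lengths.append(np.mean(Lk) / k)
    k = np.arange(1, kmax + 1)
    slope, _ = np.polyfit(np.log(1.0 / k), np.log(lengths), 1)
    return slope
```

A straight line gives FD close to 1, while white noise gives FD close to 2; speech frames fall in between, which is what makes the time-dependent FD a usable descriptor.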
In this paper, we analyze the effect of a channel compensation technique on support vector machine (SVM) based speaker verification performance and compare it with another well-known speaker modeling approach, Gaussian Mixture Models with a universal background model (GMM-UBM). Experiments conducted on the NIST 2002 SRE show that channel compensation considerably improves speaker verification accuracy.
Speech feature extraction has been a key focus in robust speech recognition research, since it significantly affects recognition performance. In this paper, we first study a set of different feature extraction methods such as linear predictive coding (LPC), mel frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP), together with several feature normalization techniques including RASTA...
User authentication is critical to ensuring that only authorized users can access restricted resources. A voiceprint can be used as a unique password to prove a user's identity. In this paper, we propose a text-dependent speaker verification system for the Arabic language. The paper advocates the use of discrete representation of speech signals in terms of Mel-frequency cepstral coefficients...
This paper proposes a novel direction-of-arrival estimation method in a general 3-dimensional array configuration for multiple speech signals uttered simultaneously. The method is based on sparseness in the time-frequency representation of speech signal and is applicable to an underdetermined case where the sources outnumber sensors. At first, we introduce a parameterized closed surface to which we...
In recent years, there have been significant advances in the field of speaker recognition that have resulted in very robust recognition systems. The primary focus of many recent developments has shifted to the problem of recognizing speakers in adverse conditions, e.g., in the presence of noise or reverberation. In this paper, we present the UMD-JHU speaker recognition system applied on the NIST 2010 SRE...
Variable bit-rate coding introduced for effective utilization of limited communication bandwidth requires accurate classification of input signals. This paper investigates implementation of a support vector machine (SVM)-based speech/music classifier in the selectable mode vocoder (SMV) framework, which is a standard codec adopted by the Third-Generation Partnership Project 2 (3GPP2). A support vector...
In this paper we present a fast unsupervised spoken term detection system based on lower-bound Dynamic Time Warping (DTW) search on Graphical Processing Units (GPUs). The lower-bound estimate and the K nearest neighbor DTW search are carefully designed to fit the GPU parallel computing architecture. In a spoken term detection task on the TIMIT corpus, a 55x speed-up is achieved compared to our previous...
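The lower-bound DTW search mentioned above works because a cheap bound lets most candidates be discarded without running full DTW. The sketch below illustrates the widely used LB_Keogh bound together with band-constrained DTW on the CPU; it is a generic illustration of the pruning idea, not the paper's GPU implementation, and it assumes equal-length 1-D sequences and an assumed band half-width `r`.

```python
import numpy as np

def lb_keogh(query, candidate, r):
    """LB_Keogh lower bound on band-constrained DTW distance.

    Sums squared excursions of the candidate outside the query's
    upper/lower envelope (built over a window of half-width r).
    Cheap (O(n*r)) and never exceeds the true banded DTW distance,
    so candidates with lb > best-so-far can be pruned safely.
    Assumes len(query) == len(candidate).
    """
    lb = 0.0
    for i, c in enumerate(candidate):
        window = query[max(0, i - r): i + r + 1]
        lo, hi = window.min(), window.max()
        if c > hi:
            lb += (c - hi) ** 2
        elif c < lo:
            lb += (lo - c) ** 2
    return lb

def dtw(a, b, r):
    """DTW with a Sakoe-Chiba band of half-width r (squared distance)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - r), min(m, i + r) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

In a search loop, `lb_keogh` is evaluated first and the expensive `dtw` call is made only when the bound is below the current best distance; the paper's contribution is mapping exactly this kind of pipeline onto GPU parallelism.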
We propose a novel method of efficiently searching very large populations of speakers, tens of thousands or more, using an utterance comparison model proposed in a previous work. The model allows much more efficient comparison of utterances compared to the traditional Gaussian Mixture Model (GMM)-based approach because of its computational simplicity while maintaining high accuracy. Furthermore, efficiency...
This paper presents a multiple kernel learning (MKL) approach to speech/music discrimination (SMD). The time-frequency representation (spectrogram) of an audio segment, computed by the short-time Fourier transform (STFT), is decomposed by the wavelet packet transform into different subband levels. The subbands, which contain rich texture information, are used as features for this discrimination problem. MKL...
This paper shows that pattern classification based on machine learning is a powerful tool to analyze human brain activity data obtained by magnetoencephalography (MEG). We propose a new weighting method using a multiple kernel learning (MKL) algorithm to localize the brain area contributing to the accurate vowel discrimination. Our MKL simultaneously estimates both the classification boundary and...