Items from 101 to 120 out of 600 results

chapter

Multitask speaker profiling for estimating age, height, weight and smoking habits from spontaneous telephone speech signals

Amir Hossein Poorjam, Mohamad Hasan Bahari, Hugo Van hamme

2014 4th International Conference on Computer and Knowledge Engineering (ICCKE) > 7 - 12

2014 4th International eConference on Computer and Knowledge Engineering (ICCKE)

This paper proposes a novel approach for automatic estimation of four important traits of speakers, namely age, height, weight and smoking habit, from speech signals. In this method, each utterance is modeled using the i-vector framework which is based on the factor analysis on Gaussian Mixture Model (GMM) mean supervectors, and the Non-negative Factor Analysis (NFA) framework which is based on a...

chapter

Performance evaluation of single channel speech separation using non-negative matrix factorization

M. Mona Nandakumar, K. Edet Bijoy

2014 IEEE National Conference on Communication, Signal Processing and Networking (NCCSN) > 1 - 4

2014 National Conference on Communication, Signal Processing and Networking (NCCSN)

Blind Source Separation (BSS) of underdetermined mixtures has attracted considerable attention in signal processing, even though separating the underlying sources is very difficult. The difficulty in source separation arises from the mixing of a large number of source signals in time and frequency, and their propagation to one or more sensors through air. The objective in BSS is to identify...

chapter

Non-invasive ambulatory monitoring of complex sEMG patterns and its potential application in the detection of vocal dysfunctions

N. R. Smith, T. Klongtruagrok, G. N. DeSouza, C. R. Shyu, more

2014 IEEE 16th International Conference on e-Health Networking, Applications and Services (Healthcom) > 447 - 452

2014 IEEE 16th International Conference on e-Health Networking, Applications and Services (Healthcom 2014)

Voice disorders are non-trivial when it comes to their early detection. Symptoms range from slight hoarseness to complete loss of voice, and may seriously impact personal and professional life. To date, we are still largely missing reliable data to help us better understand and screen voice pathologies. In this paper, we present an ambulatory voice monitoring system using surface electromyography...

chapter

Kernel ridge regression method applied to speech recognition problem: A novel approach

Hoang Trang, Loc Tran

2014 International Conference on Advanced Technologies for Communications (ATC 2014) > 172 - 174

2014 International Conference on Advanced Technologies for Communications (ATC)

Speech recognition is an important problem in the pattern recognition research field. In this paper, the kernel ridge regression method is applied to the MFCC feature vectors of the speech dataset available from IC Design lab at Faculty of Electricals-Electronics Engineering, University of Technology, Ho Chi Minh City. Experimental results show that the kernel ridge regression method outperforms...

chapter

Improving the recognition of pathological voice using the discriminant HLDA transformation

Othman Lachhab, Joseph Di Martino, El Hassane Ibn Elhaj, Ahmed Hammouch

2014 Third IEEE International Colloquium in Information Science and Technology (CIST) > 370 - 373

2014 Third IEEE International Colloquium in Information Science and Technology (CIST)

In this paper, we propose a simple and fast method for evaluating the pathological voice (esophageal) by applying continuous speech recognition in a speaker-dependent mode to our own database of pathological voice, which we call FPSD (French Pathological Speech Database). The recognition system is implemented using the HTK platform, based on HMM/GMM monophone models. The acoustic vectors are...

chapter

Multi-level prosody and spectrum conversion for emotional speech synthesis

Zexun Wang, Yibiao Yu

2014 12th International Conference on Signal Processing (ICSP) > 588 - 593

2014 12th International Conference on Signal Processing (ICSP 2014)

Emotional speech can be synthesized by converting prosodic and spectrum features in neutral speech. This paper proposes a multi-level prosody conversion method that converts three prosodic features (F0, short-time energy and speaking rate) at the syllable, prosodic word and sentence levels sequentially. The F0 and speaking rate are modeled by Gaussians, and energy is modeled by a Gamma distribution...

chapter

Voice conversion based on matrix variate Gaussian mixture model

Daisuke Saito, Hidenobu Doi, Nobuaki Minematsu, Keikichi Hirose

2014 12th International Conference on Signal Processing (ICSP) > 567 - 571

2014 12th International Conference on Signal Processing (ICSP 2014)

This paper describes a novel approach to construct a mapping function between a given speaker pair using probability density functions (PDF) of matrix variate. In voice conversion studies, two important functions should be realized: 1) precise modeling of both the source and target feature spaces, and 2) construction of a proper transform function between these spaces. Voice conversion based on Gaussian...

chapter

An investigation of implementation and performance analysis of DNN based speech synthesis system

Zhehuai Chen, Kai Yu

2014 12th International Conference on Signal Processing (ICSP) > 577 - 582

2014 12th International Conference on Signal Processing (ICSP 2014)

Deep Neural Networks (DNNs), which can model hierarchical and complex relationships between input and output layers, have recently been applied in speech synthesis. However, it remains uncertain why DNNs outperform traditional HMM-based synthesis. This paper describes several implementation details of a DNN-based speech synthesis system and compares different impacting factors, e.g., F0 modeling method...

chapter

Pollination based optimization for feature reduction at feature level fusion of speech & signature biometrics

Gaganpreet Kaur, Dheerendra Singh, Sukhpreet Kaur

Proceedings of 3rd International Conference on Reliability, Infocom Technologies and Optimization > 1 - 6

2014 3rd International Conference on Reliability, Infocom Technologies and Optimization (ICRITO) (Trends and Future Directions)

A scheme for the feature-level fusion of two behavioral biometrics, speech and signature, using the weighted-sum fusion method is proposed. Feature reduction is performed using a modified feature selection algorithm based on pollination-based optimization, which has not previously been applied to this problem. The modified algorithm is applied to the fusion method to search the feature space for optimal and...

chapter

Emarati speaker identification

Ismail Shahin, Mohammed Nasser Ba-Hutair

2014 12th International Conference on Signal Processing (ICSP) > 488 - 493

2014 12th International Conference on Signal Processing (ICSP 2014)

In this work we focus on Emarati speaker identification systems in neutral talking environments based on each of Vector Quantization (VQ), Gaussian Mixture Models (GMMs), and Hidden Markov Models (HMMs) as classifiers. These systems have been tested on our collected Emarati speech database which is composed of 25 male and 25 female Emarati speakers using Mel-Frequency Cepstral Coefficients (MFCCs)...

chapter

Near-field source extraction using speech presence probabilities for ad hoc microphone arrays

Maja Taseska, Shmulik Markovich-Golan, Emanuel A. P. Habets, Sharon Gannot

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC) > 169 - 173

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC)

Ad hoc wireless acoustic sensor networks (WASNs) hold great potential for improved performance in speech processing applications, thanks to better coverage and higher diversity of the received signals. We consider a multiple speaker scenario where each of the WASN nodes, an autonomous system comprising of sensing, processing and communicating capabilities, is positioned in the near-field of one of...

chapter

Towards online source counting in speech mixtures applying a variational EM for complex Watson mixture models

Lukas Drude, Aleksej Chinaev, Dang Hai Tran Vu, Reinhold Haeb-Umbach

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC) > 213 - 217

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC)

This contribution describes a step-wise source counting algorithm to determine the number of speakers in an offline scenario. Each speaker is identified by a variational expectation maximization (VEM) algorithm for complex Watson mixture models and therefore directly yields beamforming vectors for a subsequent speech separation process. An observation selection criterion is proposed which improves...

chapter

On the performance of widely linear quaternion based MVDR beamformer for an acoustic vector sensor

Jiuwen Cao, Andy W. H. Khong, Sharon Gannot

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC) > 303 - 307

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC)

Widely linear model has recently been used for signal processing applications due to its ability to achieve better performance than conventional linear filtering for non-circular complex random variables (CRVs) and improper quaternion random variables (QRVs). In this paper, we study the time-domain widely linear quaternion model based minimum variance distortionless response beamformer (WL-QMVDR)...

chapter

Multiple source localisation in the spherical harmonic domain

Christine Evers, Alastair H. Moore, Patrick A. Naylor

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC) > 258 - 262

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC)

Spherical arrays facilitate processing and analysis of sound fields with the potential for high resolution in three dimensions in the spherical harmonic domain. Using the captured sound field, robust source localisation systems are required for speech acquisition, speaker tracking and environment mapping. Source localisation becomes a challenging problem in reverberant environments and under noisy...

chapter

Single-channel speech dereverberation in acoustical environments

Marjan Joorabchi, Seyed Ghorshi, Ali Sarafnia

Proceedings ELMAR-2014 > 1 - 4

2014 56th International Symposium ELMAR

Reverberated speech signals in acoustical environments cause problems such as reduced speech intelligibility and difficulty in distinguishing speakers and locating sources, and degrade quality for hands-free telephony, hearing aids, etc. Adaptive filters can be applied to reduce the reverberation effects or to dereverberate the received speech signals at the microphone. In this paper a dereverberation method is proposed by applying...

chapter

Speaker identification using FBCC in Malayalam language

Drisya Vasudev, Anish Babu K. K

2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1759 - 1763

2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI)

Speaker identification attempts to determine the best possible match from a group of certain speakers, for any given input speech signal. The text-independent speaker identification system does the task to identify the person who speaks regardless of what is said. The first step in speaker identification is the extraction of features. In this proposed method, the Bessel features are used as an alternative...

chapter

A comparison of Multi-Layer Perceptron and Radial Basis Function neural network in the voice conversion framework

Ankita N. Chadha, Jagannath H. Nirmal, Mukesh A. Zaveri

2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1045 - 1052

2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI)

A voice conversion system modifies the speaker-specific features of the source speaker so that the speech sounds like that of a target speaker. The voice individuality of the speech signal is characterized at various levels, such as the shape of the glottal excitation, the shape of the vocal tract and the long-term prosodic features. In this work, Line Spectral Frequencies (LSF) are used to represent the shape of...

chapter

The acoustic echo cancelation using blind source separation to reduce double-talk interference

Yoshihiro Sakai, Muhammad Tahir Akhtar

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC) > 323 - 326

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC)

An acoustic echo canceler (AEC) is often employed to remove the acoustic echoes generated in hands-free communication systems. The AEC cancels the acoustic echoes by approximating the echo-path with the use of an adaptive filter and subtracting the pseudo echoes generated by the filter from the observed signal. The conventional adaptive algorithm for updating the filter, however, fails in estimation...

chapter

A new structure for acoustic echo cancellation in double-talk scenario using auxiliary filter

Mahfoud Hamidia, Abderrahmane Amrouche

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC) > 253 - 257

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC)

In this paper, a new structure for acoustic echo cancellation is presented. The role of acoustic echo canceller (AEC) is to remove undesirable acoustic echoes in communication systems. However, in double-talk case the performance of the AEC is degraded, thus, a double-talk detector (DTD) must be used for controlling the AEC. A new structure for AEC using an auxiliary adaptive filter is proposed in...

chapter

HMM-based artificial bandwidth extension supported by neural networks

Patrick Bauer, Johannes Abel, Tim Fingscheidt

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC) > 1 - 5

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC)

In telephony applications, artificial bandwidth extension (ABE) can be applied to narrowband (NB) calls for speech quality and intelligibility enhancement. However, high-band extension is challenging due to insufficient mutual information between the lower and upper frequency band in speech. Estimation errors particularly of fricatives /s, z/ are the consequence leading to annoying artifacts, such...

Keywords:
VECTORS
SPEECH


INFONA - science communication portal
