Items from 81 to 100 out of 600 results

chapter

Modulation spectrum-based post-filter for GMM-based Voice Conversion

Shinnosuke Takamichi, Tomoki Toda, Alan W Black, Satoshi Nakamura

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 4

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

This paper addresses an over-smoothing effect in Gaussian Mixture Model (GMM)-based Voice Conversion (VC). The flexible use of the statistical approach is one of the major reasons why this approach is widely applied to speech-based systems. However, quality degradation caused by over-smoothed converted speech parameters is an unavoidable problem of statistical modeling. One of the common approaches to this over-smoothness...

chapter

Exemplar-based emotional voice conversion using non-negative matrix factorization

Ryo Aihara, Reina Ueda, Tetsuya Takiguchi, Yasuo Ariki

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 7

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

This paper presents an emotional voice conversion (VC) technology using non-negative matrix factorization, where parallel exemplars are introduced to encode the source speech signal and synthesize the target speech signal. The input source spectrum is decomposed into the source spectrum exemplars and their weights. By replacing source exemplars with target exemplars, the converted spectrum and F0...

chapter

Speech recognition in a home environment using parallel decoding with GMM-based noise modeling

Kohei Machida, Takashi Nose, Akinori Ito

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 4

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

In this paper, we propose a method for noise-robust speech recognition in a home environment based on noise modeling and parallel decoding. There are three basic ideas of the proposed method. First, we model the noise signals observed in the environment using a GMM. Second, we generate multiple noise-reduced signals using the mean vectors of the GMM and decode the signals in parallel. Third, we choose...

chapter

Recognition of isolated words of esophageal speech using GMM and gradient descent RBF networks

P. Malathi, G.R. Suresh

2014 International Conference on Communication and Network Technologies > 174 - 177

2014 International Conference on Communication and Network Technologies (ICCNT)

A speech signal can be represented as a combination of acoustic parameters extracted from it. The parameter vectors are assumed to be the constituents of the speech signal over a specified duration during which it is stationary. Typical representations are Mel Frequency Cepstral Coefficients, Linear Prediction Coefficients, etc. The process of isolated word recognition involves the mapping...

chapter

Introducing active learning on Text to Emotion Analyzer

Mahim-Ul Asad, Nadia Afroz, Lily Dey, Rudra Pratap Deb Nath, more

2014 17th International Conference on Computer and Information Technology (ICCIT) > 35 - 40

2014 17th International Conference on Computer and Information Technology (ICCIT)

Nowadays, online interpersonal communication is often preferred over face-to-face interaction. However, emotions play a significant role in online communication. Automatic extraction of emotions from text is a hot research issue because it minimizes the communication gap and misunderstanding between users. To become emotionally more intelligent, our previous text to emotion analyzing...

chapter

Multipitch tracking with continuous correlation feature and hybrid DBNS/HMM model

Jie Lin, Gen Zhang, Bo Fu, Yujie Hao

2014 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) > 218 - 221

2014 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP)

This paper proposes a new approach for tracking multiple pitches within a single mixed speech signal. In this method, we employ a novel continuous correlation feature for calculating the pitch model. This feature not only represents the harmonicity but also includes information on spectral continuity, and hence improves the accuracy of the multi-pitch estimate. A DBNs and HMM hybrid model was further...

chapter

Augmented speech production based on real-time statistical voice conversion

Tomoki Toda

2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP) > 592 - 596

2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP)

In human-to-human speech communication, various barriers are caused by some constraints, such as physical constraints causing vocal disorders and environmental constraints making it hard to produce intelligible speech. These barriers would be overcome if our speech production was augmented so that we could produce speech sounds as we want beyond these constraints. Voice conversion (VC) is a technique...

chapter

The significance-aware EPFES to estimate a memoryless preprocessor for nonlinear acoustic echo cancellation

Christian Huemmer, Christian Hofmann, Roland Maas, Walter Kellermann

2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP) > 557 - 561

2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP)

In this article, we introduce a novel approach for estimating the coefficients of a memoryless preprocessor for nonlinear acoustic echo cancellation (NL-AEC) using particle filtering. The acoustic echo path is modeled by a nonlinear-linear cascade of a memoryless preprocessor (to model the loudspeaker nonlinearities) preceding a linear finite impulse response filter (estimated by the normalized least...

chapter

Random forest algorithm for improving the performance of speech/non-speech detection

Sincy V. Thambi, K. T. Sreekumar, C. Santhosh Kumar, P. C Reghu Raj

2014 First International Conference on Computational Systems and Communications (ICCSC) > 28 - 32

2014 First International Conference on Computational Systems and Communications (ICCSC)

Speech/non-speech detection (SND) distinguishes between speech and non-speech segments in recorded audio and video documents. SND systems can help reduce the storage space required when only the speech segments of the audio documents are needed, for example in content analysis, spoken language identification, etc. In this work, we experimented with the use of time domain, frequency domain and cepstral...

chapter

A robust speaker recognition system combining factor analysis techniques

Shaghayegh Reza, Tahereh Emami Azadi, Jahanshah Kabudian, Yaser Shekofteh

2014 21th Iranian Conference on Biomedical Engineering (ICBME) > 343 - 347

2014 21th Iranian Conference on Biomedical Engineering (ICBME)

In this paper we implement state-of-the-art factor-analysis-based methods and fuse their scores to obtain a channel-robust speaker recognition system. The two methods are joint factor analysis (JFA) and i-Vector, which define low-dimensional speaker- and channel-dependent spaces. For score fusion we propose a simple weight computation without a training step. We test our method under two conditions;...

chapter

Speaker based clustering using the differential energy

S. Ouamour, H. Sayoud

2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA) > 672 - 677

2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA)

A new approach to speaker clustering is presented and discussed in this paper. The main technique consists of grouping all the homogeneous speech segments obtained at the end of the segmentation process by using the spatial information provided by the stereophonic speech. The proposed system is suitable for debates or multi-conferences in which the speakers are located at fixed positions. The new...

chapter

Comparing a High and Low-Level Deep Neural Network Implementation for Automatic Speech Recognition

Jessica Ray, Brian Thompson, Wade Shen

2014 First Workshop for High Performance Technical Computing in Dynamic Languages > 41 - 46

2014 First Workshop for High Performance Technical Computing in Dynamic Languages (HPTCDL)

The use of deep neural networks (DNNs) has improved performance in several fields including computer vision, natural language processing, and automatic speech recognition (ASR). The increased use of DNNs in recent years has been largely due to performance afforded by GPUs, as the computational cost of training large networks on a CPU is prohibitive. Many training algorithms are well-suited to the...

chapter

Sentence Similarity Based on Semantic Vector Model

Zhao Jingling, Zhang Huiyun, Cui Baojiang

2014 Ninth International Conference on P2P, Parallel, Grid, Cloud and Internet Computing > 499 - 503

2014 Ninth International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC)

Sentence similarity measures play an increasingly important role in text-related research and applications in areas such as text mining, Web page retrieval, and dialogue systems. Existing methods for computing sentence similarity have been adopted from approaches used for long text documents. These methods process sentences in a very high-dimensional space and are consequently inefficient, require...

chapter

An Exemplar-Based Hidden Markov Model with Discriminative Visual Features for Lipreading

Xin Liu, Yiu-Ming Cheung

2014 Tenth International Conference on Computational Intelligence and Security > 90 - 93

2014 Tenth International Conference on Computational Intelligence and Security (CIS)

In this paper, we address an exemplar-based hidden Markov model (HMM) that represents the lip motion activity using visual cues for lipreading. Discriminative visual features, including the geometric shape parameters and a contour-constrained spatial histogram, are selected to represent each lip frame. Then, a set of exemplars associated with the HMM is learned jointly to serve as a typical representation...

chapter

F0 prediction from linear predictive cepstral coefficient

Xueqin Chen, Yibiao Yu, Heming Zhao

2014 Sixth International Conference on Wireless Communications and Signal Processing (WCSP) > 1 - 5

2014 Sixth International Conference on Wireless Communications and Signal Processing (WCSP)

In this paper, we propose a fundamental frequency prediction method intended primarily for voice conversion systems. This paper establishes a Gaussian Mixture Model (GMM) to predict the fundamental frequency based on the Linear Predictive Cepstral Coefficient (LPCC). The model may be the speaker-dependent Gaussian mixture model or the speaker-independent universal background model that is...

chapter

Enhancement of reverberant speech in noisy acoustical environments

Marjan Joorabchi, Seyed Ghorshi, Ali Sarafnia

2014 Sixth International Conference on Wireless Communications and Signal Processing (WCSP) > 1 - 6

2014 Sixth International Conference on Wireless Communications and Signal Processing (WCSP)

Sound waves propagating in an indoor environment hit the surfaces of solid objects and produce reverberant speech signals. Reverberated speech signals in noisy acoustical environments cause problems such as reduced speech intelligibility, difficulty in distinguishing speakers and locating sources, and degraded quality for hands-free telephony, hearing aids, etc. Adaptive filters can be applied to suppress the interfering...

chapter

Instrumentation-driven framework for validation of dataflow applications

Ilya Chukhman, Shuvra S. Bhattacharyya

2014 IEEE Workshop on Signal Processing Systems (SiPS) > 1 - 6

2014 IEEE Workshop on Signal Processing Systems (SiPS)

Dataflow modeling offers a myriad of tools in designing and optimizing signal processing systems. A designer is able to take advantage of dataflow properties to effectively tune the system in connection with functionality and different performance metrics. However, a disparity in the specification of dataflow properties and the final implementation can lead to incorrect behavior that is difficult...

chapter

Security monitoring based on joint automatic speaker recognition and blind source separation

Michele Scarpiniti, Fabio Garzia

2014 International Carnahan Conference on Security Technology (ICCST) > 1 - 6

2014 International Carnahan Conference on Security Technology (ICCST)

The aim of this paper is to introduce an enhanced approach for standard Automatic Speaker Recognition (ASR) systems in noisy environments in conjunction with a Blind Source Separation (BSS) algorithm. The latter is able to discern between interfering noise signals and the reference speech signal, hence it can be considered a necessary preprocessing step. The main problem of the proposed approach...

chapter

SVR-based outlier detection and its application to hotel ranking

Hsien-You Hsieh, Vitaly Klyuev, Qiangfu Zhao, Shih-Hung Wu

2014 IEEE 6th International Conference on Awareness Science and Technology (iCAST) > 1 - 6

2014 IEEE 6th International Conference on Awareness Science and Technology (iCAST)

With the rapid advance in information technology, more and more information exchange platforms appear. People can freely exchange information on these platforms. However, not all information is reliable. To make correct decisions, it is necessary to detect and remove unreliable information. The main purpose of this study is to improve the reliability of hotel ranking by detecting and deleting outlier...

chapter

Artificial robot navigation based on gesture and speech recognition

Ze Lei, Zhao Hui Gan, Min Jiang, Ke Dong

Proceedings 2014 IEEE International Conference on Security, Pattern Analysis, and Cybernetics (SPAC) > 323 - 327

2014 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC)

Human-computer interaction is a hot topic in artificial intelligence. Artificial navigation is an interesting application of human-computer interaction, which controls the action of the target device by speech or gesture information. The main virtue of artificial navigation is that it can control the target device at a distance without any remote control device. This technology can be used in the...

Keywords:
VECTORS
SPEECH

Content availability

Available (596)
None (4)

Publication type

book (450)
article (150)

Keywords

TRAINING (165)
SPEECH RECOGNITION (154)
FEATURE EXTRACTION (144)
HIDDEN MARKOV MODELS (136)
SPEECH PROCESSING (100)
NOISE (82)
ACOUSTICS (81)
MEL FREQUENCY CEPSTRAL COEFFICIENT (66)
SPEAKER RECOGNITION (57)
ACCURACY (55)
MICROPHONES (49)
DATABASES (44)
ESTIMATION (43)
SUPPORT VECTOR MACHINES (41)
COMPUTATIONAL MODELING (40)
SPEECH ENHANCEMENT (40)
ADAPTATION MODELS (35)
NOISE MEASUREMENT (35)
KERNEL (34)
SIGNAL TO NOISE RATIO (34)
MATHEMATICAL MODEL (30)
SPEECH CODING (30)
CORRELATION (29)
DATA MODELS (28)
NOISE REDUCTION (28)
ROBUSTNESS (26)
SIGNAL PROCESSING ALGORITHMS (26)
SPEECH SYNTHESIS (26)
SPEAKER VERIFICATION (25)
SOURCE SEPARATION (23)
CONVERGENCE (22)
BLIND SOURCE SEPARATION (21)
NIST (21)
PRINCIPAL COMPONENT ANALYSIS (21)
SUPPORT VECTOR MACHINE CLASSIFICATION (21)
CONTEXT (20)
REVERBERATION (20)
TRAINING DATA (20)
VOICE CONVERSION (20)
VECTOR QUANTIZATION (19)
ALGORITHM DESIGN AND ANALYSIS (18)
ARTIFICIAL NEURAL NETWORKS (18)
COVARIANCE MATRICES (18)
DICTIONARIES (18)
EMOTION RECOGNITION (18)
TRANSFORMS (18)
COMPLEXITY THEORY (17)
COVARIANCE MATRIX (17)
EDUCATIONAL INSTITUTIONS (17)
GAUSSIAN MIXTURE MODEL (17)
INDEXES (17)
OPTIMIZATION (17)
CLUSTERING ALGORITHMS (16)
EQUATIONS (16)
SIGNAL PROCESSING (16)
ARRAYS (15)
CONFERENCES (15)
STANDARDS (15)
DATA MINING (14)
ECHO CANCELLERS (14)
MAXIMUM LIKELIHOOD ESTIMATION (14)
MFCC (14)
SPECTROGRAM (14)
ADAPTIVE FILTERS (13)
SPEAKER IDENTIFICATION (13)
ADAPTATION MODEL (12)
AUTOMATIC SPEECH RECOGNITION (12)
CEPSTRUM (12)
DECODING (12)
GAUSSIAN PROCESSES (12)
GMM (12)
HMM (12)
JOINTS (12)
NEURAL NETWORKS (12)
SEMANTICS (12)
TESTING (12)
VISUALIZATION (12)
ANALYTICAL MODELS (11)
APPROXIMATION METHODS (11)
CEPSTRAL ANALYSIS (11)
ENTROPY (11)
TRAJECTORY (11)
DEREVERBERATION (10)
FILTER BANKS (10)
IEEE TRANSACTIONS (10)
POLYNOMIALS (10)
TIME-FREQUENCY ANALYSIS (10)
WIENER FILTER (10)
FREQUENCY DOMAIN ANALYSIS (9)
HIDDEN MARKOV MODEL (9)
RELIABILITY (9)
SHAPE (9)
SPARSE REPRESENTATION (9)
SUPPORT VECTOR MACHINE (9)
APPROXIMATION ALGORITHMS (8)
COMPUTER ARCHITECTURE (8)
COST FUNCTION (8)
DIRECTION-OF-ARRIVAL ESTIMATION (8)

INFONA - science communication portal
