Search results

Items from 1 to 20 out of 34 results

article

Dialect Classification via Text-Independent Training and Testing for Arabic, Spanish, and Chinese

Yun Lei, J H L Hansen

IEEE Transactions on Audio, Speech, and Language Processing > 2011 > 19 > 1 > 85 - 96

Automatic dialect classification has emerged as an important area in the speech research field. Effective dialect classification is useful in developing robust speech systems, such as speech recognition and speaker identification. In this paper, two novel algorithms are proposed to improve dialect classification for text-independent spontaneous speech in Arabic and Spanish languages, along with probe...

chapter

Learning from images and speech with Non-negative Matrix Factorization enhanced by input space scaling

Joris Driesen, Hugo Van hamme, W Bastiaan Kleijn

2010 IEEE Spoken Language Technology Workshop > 1 - 6

2010 IEEE Spoken Language Technology Workshop (SLT 2010)

Computational learning from multimodal data is often done with matrix factorization techniques such as NMF (Non-negative Matrix Factorization), pLSA (Probabilistic Latent Semantic Analysis) or LDA (Latent Dirichlet Allocation). The different modalities of the input are to this end converted into features that are easily placed in a vectorized format. An inherent weakness of such a data representation...

chapter

Novel active learning sample evaluation method based on multi-level confusion networks

Wei Chen, Gang Liu, Jun Guo

2010 2nd IEEE International Conference on Network Infrastructure and Digital Content > 134 - 139

2010 2nd IEEE International Conference on Network Infrastructure and Digital Content (IC-NIDC 2010)

Active Learning (AL) is designed to aid the labor-intensive process of training acoustic model for speech recognition. In AL, only the most informative training samples are selected for manual annotation. Thus, how to evaluate the unlabeled samples is worth researching. In this paper, we propose a unified framework to generate confusion networks of multiple levels including character, syllable and...

chapter

Investigating word learning processes in an artificial agent

Michele Gubian, Christina Bergmann, Lou Boves

2010 IEEE 9th International Conference on Development and Learning > 178 - 184

2010 IEEE 9th International Conference on Development and Learning (ICDL 2010)

Researchers in human language processing and acquisition are making an increasing use of computational models. Computer simulations provide a valuable platform to reproduce hypothesised learning mechanisms that are otherwise very difficult, if not impossible, to verify on human subjects. However, computational models come with problems and risks. It is difficult to (automatically) extract essential...

chapter

Audio-Visual Co-Training for Vehicle Classification

M Godec, C Leistner, H Bischof, A Starzacher, et al.

2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance > 586 - 592

7th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS 2010)

In this paper, we introduce a fully autonomous vehicle classification system that continuously learns from large amounts of unlabeled data. For that purpose, we propose a novel on-line co-training method based on visual and acoustic information. Our system does not need complicated microphone arrays or video calibration and automatically adapts to specific traffic scenes. These specialized detectors...

chapter

North Atlantic Right Whale acoustic signal processing: Part I. comparison of machine learning recognition algorithms

Peter J Dugan, Aaron N Rice, Ildar R Urazghildiiev, Christopher W Clark

2010 IEEE Long Island Systems, Applications and Technology Conference > 1 - 6

2010 IEEE Long Island Systems, Applications and Technology Conference (LISAT 2010)

This paper compares three different approaches currently used in recognizing contact calls made from the North Atlantic Right Whale (NRW), Eubalaena glacialis. We present two new approaches consisting of machine learning algorithms based on artificial neural networks (NET) and the classification and regression tree classifiers (CART), and compare their performance with earlier work that employs multi-Stage...

chapter

North Atlantic right whale acoustic signal processing: Part II. improved decision architecture for auto-detection using multi-classifier combination methodology

Peter J Dugan, Aaron N Rice, Ildar R Urazghildiiev, Christopher W Clark

2010 IEEE Long Island Systems, Applications and Technology Conference > 1 - 6

2010 IEEE Long Island Systems, Applications and Technology Conference (LISAT 2010)

Autonomous signal detection of the North Atlantic right whale (NRW), Eubalaena glacialis, is becoming an important factor in monitoring and conservation for this highly endangered species. Both online and offline systems exist to help study and protect animals within this population. In both cases auto-detection of species-specific calls plays a vital role in localizing individual animal by searching...

chapter

GMM-HMM acoustic model training by a two level procedure with Gaussian components determined by automatic model selection

Dan Su, Xihong Wu, Lei Xu

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4890 - 4893

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper investigates the Bayesian Ying-Yang (BYY) learning for speech recognition via Gaussian mixture models (GMMs) based Hidden Markov models (HMMs). A two level procedure is proposed with the hidden Markov level trained still under the maximum likelihood principle by the Baum-Welch algorithm but with the GMMs level trained under the BYY best harmony. We proposed a new batch way EM-like Ying-Yang...

chapter

Leveraging evaluation metric-related training criteria for speech summarization

Shih-Hsiang Lin, Yu-Mei Chang, Jia-Wen Liu, Berlin Chen

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5314 - 5317

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

Many of the existing machine-learning approaches to speech summarization cast important sentence selection as a two-class classification problem and have shown empirical success for a wide variety of summarization tasks. However, the imbalanced-data problem sometimes results in a trained speech summarizer with unsatisfactory performance. On the other hand, training the summarizer by improving the...

chapter

Approaches to automatic lexicon learning with limited training examples

N Goel, S Thomas, M Agarwal, P Akyazi, et al.

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5094 - 5097

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

Preparation of a lexicon for speech recognition systems can be a significant effort in languages where the written form is not exactly phonetic. On the other hand, in languages where the written form is quite phonetic, some common words are often mispronounced. In this paper, we use a combination of lexicon learning techniques to explore whether a lexicon can be learned when only a small lexicon is...

chapter

Recent improvements to the Cambridge Arabic Speech-to-Text systems

M Tomalin, F Diehl, M J F Gales, J Park, et al.

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4382 - 4385

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper describes recent improvements to the Cambridge Arabic Large Vocabulary Continuous Speech Recognition (LVCSR) Speech-to-Text (STT) system. It is shown that Multi-Layer Perceptron (MLP) features trained on phonetic targets can improve the performance of both phonemic and graphemic systems. Also, a morphological decomposition scheme is extended from the graphemic domain to the phonetic domain,...

chapter

Speech modeling based on committee-based active learning

Y Hamanaka, K Shinoda, S Furui, T Emori, et al.

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4350 - 4353

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

We propose a committee-based active learning method for large vocabulary continuous speech recognition. In this approach, multiple recognizers are prepared beforehand, and the recognition results obtained from them are used for selecting utterances. Here, a progressive search method is used for aligning sentences, and voting entropy is used as a measure for selecting utterances. We apply our method...

chapter

Weakly supervised learning with decision trees applied to fisheries acoustics.

R Lefort, R Fablet, J Boucher

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 2254 - 2257

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper addresses the training of classification trees for weakly labelled data. We call “weakly labelled data”, a training set such as the prior labelling information provided refers to vector that indicates the probabilities for instances to belong to each class. Classification tree typically deals with hard labelled data, in this paper a new procedure is suggested in order to train a tree from...

chapter

Combining image-level and object-level inference for weakly supervised object recognition. Application to fisheries acoustics

R. Lefort, R. Fablet, I. Karoui, J.-M. Boucher

2009 16th IEEE International Conference on Image Processing (ICIP) > 293 - 296

2009 16th IEEE International Conference on Image Processing (ICIP 2009)

This paper addresses weakly supervised object recognition. We show how the combination of an image-level inference, in terms of image-level object class priors, can lead to better training of object recognition models. Stated within a probabilistic setting, the proposed approach is applied to fisheries acoustics and fish school recognition.

chapter

Acoustic Fault Identification of Underwater Vehicles Based on NSOM-PNN

Ruipeng Luan, Kerong Ben, Lilin Cui

2009 International Conference on Artificial Intelligence and Computational Intelligence > 2 > 384 - 388

2009 International Conference on Artificial Intelligence and Computational Intelligence (AICI 2009)

Aiming at the requirement of class incremental learning in acoustic fault identification research, a network model using a novel Self-organizing map--negative self-organizing map (NSOM) and probabilistic neural network (PNN) is proposed. The experiment of acoustic fault identification of underwater vehicle shows that the proposed network has better capability of class incremental learning than traditional...

chapter

Effect of gaussian densities and amount of training data on grapheme-based acoustic modeling for Arabic

M. Elmahdy, R. Gruhn, W. Minker, S. Abdennadher

2009 International Conference on Natural Language Processing and Knowledge Engineering > 1 - 5

2009 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE)

Grapheme-based acoustic modeling for Arabic is a demanding research area since high phonetic transcription accuracy is not yet solved completely. In this paper, we are studying the use of a pure grapheme-based approach using Gaussian mixture model to implicitly model missing diacritics and investigating the effect of Gaussian densities and amount of training data on speech recognition accuracy. Two...

chapter

Mandarin pitch accent prediction using hierarchical model based ensemble machine learning

Chongjia Ni, Wenju Liu, Bo Xu

2009 IEEE Youth Conference on Information, Computing and Telecommunication > 327 - 330

2009 IEEE Youth Conference on Information, Computing and Telecommunication (YC-ICT 2009)

In this study, we combine the Mandarin characteristics with Mandarin acoustic attribute and text information and use hierarchical model based ensemble machine learning to predict Mandarin pitch accent. Our model could make the best of advantages of prosody hierarchical structure and ensemble machine learning. When comparing our model with classification and regression tree (CART), support vector machine...

chapter

Weakly supervised classification with bagging in fisheries acoustics

R. Lefort, R. Fablet, J.-M. Boucher

2009 IEEE International Symposium on Intelligent Signal Processing > 143 - 146

6th IEEE International Symposium on Intelligent Signal Processing. WISP 2009

Statistical training allows the establishment of a probabilistic classification model. In the supervised case, the model is assessed from a labelled dataset, i.e. each observed data has a label. In the weakly-supervised case, the label is not exactly known. In our instance, the probability to associate the observation to the different classes is known. Thus, labels for the data are a probability vector...

chapter

Modified MPE/MMI in a transducer-based framework

G. Heigold, R. Schluter, H. Ney

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 3749 - 3752

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper we show how common training criteria like for example MPE or MMI can be extended to incorporate a margin term. In addition, a transducer-based training implementation is presented, which covers a large variety of discriminative training criteria for ASR, including the standard MMI, MPE, and MCE criteria, as well as the modifications to these criteria presented here. The modified criteria...

chapter

Perturbation and pitch normalization as enhancements to speaker recognition

A. Lawson, M. Linderman, M. Leonard, A. Stauffer, et al.

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4533 - 4536

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

This study proposes an approach to improving speaker recognition through the process of minute vocal tract length perturbation of training files, coupled with pitch normalization for both train and test data. The notion of perturbation as a method for improving the robustness of training data for supervised classification is taken from the field of optical character recognition, where distorting characters...

Keywords:
ACOUSTICS
LEARNING (ARTIFICIAL INTELLIGENCE)

Publication type

book (31)
article (3)

Keywords

SPEECH RECOGNITION (17)
SPEECH (13)
ACOUSTIC SIGNAL PROCESSING (12)
HIDDEN MARKOV MODELS (10)
PROBABILITY (7)
DATA MODELS (6)
LATTICES (6)
MACHINE LEARNING (6)
AQUACULTURE (5)
CLASSIFICATION ALGORITHMS (5)
FEATURE EXTRACTION (5)
NATURAL LANGUAGE PROCESSING (5)
SUPPORT VECTOR MACHINES (5)
TRAINING DATA (5)
ACCURACY (4)
ACTIVE LEARNING (4)
ARTIFICIAL NEURAL NETWORKS (4)
COMPUTATIONAL MODELING (4)
EDUCATIONAL INSTITUTIONS (4)
FISHERIES ACOUSTICS (4)
PATTERN CLASSIFICATION (4)
SPEECH PROCESSING (4)
ADAPTATION MODEL (3)
AUTOMATIC SPEECH RECOGNITION (3)
BAYES METHODS (3)
DICTIONARIES (3)
MAXIMUM LIKELIHOOD ESTIMATION (3)
OBJECT RECOGNITION (3)
SPEECH RECOGNITION SYSTEMS (3)
SUPERVISED CLASSIFICATION (3)
ACOUSTIC MODEL (2)
ACOUSTIC MODELING (2)
ACOUSTIC MONITORING (2)
ARTIFICIAL NEURAL NETWORK (2)
AUTOMATED DETECTION (2)
BAYESIAN METHODS (2)
CONFIDENCE MEASURE (2)
CONFUSION NETWORK (2)
DATABASES (2)
DECODING (2)
ENTROPY (2)
ERROR ANALYSIS (2)
FEATURE SELECTION (2)
GAUSSIAN MIXTURE MODEL (2)
GAUSSIAN PROCESSES (2)
HISTOGRAMS (2)
IMAGE CLASSIFICATION (2)
KERNEL (2)
KULLBACK-LEIBLER DIVERGENCE (2)
LEARNING SYSTEMS (2)
MATHEMATICAL MODEL (2)
MATRIX DECOMPOSITION (2)
MULTILAYER PERCEPTRON (2)
MULTILAYER PERCEPTRONS (2)
NATURAL LANGUAGES (2)
NEURAL NETS (2)
NOISE (2)
OPTIMIZATION (2)
PATTERN RECOGNITION (2)
PEDIATRICS (2)
RIGHT WHALE (2)
ROBUSTNESS (2)
SIGNAL CLASSIFICATION (2)
SIGNAL PROCESSING (2)
SPEECH SYNTHESIS (2)
SUPERVISED LEARNING (2)
SUPPORT VECTOR MACHINE (2)
VOCABULARY (2)
WEAKLY SUPERVISED LEARNING (2)
WHALES (2)
WORD ERROR RATE (2)
5.2 GROUNDING OF KNOWLEDGE AND REPRESENTATIONS (1)
6.1 LANGUAGE LEARNING (1)
6.8 STATISTICAL LEARNING (1)
ACORNS PROJECT (1)
ACOUSTIC ARRAYS (1)
ACOUSTIC DISTORTION (1)
ACOUSTIC FAULT IDENTIFICATION (1)
ACOUSTIC IMAGE (1)
ACOUSTIC MODEL TRAINING (1)
ACOUSTIC MODELLING (1)
ACOUSTIC MODELS (1)
ACOUSTIC PERTURBATION (1)
ACOUSTIC RECORDINGS (1)
ACOUSTIC SIGNAL (1)
ACOUSTIC SIGNAL DETECTION (1)
ACOUSTIC SIGNALS (1)
ACOUSTIC WAVES (1)
ACTIVE LEARNING SAMPLE EVALUATION METHOD (1)
ADABOOST (1)
ADAPTIVE LEARNING RATE BACK-PROPAGATION (1)
AGRICULTURAL PRODUCTS (1)
AGRICULTURE (1)
ANALYTICAL MODELS (1)
ANN (1)
ARABIC (1)
ARABIC DIALECTS (1)

INFONA - science communication portal
