Several methods exist for part-of-speech tagging, the problem of assigning a part-of-speech tag to each word of a text. In this paper, we conducted part-of-speech tagging experiments for Bahasa Indonesia using statistical approaches (Unigram, Hidden Markov Models) and Brill's tagger. In this study, we used a supervised POS tagging approach requiring a large number of...
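As a rough illustration of the Unigram approach mentioned above, the sketch below tags each word with the tag it most often carried in training data. It is not the authors' implementation: the function names `train_unigram` and `tag_words`, the toy sentences, the tag labels, and the fallback tag `NN` are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_unigram(tagged_sentences):
    """Map each lowercased word to the tag it was seen with most often."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag_words(model, words, default="NN"):
    """Tag words with the learned tag; unseen words get the assumed default."""
    return [(word, model.get(word.lower(), default)) for word in words]


# Toy training data with a hypothetical tagset (not from the paper).
training = [
    [("saya", "PRP"), ("makan", "VB"), ("nasi", "NN")],
    [("dia", "PRP"), ("makan", "VB"), ("roti", "NN")],
]
model = train_unigram(training)
tagged = tag_words(model, ["dia", "makan", "nasi", "goreng"])
```

A unigram tagger ignores context entirely, which is why it usually serves as a baseline against sequence models such as HMMs.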
In this paper we describe a systematic procedure for implementing a two-stage keyword spotting (KWS) system. In the first stage, a phonetic decoding of continuous speech is obtained using a CD-DNN-HMM model built with the Kaldi toolkit. In the second stage, the resulting phonetic transcriptions are used to construct a system that searches for the keywords embedded in continuous speech using the classification...
Most traditional keyword recognition methods based on template matching do not need training data and rely only on frame matching. However, their recognition speed is relatively slow, which makes them impractical. The LVCSR-based method needs to convert the speech signal into text before recognition, which has an important impact on the final recognition performance. In this paper, we propose a method...
In this work, Fuzzy kNN (FkNN), an alternative to the standard kNN algorithm, is used for TIMIT phoneme recognition. The phoneme is the smallest unit of speech. For this reason, good phoneme recognition can contribute significantly to word and text recognition. Thus, the main idea consists in assigning phoneme membership to the data phonemes by measuring the distance to its kNN...
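The membership idea can be sketched as follows, assuming the commonly used inverse-distance weighting with fuzzifier m and crisp training labels. The abstract does not state the exact formulation used in the paper, so the function name `fuzzy_knn`, the parameters `k`, `m` and `eps`, and the toy data are all illustrative assumptions.

```python
import math
from collections import defaultdict


def fuzzy_knn(query, samples, labels, k=3, m=2.0, eps=1e-9):
    """Return {label: membership} for `query`; memberships sum to 1.

    Each of the k nearest neighbours votes for its label with weight
    1 / d^(2/(m-1)), where d is its Euclidean distance to the query.
    """
    nearest = sorted(
        (math.dist(query, sample), label)
        for sample, label in zip(samples, labels)
    )[:k]
    exponent = 2.0 / (m - 1.0)
    # eps avoids division by zero when the query coincides with a sample.
    weights = [(1.0 / (d ** exponent + eps), label) for d, label in nearest]
    total = sum(w for w, _ in weights)
    memberships = defaultdict(float)
    for w, label in weights:
        memberships[label] += w / total
    return dict(memberships)


# Toy 2-D feature vectors standing in for acoustic features (hypothetical).
samples = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = ["a", "a", "b", "b"]
result = fuzzy_knn((0.0, 0.5), samples, labels, k=3)
```

Unlike standard kNN, the output is a graded membership per class, so a query near a class boundary is reported as ambiguous instead of being forced into one class.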
In this paper, we propose a system for automatically segmenting speech into phonemes for the Arabic language. The segmentation is based on two different techniques: Hidden Markov Models (HMM) and Artificial Neural Networks (ANN). Both systems were used to classify the speech signals, extracted from the ALGASD corpus (ALGerian Arabic Speech Database), into five classes: fricatives, plosives, nasals, liquids...
The way the Qur'an is read and its sounds are pronounced depends on the type of recitation, for example the recitation of Warsh or the recitation of Hafss. It is very important to recognise the type of recitation, especially given the diversity and spread of Qira'at in the world. This research presents a speech recognition system that distinguishes between the different types of the Qur'an...
Speech recognition is a widely researched topic around the world. It is the process of converting speech to text. Many scientists and researchers are working to improve the performance of speech recognition systems. Most of the world's languages have a speech recognizer of their own, but for our mother tongue, Bangla, there is no working speech recognizer. This work is a small attempt to build...
In general, in any speech processing system, speech signals are pre-processed at the front end to extract features, and estimation is performed at the back end. The Hidden Markov Model is exclusively used for modeling time-varying vector sequences due to its simplicity. It also provides high accuracy in non-stationary environments. In this paper, the HTK (Hidden Markov Model Toolkit) is used for compiling...
A speech recognition system is a system that listens to human speech and compares it with words or phrases that have been prepared in advance in order to obtain the data. To optimize the model, we employ fireflies to search for the λ = (A, B, π) that produces the maximum log-likelihood. To evaluate the performance, we compare the technique with PSO. The firefly algorithm outperforms PSO in constructing HMM for speech...
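The objective being maximized, the log-likelihood of an observation sequence under λ = (A, B, π), can be computed with the scaled forward algorithm. The sketch below shows only this objective for a discrete-observation HMM, which is an assumption since the abstract does not say what observation model is used; the firefly search itself is not shown, and the function name `log_likelihood` and the toy parameters are illustrative.

```python
import math


def log_likelihood(A, B, pi, obs):
    """Log P(obs | A, B, pi) for a discrete HMM via the scaled forward pass.

    A[i][j]: transition probability from state i to state j
    B[i][o]: probability that state i emits symbol o
    pi[i]:   initial probability of state i
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    scale = sum(alpha)
    log_p = math.log(scale)
    alpha = [a / scale for a in alpha]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
        # Rescaling keeps alpha from underflowing on long sequences.
        scale = sum(alpha)
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p


# Toy two-state, two-symbol model (hypothetical values).
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]
ll = log_likelihood(A, B, pi, [0, 1])
```

A population-based optimizer such as the firefly algorithm or PSO would call a function like this as its fitness measure for each candidate λ.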
The article discusses aspects of incorporating shared linear transformations, used to implement semi-tied covariance matrices, into the MASPER HMM training procedure. The focus is on heteroscedastic linear discriminant analysis (HLDA) applied to speech features. Next, the main implementation issues and the necessary modifications to the standard MASPER training procedure are introduced. Finally an evaluation of the...
This paper designs a software system for the smartphone platform. The purpose of the system is to provide a reasonable method for evaluating the English accent of non-native speakers, based on phoneme recognition and fluency assessment and making use of the Hidden Markov Model (HMM). In addition, the paper uses a neural network algorithm to combine the objective scoring and experts' scoring to increase the accuracy...
This paper explores a novel hybrid approach for classifying sequential data such as isolated spoken words. The approach combines a hidden Markov model (HMM) with a spiking neural network (SNN). The HMM, consisting of states and transitions, forms a fixed backbone with nonadaptive transition probabilities. The SNN, however, implements a Bayesian computation by using an appropriately selected spike...
In this paper, we propose a relevance vector machine (RVM) for modeling and generation of a speech feature sequence. In the conventional method, the mean parameter of the hidden Markov model (HMM) state cannot take into account the temporal correlation among corresponding data frames. Since the RVM can be used to solve a nonlinear regression problem, we apply it to replace the model parameters of the state...
Automatic Speech Recognition (ASR) is the process of converting human speech, in the form of an acoustic waveform, into text. In this paper we discuss building an automatic speech recognition system for Telugu news. A Telugu speech database is prepared along with the transcription and dictionary. Telugu speech files are collected from Telugu TV news channels. Most of the selected...
This paper presents an HMM-based synthesis approach for speech-laughs. The foundation of this project was the idea that smile and laughter bursts co-occur in varying proportions within amused speech utterances. A corpus with three complementary speaking styles was used to train the underlying HMM models: neutral speech, speech-smile, and finally laughter in different articulatory configurations...
We build and compare phoneme recognition systems based on Gaussian Mixture Modeling (GMM), which is a static modeling scheme, and Hidden Markov Modeling (HMM), which is a dynamic modeling scheme. Both models were built using stochastic pattern recognition and acoustic-phonetic schemes to recognise phonemes. Since our native language is Kannada, a rich South Indian language, we have used 15 Kannada...
In this work, a novel approach of linear transformation on the speech subspace is used to preserve the properties of the speech signal under stress conditions. It is assumed that there exists a subspace, called the speech subspace, that contains the properties of the speech signal under both neutral and stress conditions. Therefore, the speech component of stressed speech is determined by linear transformation...
We build an automatic phoneme recognition system based on Hidden Markov Modeling (HMM), which is a dynamic modeling scheme. Models were built using stochastic pattern recognition and acoustic-phonetic schemes to recognise phonemes. Since our native language is Kannada, a rich South Indian language, we have used 15 Kannada phonemes to train and test these models. Since Mel-Frequency Cepstral Coefficients...
In this paper, we address an exemplar-based hidden Markov model (HMM) that represents lip motion activity using visual cues for lipreading. Discriminative visual features, including the geometric shape parameters and a contour-constrained spatial histogram, are selected to represent each lip frame. Then, a set of exemplars associated with the HMM is learned jointly to serve as a typical representation...
Evaluating the accuracy of HMM-based and SVM-based spotters in detecting keywords and recognizing the true place of keyword occurrence shows that the HMM-based spotter detects the place of occurrence more precisely than the SVM-based spotter. On the other hand, the SVM-based spotter performs much better in detecting keywords and has a higher detection rate. In this paper, we propose a rule-based combination...