Speech recognition systems are based on either a parametric or a non-parametric approach. Parametric systems such as HMMs have been the dominant technology for speech recognition in the past decade. Despite many advances and enhancements in the design of these systems, key problems such as long-term temporal dependence have not yet been solved. Recently due to availability...
A recently proposed concept for training reverberation-robust acoustic models for automatic speech recognition using pairs of clean and reverberant data is extended from word models to tied-state triphone models in this paper. The key idea of the concept, termed ICEWIND, is to use the clean data for the temporal alignment and the reverberant data for the estimation of the emission densities. Experiments...
Sequence data plays an important role in data analysis applications, such as sequence classification. One important aspect of sequence data analysis is to obtain labeled sequence data and use a machine learning model to predict the sequence structures. Conditional Random Fields (CRF) is one such machine learning method, and it is widely used in sequential data analysis. This is because CRF can...
The automatic insertion of diacritics in electronic texts is necessary for a number of languages, including French, Romanian, Croatian, Sindhi, Vietnamese, etc. When diacritics are removed from a word and the resulting string of characters is not a word, it is easy to recover the diacritics. However, sometimes the resulting string is also a word, possibly with different grammatical properties or a...
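The easy case described in this abstract, where the stripped string maps back to exactly one word, can be illustrated with a minimal sketch. This is not the authors' method: the toy French lexicon and the helper names `strip_diacritics`, `build_index`, and `restore` are assumptions made for illustration, and Unicode decomposition only covers diacritics encoded as combining marks (letters such as "đ" would need extra handling).

```python
import unicodedata
from collections import defaultdict


def strip_diacritics(word):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(lexicon):
    """Map each diacritic-free form to the set of lexicon words producing it."""
    index = defaultdict(set)
    for word in lexicon:
        index[strip_diacritics(word)].add(word)
    return index


def restore(token, index):
    """Restore diacritics only when exactly one lexicon word matches.

    Ambiguous or unknown tokens are returned unchanged; resolving them
    would require context, which this sketch does not attempt.
    """
    candidates = index.get(token, set())
    if len(candidates) == 1:
        return next(iter(candidates))
    return token


# Hypothetical toy lexicon, for illustration only.
LEXICON = ["été", "élève", "cote", "côte", "côté"]
INDEX = build_index(LEXICON)

print(restore("ete", INDEX))   # unambiguous: a single candidate exists
print(restore("cote", INDEX))  # ambiguous: several candidates, left as-is
```

In this sketch the ambiguous case ("cote", "côte", "côté") is deliberately left unresolved, since it is the hard part that the abstract goes on to discuss.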
Setting out from the point of view that automatic speech recognition (ASR) ought to benefit from data in languages other than the target language, we propose a novel Kullback-Leibler (KL) divergence based method that is able to exploit multilingual information in the form of universal phoneme posterior probabilities conditioned on the acoustics. We formulate a means to train a recognizer on several...
Intrusion-detection systems (IDSs) are essential tools for the security of computer systems. Anomaly detection, which uses knowledge about normal behaviors and attempts to detect intrusions by noting significant deviations, has received increasing attention. In this paper, we introduce an HMM-based method for anomaly detection. The proposed method is composed of two important stages: off-line training...
We propose a statistical recognition method based on an analysis of the grammatical and semantic characteristics of Chinese organization names. The method uses three elements: frequency, part of speech, and word length. We use the data in a mature collection as training data and separately calculate a candidate organization name's word frequency, part of speech and word length of the contribution...
Intrusion-detection systems (IDSs) are essential tools for the security of computer systems. Anomaly detection, which uses knowledge about normal behaviors and attempts to detect intrusions by noting significant deviations, has received increasing attention. In this paper, we introduce a novel framework for anomaly detection. In the proposed method, two widely used statistical learning methods,...
To cope with the tremendous variations of writing styles encountered between different individuals, unconstrained automatic handwriting recognition systems need to be trained on large sets of labeled data. Traditionally, the training data has to be labeled manually, which is a laborious and costly process. Semi-supervised learning techniques offer methods to utilize unlabeled data, which can be obtained...
In this paper, we use hidden Markov prediction tools to predict the behavior state of users in a ubiquitous home network. The state of the user's behavior represents a change of interest in the action of the user. This paper proposes a weight (WEIGHT) for the level of interest in the behavior and the strength of the relation between the behavior and interest, which is the formulation of the...
We propose an automatic method of extracting bibliographies for academic articles scanned with OCR markup. The method uses conditional random fields (CRF) for labeling serially OCR-ed text lines on an article's title page as appropriate names for bibliographic elements. Although we achieved excellent extraction accuracies for some Japanese academic journals, we needed a substantial amount of training...
Traditionally, HMM-based approaches to online Kanji handwriting recognition have relied on a hand-made dictionary, mapping characters to primitives such as strokes or substrokes. We present an unsupervised way to learn a stroke tagger from data, which we eventually use to automatically generate such a dictionary. In addition to not requiring a prior hand-made dictionary, our approach can improve the...
The adaptive boosting (AdaBoost) learning method can improve the performance of a base classifier by mining feature information in depth. However, it is computationally expensive, and a base classifier without suitable accuracy will cause overfitting. In this paper, an improved AdaBoost algorithm using a maximum a posteriori vector quantization model (VQMAP) for speaker identification is presented. A suitable...
In this paper, we use information retrieval (IR) techniques to improve a speech recognition (ASR) system. The potential benefits include improved speed, accuracy, and scalability. Where conventional HMM-based speech recognition systems decode words directly, our IR-based system first decodes subword units. These are then mapped to a target word by the IR system. In this decoupled system, the IR serves...
Chord sequences are a compact and useful description of music, representing each beat or measure in terms of a likely distribution over individual notes without specifying the notes exactly. Transcribing music audio into chord sequences is essential for harmonic analysis, and would be an important component in content-based retrieval and indexing, but accuracy rates remain fairly low. In this paper,...
This paper presents a chunker for Tamil built with machine learning techniques. Chunking is the task of identifying and segmenting text into syntactically correlated word groups. The chunking is done by machine learning techniques, where the linguistic knowledge is automatically extracted from the annotated corpus. We have developed our own tagset for annotating the corpus, which is used for...
Grapheme-based acoustic modeling for Arabic is a demanding research area, since achieving high phonetic transcription accuracy is not yet a completely solved problem. In this paper, we study the use of a pure grapheme-based approach using Gaussian mixture models to implicitly model missing diacritics, and we investigate the effect of Gaussian densities and the amount of training data on speech recognition accuracy. Two...
In this paper, we propose a novel technique of using cross validation (CV) data sampling to construct an ensemble of acoustic models for conversational speech recognition. We further propose using a hierarchical Gaussian mixture model (HGMM) and repartitioning the training data to increase the ensemble size and diversity. The proposed methods are found to work well together for ensemble acoustic modeling....
Port state control (PSC) inspection is the most important mechanism for ensuring maritime safety worldwide. Recently, some SVM-based risk assessment systems have been presented. They estimate the risk of each candidate ship based on its generic factors and inspection history factors in order to select high-risk ships before conducting on-board PSC inspection. However, how to improve the performance of the...
This paper establishes a speaker-independent pronunciation recognition and assessment system with 673 words for Mandarin Chinese within the framework of a Chinese learning system. The recognition part is based on HTK using HMMs (hidden Markov models) and is improved with respect to the acoustic model. Making use of the recognition results and the log-likelihood obtained from the Viterbi decoding,...