Deep Neural Networks (DNN) are currently the dominant technique in English and Chinese speech recognition. However, research on Tibetan speech recognition started late and mainly uses Hidden Markov Models (HMM). In this paper, we show a better method that replaces Gaussian Mixture Models (GMM) with DNN in a Tibetan Lhasa dialect speech recognition system. The system contains seven layers of features...
There are many challenges in single-channel multi-person mixed speech separation, such as modeling the temporal continuity of the speech signals and improving the frame separation performance simultaneously. In this paper, a separation method based on Deep Clustering with local optimization by the improved Non-Negative Matrix Factorization (NMF) combined with Factorial Conditional Random Fields (FCRF)...
The paper proposes a classification model for recognizing human behavioral patterns in which decisions are made by several Support Vector Machine (SVM) classifiers within a multi-level decision structure. SVMs are suitable for applications in which the input feature spaces are very large, involving many features. Human behavior recognition is a relevant example of such an application...
Predicting the locations of Response Elements (RE) has received considerable attention in the field of gene sequence analysis and bioinformatics. Protein53 (p53) has a prominent role in the cell cycle and cancer prevention; it functions as a transcription factor and binds with p53 REs in the DNA. The identification of p53 response elements sheds light on the unknown functions and characteristics of p53...
Due to the variability of writing styles and to other problems related to the nature of Arabic scripts, the recognition of Arabic handwriting is still awaiting accurate results. Segmentation of Arabic handwritten words into graphemes poses a major challenge in Arabic handwriting recognition and is highly error prone. In this paper, we adopt the holistic approach which handles the whole word image...
The use of deep neural networks (DNNs) for feature extraction and Gaussian mixture models (GMMs) for acoustic modelling is often termed a tandem system configuration and can be viewed as a Gaussian mixture density neural network (MDNN). Compared to the direct use of DNN output probabilities in the acoustic model, the tandem approach suffers from a major weakness in that the feature extraction stage...
Long short-term memory (LSTM) recurrent neural network based language models are known to improve speech recognition performance. However, significant effort is required to optimize network structures and training configurations. In this study, we automate the development process using evolutionary algorithms. In particular, we apply the covariance matrix adaptation-evolution strategy (CMA-ES), which...
Computational analysis of transcription factor binding sites (TFBS) is one of the most challenging topics in bioinformatics. A set of TFBS sequences is a type of multiple sequence alignment (MSA). Thus, the hidden Markov model (HMM), as a powerful tool to model MSA, has been extensively applied in TFBS analysis. However, with the sizes of TFBS problems, training HMM in a deterministic way is computationally...
This paper proposes an efficient algorithm for training hidden conditional random fields (HCRFs) for large-scale speaker recognition, in which a speaker identification task with around 1000 speakers is investigated. HCRFs are a type of direct model in pattern recognition, and thus iterative procedures are usually required to estimate the model parameters. The key method in this paper is to perform...
A deep neural network (DNN) is trained by mini-batch optimization based on the stochastic gradient descent algorithm. Such stochastic learning suffers from instability in parameter updating and may easily become trapped in a local optimum. This study deals with the stability of stochastic learning by reducing the variance of gradients in the optimization procedure. We upgrade the optimization from the...
Deep learning has been shown to outperform other machine learning methods in numerous research fields. However, previous approaches, such as multispace probability distribution hidden Markov models, still surpass deep learning methods in the prediction accuracy of speech fundamental frequency (F0), inter alia, due to its discontinuous behavior. The current research focuses on the application of feedforward...
The paper proposes an innovative supervised learning method for human behavior recognition in which the behavioral patterns are classified according to the importance of the classes. A detector classifier is trained to recognize the human behavioral patterns belonging to the most important class. The optimization is performed by fixing the classifier operating point to provide the appropriate performance...
Lifetime prediction of a technical system also plays a significant role in avoiding breakdowns. The first part of this contribution is a brief review of lifetime models, followed by an introduction of a new parametric lifetime model. Experimental data for the lifetime model training and evaluation are taken from a tribological system describing a wear process. The main focus of...
Linear Discriminant Analysis (LDA) has been applied successfully to speech recognition tasks, improving accuracy and robustness against some types of noise. However, it is well known that LDA suffers from some weaknesses if the distributions are not unimodal or when the means of the distributions are shared. In this paper, we propose to take advantage of the nonlinear discriminant properties of the...
In this paper we present an investigation of sequence-discriminative training of deep neural networks for automatic speech recognition. We evaluate different sequence-discriminative training criteria (MMI and MPE) and optimization algorithms (including SGD and Rprop) using the RASR toolkit. Further, we compare the training of the whole network with that of the output layer only. Technical details...
This paper proposes a statistical methodology based on evolving Fuzzy-rule-based (FRB) classifiers to develop dialog managers for spoken dialog systems. The dialog managers developed by means of our proposal select the next system action by considering a set of dynamic rules that are automatically obtained by means of the application of the FRB classification process. Our approach has the main advantage...
A model is proposed to develop the sentence pitch-contour of an indigenous language (Galo) with sentence-wide optimization, called the sentence pitch-contour, using a Hidden Markov Model (HMM) and vector quantization (VQ). To develop a sentence pitch-contour (SPC-HMM), each training sentence is normalized for the pitch-contours of the syllables. Our model is effective for pitch height normalization...
Support vector machines (SVM) were originally developed for binary classification and later extended to multi-class classification. Because of their power and adaptability to hard classification problems, we have chosen them for automatic speech recognition (ASR). The aim of this paper is to investigate the use of SVM multi-class classification coupled with HMM for TIMIT phones. SVM requires that all...
In this paper we present a method for identifying temporal patterns that are predictive of events in a dynamic data system. The proposed MRPS-HMM method applies a hybrid model using Reconstructed Phase Space (RPS) and stochastic state estimation via a Hidden Markov Model (HMM) to search for predictive patterns. This method constructs a multivariate phase space by embedding each data sequence with...
The design processes and methods of a PHM system based on HMM have been investigated. HMM has some advantages in dealing with small sample sizes and in achieving high discrimination accuracy. The rationality of a sensor set based on the hidden Markov model has been evaluated from a quantitative point of view. Then a method for evaluating different sensor sets based on HMM has been put forward. At last,...