In this work we focus on Emarati speaker identification systems in neutral talking environments based on each of Vector Quantization (VQ), Gaussian Mixture Models (GMMs), and Hidden Markov Models (HMMs) as classifiers. These systems have been tested on our collected Emarati speech database which is composed of 25 male and 25 female Emarati speakers using Mel-Frequency Cepstral Coefficients (MFCCs)...
In this paper, we propose a facial expression recognition framework that estimates emotional states by measuring variations in facial activity in the context of human-machine interaction. State-of-the-art approaches frequently estimate the facial expression from a still image, which often causes the input image to be misrecognized as a different emotional expression. In this paper, we deal...
In medical information extraction, many Named Entities (NEs) need to be recognized. However, research on identifying NEs in the field of medicine, such as physician, hospital, disease, and medicine NEs, is currently rare. In this paper we therefore present a new approach to Named Entity Recognition (NER) in the field of medicine based on the Bootstrapping method. This method primarily...
In this paper we evaluate the influence of the selection of key points and their associated features on the performance of word spotting processes. In general, features can be extracted from a number of characteristic points such as corners, contours, skeletons, maxima, minima, crossings, etc. A number of descriptors using different interest point detectors exist in the literature. But the intrinsic variability...
This paper proposes a system for text-independent writer identification based on Arabic handwriting using only 21 features. Gaussian Mixture Models (GMMs) are used as the core of the system. GMMs provide a powerful representation of the distribution of features extracted using a fixed-length sliding window from the text lines and words of a writer. For each writer a GMM is built and trained using...
Writer identification from musical scores is a challenging task. A few pieces of work on writer identification in musical sheets have been published in the literature, but to the best of our knowledge all of this work was performed after removal of the staff lines from the musical scores. In this paper we propose a symbol-independent writer identification framework using HMMs in music scores without removing...
This paper proposes a novel bleed-through removal technique based on learning a color channel that is optimized so that the foreground text is enhanced while at the same time the variability of the background (including the bleed-through) is diminished. The technique is intended to be part of an interactive transcription system in which the objective is obtaining high quality transcriptions with the...
This paper proposes a constrained AdaBoost algorithm for utilizing global features in a dynamic time warping (DTW) framework. Global features are defined as spatial relationships between temporally distant points of a temporal pattern and are useful for representing the global structure of the pattern. An example is the spatial relationship between the first and the last points of a handwritten pattern of...
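As a rough illustration of the two ingredients named in this abstract, the following minimal sketch computes a classic DTW distance between two point sequences and one possible global feature, the displacement between the first and last points. It is not the paper's constrained AdaBoost method; the function names `dtw_distance` and `first_last_offset`, and the choice of Euclidean local cost, are assumptions made for illustration only.

```python
from math import inf, hypot


def dtw_distance(a, b):
    """Classic DTW between two non-empty sequences of (x, y) points.

    Assumption: Euclidean local cost and the standard three-way step
    pattern; the paper's actual local features are not given here.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def first_last_offset(pattern):
    """Hypothetical global feature: offset from the first to the last point."""
    (x0, y0), (x1, y1) = pattern[0], pattern[-1]
    return (x1 - x0, y1 - y0)


if __name__ == "__main__":
    stroke = [(0, 0), (1, 0), (2, 0)]
    slower = [(0, 0), (1, 0), (1, 0), (2, 0)]  # same shape, one repeated point
    print(dtw_distance(stroke, slower))
    print(first_last_offset(stroke))
```

Because DTW only accumulates local, point-to-point costs, a feature such as `first_last_offset` carries structural information that the warping path alone does not capture, which is the motivation the abstract gives for global features.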
A semiautomatic iterative process for the detection of text baselines in historical handwritten document images is presented. It relies on the use of Hidden Markov Models (HMMs) to provide initial text baseline hypotheses, followed by user review in order to produce ground-truth quality results. Using the set of revised baselines as ground truth, the HMMs are retrained before processing the next...
In telephony applications, artificial bandwidth extension (ABE) can be applied to narrowband (NB) calls to enhance speech quality and intelligibility. However, high-band extension is challenging due to insufficient mutual information between the lower and upper frequency bands in speech. As a consequence, estimation errors, particularly for the fricatives /s, z/, lead to annoying artifacts, such...
A motion prediction method using Gaussian Mixture Models (GMM) is applied to a kendo agent (Kendo is a traditional Japanese martial art). Human player motion is measured by a motion capture system, using markers attached to each of the player's joints. Measurement information is converted to a state vector with Euler angles to indicate orientation of the sword and orientation of each part of the player's...
Transmission of data in an infrastructureless mobile ad hoc network (MANET) is performed through cooperation among nodes, which requires devising efficient routing schemes. When nodes in the network may behave in non-cooperative and selfish ways, both the trust and the energy consumption associated with intermediate nodes should be jointly taken into account in order to improve performance...
Missing data theory has recently been used as a solution to the noise robustness issue in Automatic Speech Recognition (ASR). Missing components of the spectrogram can either be reconstructed, as in Spectral Imputation, or simply ignored, as in classifier modification. Most of the research has focused on imputation because of the problems associated with classifier modification approaches...
Much research has been done on developing speech recognition systems in the past decades. However, their performance under speaker variability lags behind that of the human recognition system. To solve this problem, speaker adaptation methods have been proposed. These methods adapt either the acoustic model parameters or the input features of the speech recognition systems to improve their performance....
Due to the information revolution, a huge amount of data is available over the Internet, but retrieving correct and relevant data is not an easy task. The information retrieved from search engines is still far more than a user can handle and manage. Thus there is a need to present the information in an abstract way so that one can easily infer the meaning without reading the whole document. In this...
The World Wide Web evolved so rapidly that it is no longer considered a luxury, but a necessity. That is why the most popular infection vectors currently used by cyber criminals are either web pages or commonly used documents (such as PDF files). In both of these cases, the malicious actions performed are written in JavaScript. Because of this, JavaScript has become the preferred language for spreading...
The maximum likelihood linear regression (MLLR) technique is a well-known approach to parameter adaptation in hidden Markov model (HMM)-based systems. In this paper, we propose the maximum penalized likelihood kernel regression (MPLKR) approach as a novel adaptation technique for HMM-based speech synthesis. The proposed algorithm performs a nonlinear regression between the mean vector of the base...
Gesture recognition is increasingly prominent in the field of HCI, since hand motions and gestures enable users to interact with computers in more natural ways. This paper focuses on two-hand gesture recognition and proposes a Kinect-based method which uses a certain static gesture as the start and end mark of a dynamic gesture. Furthermore, we use an innovative way to extract the feature...
Thanks to the use of lexical and syntactic information, Word Graphs (WG) have been shown to provide competitive Precision-Recall performance, along with fast lookup times, in comparison to other techniques used for Key-Word Spotting (KWS) in handwritten text images. However, a problem of WG approaches is that they assign a null score to any keyword that was not part of the training data, i.e. Out-of-Vocabulary...
A new acoustic model based on deep neural networks (DNNs) has been introduced recently and outperforms the conventional Gaussian mixture model (GMM) in speech recognition on several tasks. However, the number of parameters required by a DNN model is much larger than that of its GMM counterpart. The excessive computational cost hinders the implementation of DNNs in many scenarios. In this paper, a DNN-based...