This document describes the implementation of a recognition system based on Hidden Markov Models (HMM). For this recognition system, we use parameterization techniques that model the mechanisms of the human ear, namely Mel-Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP), applied to the TIMIT database. We also used two indices, Jitter and Shimmer...
Clinical pathways are popular healthcare management tools to standardise care and ensure quality. Measuring pathway conformance and analysing variances gives valuable feedback in the context of care improvement trajectories. The Business Process Model and Notation (BPMN) language and Task-Time matrices are popular ways to model clinical pathways. A key step in variance analysis involves the computation...
Heart disease is one of the major causes of death. In this study, we used ECG data obtained from the MIT-BIH database to classify arrhythmias. We selected five classes: normal beat (N), right bundle branch block (RBBB), left bundle branch block (LBBB), atrial premature contraction (APC) and ventricular premature contraction (VPC). We applied the k-means based Polyhedral Conic Functions (k-means PCF) algorithm...
Knowledge discovery is the process of extracting useful or hidden patterns in data. With the growth of data in a structural form, such as social networks, extracting knowledge from data represented in the form of graphs is an emerging technique. In this paper, we demonstrate how "skills" data from resumes (i.e., what skills an applicant possesses) can be modelled into a type of graph data...
Handwritten word recognition is a tough task, mixing image and natural language processing. Recently, new recurrent neural networks with LSTM cells have allowed significant improvements in this field. These networks are generally coupled with lexical and linguistic knowledge in order to correct character misrecognitions, namely by using lexicon-driven decoding. Yet the high performance of LSTM networks...
Psychological disorders are common in society, and predicting such disorders is necessary in day-to-day life. An electroencephalogram (EEG) signal records the neuronal activity of the brain. Brain signals play an important role in detecting human disposition. EEG signals are non-linear in nature. In this paper, EEG signals are classified into four emotional states such as happy, angry, cry and...
Gesture recognition has attracted attention in computer vision owing to its many applications. However, video-based large-scale gesture recognition still faces many challenges, since factors such as the background may reduce accuracy. To achieve gesture recognition with large-scale videos, we propose a method based on RGB-D data. To learn gesture details better, the inputs are expanded into 32-frame...
In this paper, a study is conducted on combining analytical and holistic strategies for handwriting recognition. Even though the large majority of recent systems with high recognition rates adopt analytical strategies, physiological scientists suggest that the holistic strategy is the key to realizing near-human performance. In what we believe is a fresh perspective on handwriting recognition, combining...
State-of-the-art approaches to text-to-speech (TTS) synthesis, such as unit selection and HMM synthesis, are data-driven. Therefore, they use a prerecorded corpus of natural speech to build a voice. This paper investigates the influence of the size of the speech corpus on five different perceptual quality dimensions. Six German unit selection voices were created based on subsets of different sizes...
For acoustic modeling, the use of DNN has become popular due to its superior performance improvements observed in many automatic speech recognition (ASR) tasks. Typically, DNNs with deep (many layers) and wide (many hidden units per layer) architectures are chosen in order to achieve good gains. An issue with such approaches is that there is an explosion in the number of learnable parameters. Thus,...
In this paper, we analyze the feasibility of using a single well-resourced language, English, as a source language for multilingual techniques in the context of a Stacked Bottle-Neck tandem system. The effect of the amount of data and the number of tied-states in the source language on the performance of the ported system is evaluated together with different porting strategies. Generally, increasing data amount and level-of-detail...
Automatic speech recognition (ASR) of code-switching speech requires careful handling of unexpected language switches that may occur in a single utterance. In this paper, we investigate the feasibility of using multilingually trained deep neural networks (DNN) for the ASR of Frisian speech containing code-switches to Dutch with the aim of building a robust recognizer that can handle this phenomenon...
Automatic semantic annotation of data from databases or the web is an important pre-process for data cleansing and record linkage. It can be used to resolve the problem of imperfect field alignment in a database or identify comparable fields for matching records from multiple sources. The annotation process is not trivial because data values may be noisy, such as abbreviations, variations or misspellings...
We propose to detect mispronunciations in a language learner's speech via a discriminatively trained DNN in the phonetic space. The posterior probabilities of “senones” populated in a decision tree are trained and predicted speaker independently. Acoustic features of each input segment (with preceding and succeeding contexts of several frames) are mapped onto the whole set of senones in their corresponding...
Man-machine dialogue is very difficult to put in place, but it remains an important issue for helping people with speech problems. To this end, we have oriented our work around a control interface that includes multiple communication tools: videoconferencing between a doctor and a patient, using a camera and Skype, through which communication can be established and by which the doctor can make an initial...
Real handwriting authentication systems need robust writer identification over a long time period. The paper analyzes signature sessions of the ATV-Signature Long Term Database (ATV-SLT DB). The database contains 6 sessions generated by 27 users over 15 months. The change in quality of the verification results over a period of 15 months is examined. 64 static and dynamic biometric features from the ATV-SLT...
This paper presents a text-dependent speaker verification system using Mel-Frequency Cepstral Coefficients (MFCC) and a Support Vector Machine (SVM). The MFCC technique is used to extract features from the user's recorded voice, and the SVM is used to classify all the models of the speakers and impostors. A Malay spoken digit database is utilized for the training...
This work focuses on Emirati speaker verification systems in neutral talking environments based on each of First-Order Hidden Markov Models (HMM1s), Second-Order Hidden Markov Models (HMM2s), and Third-Order Hidden Markov Models (HMM3s) as classifiers. These systems have been evaluated on our collected Emirati speech database, which comprises 25 male and 25 female Emirati speakers, using Mel-Frequency...
To better reuse motion capture data, complex motion sequences should be segmented into distinct behaviors. As we move toward collecting longer motion sequences, automatic behavior segmentation techniques are becoming important. In this paper, we propose a method for automated segmentation of motion capture data into distinct behaviors. We employ a Gaussian Mixture Model (GMM) to model the...
Building synthetic child voices is considered a difficult task due to the challenges associated with data collection. As a result, speaker adaptation in conjunction with Hidden Markov Model (HMM)-based synthesis has become prevalent in this domain because the approach caters for limited amounts of data. An initial average voice model is trained using data from multiple speakers and adapted to resemble...