The goal of this article is to analyse how the length of utterances affects the performance of an automatic speech recognizer (ASR). Benchmarks of an ASR system were performed for utterances of various lengths on English and Czech corpora. We then attempt to explain the observed phenomena theoretically. Finally, the results are summarized and some conclusions are drawn.
The Bag of Words (BOW) method with spatio-temporal interest points has achieved great performance in human action recognition. However, traditional BOW methods based on vector quantization (VQ) suffer from serious quantization error and lose a large amount of information. There are two main reasons for this: first, the codebook obtained by k-means has no obvious visual interpretation, and second,...
This paper presents a unified framework for recognizing and scoring dance motion using a two-layer classifier, so that the computational complexity is distributed across the two layers. This research examines the performance of a sliding window, a hidden Markov model (HMM), and a conditional random field (CRF) as the first-layer classifier, which segments the input video into a sequence of motion primitive labels. The second layer...
We survey evidence (orthographic, distributional, phonological, and psycholinguistic) in favor of a model of Arabic speech sounds based on the CV unit and extensive use of the silent sukuun vowel. We then construct a small-vocabulary, multi-speaker CV HMM similar to the phonemic HMMs based on tied triphones that are widely used in speech recognizers for English and other European languages. Using experimental...
Connected sensors are on their way to becoming pervasive. While they are often deployed for a single purpose, it is worth taking a second look. In this study, we show that the widespread Netatmo weather station, which is intended to monitor and improve indoor climate, can be used to estimate binary occupancy of individual rooms. We collected data from 11 rooms in 3 apartments including binary occupancy...
Emotion recognition from speech plays an important role in developing affective and intelligent Human-Computer Interaction. The goal of this work is to build an Automatic Emotion Variation Detection (AEVD) system to determine each emotionally salient segment in continuous speech. We focus on emotion detection in angry-neutral speech, which is common in recent studies of AEVD. This study proposes a novel...
The process of assigning a part of speech to every word in a given sentence according to its context is called part-of-speech tagging. Part-of-speech tagging (POS tagging) plays an important role in the area of natural language processing (NLP), including applications such as speech recognition, speech synthesis, natural language parsing, information retrieval, multi-word term extraction, word sense...
This paper describes the ANWRESH (Anncestry Word Recognition from Segmented Historical Documents) competition held at the 14th International Conference on Frontiers in Handwriting Recognition (ICFHR-2014). This competition uses the ANWRESH dataset selected from the 1930 US Census collection, including word bounding box and field lexicon data. Five teams submitted systems for recognizing six fields including...
This paper presents a novel verification approach for improving handwriting recognition systems using a word hypothesis rescoring scheme based on Deep Belief Networks (DBNs). A sequential text recognition system based on a recurrent neural network is first used to provide the N-best recognition hypotheses of word images. Word hypotheses are aligned with the word image to obtain the character boundaries...
Writer identification from musical scores is a challenging task. A few pieces of work on writer identification in musical sheets have been published in the literature, but to the best of our knowledge all of this work was performed after removal of staff lines from the musical scores. In this paper we propose a symbol-independent writer identification framework using HMM in music scores without removing...
A semiautomatic iterative process for the detection of text baselines in historical handwritten document images is presented. It relies on the use of Hidden Markov Models (HMMs) to provide initial text baseline hypotheses, followed by user review in order to produce ground-truth quality results. Using the set of revised baselines as ground truth, the HMMs are re-trained before processing the next...
Multiple Sequence Alignment (MSA) is an NP-hard problem. The complexity of finding the optimal alignment is O(L^N), where L is the length of the longest sequence and N is the number of sequences. Hence, computing the optimal solution is nearly impossible for most datasets. Progressive alignment solves MSA at a much lower computational cost but does not provide accurate solutions because there is a tradeoff between accuracy...
Fuzzy clustering has been extensively used in brain magnetic resonance (MR) image segmentation. However, due to the existence of noise and intensity inhomogeneity, many segmentation algorithms suffer from limited accuracy. In this paper, we propose a fuzzy clustering algorithm with an enhanced spatial constraint for brain MR image segmentation. A novel spatial factor is proposed by incorporating the...
In this article, a non-signature based statistical scanner for metamorphic malware detection, employing feature ranking methods like Term Frequency-Inverse Document Frequency-Class Frequency (TF-IDF-CF), Galavotti-Sebastiani-Simi Coefficient (GSS), Term Significance (TS) and Odds Ratio (OR) is proposed. Malware and benign models for classification are created by considering top ranked features obtained...
Missing data theory has recently been used as a solution to the noise robustness issue in Automatic Speech Recognition (ASR). Missing components of the spectrogram can either be reconstructed, as carried out in Spectral Imputation, or simply ignored, as done in classifier modification. Most of the research has been focused on imputation because of the problems associated with classifier modification approaches...
This paper introduces extensions of a previously proposed range-dependent modified Gilbert model for generation of realistic error patterns. With the proposed extensions our model can be used to generate signal-to-noise ratio trends corresponding to the error patterns for infrastructure-to-vehicle communications. The model parameters are estimated based on realistic highway measurements using off-the-shelf...
Motif detection has emerged as an important task in bioinformatics. Recently, the discovery of motifs that are localized relative to a certain biological area has become an important task in many applications. For example, it is used to discover regulatory sequences beside the transcription start site and the neighborhood of known transcription factor binding sites [1]. Therefore, the idea of context...
Sleep has been shown to be imperative for the health and well-being of an individual. To design intelligent sleep management tools, such as a music-induced sleep-aid device, automatic detection of sleep onset is critical. In this work, we propose a simple yet accurate method for sleep onset prediction, which relies only on the electroencephalogram (EEG) signal acquired from a single frontal electrode...
Recent advances in technology have enabled automatic cardiac auscultation using digital stethoscopes. This in turn creates the need for development of algorithms capable of automatic segmentation of heart sounds. Pediatric heart sound segmentation is a challenging task due to various confounding factors including the significant influence of respiration on children's heart sounds. The current work...
The problem of a correct fall risk assessment is becoming more and more critical with the ageing of the population. In spite of the available approaches allowing a quantitative analysis of the human movement control system's performance, the clinical assessment and diagnostic approach to fall risk assessment still relies mostly on non-quantitative exams, such as clinical scales. This work documents...