Hand gesture recognition is performed on top-view hand images captured by a Time of Flight (ToF) camera in a car. The work attempts to solve two important problems of touchless interaction inside a car: first, low-latency identification of gestures that are unobtrusive for the driver; second, reducing the labelled data required to train learning-based solutions, which is particularly important...
Introducing features that better represent the visual information of speakers during speech production is still an open issue that strongly affects the quality of lip-reading and Audio Visual Speech Recognition (AVSR) tasks. In this paper, three different types of visual features, both image-based and model-based, are investigated in a professional lip-reading task. The simple...
This paper proposes a novel approach for human activity recognition based on body part histograms and Hidden Markov Models. From a depth video frame, body parts are first segmented using a trained random forest. Then, the histograms of the individual body parts are combined to form the histogram features of a depth image. The depth video activity features are then applied to hidden Markov models for training...
In this paper, we propose a large-vocabulary Mongolian offline handwriting recognition system using a hybrid hidden Markov model (HMM) and deep neural network (DNN) architecture, which has shown superior performance on automatic speech recognition (ASR) tasks. We select 50 sub-characters from all shapes of Mongolian letters as the smallest modeling units. First, a set of intensity features is extracted from each...
In this paper we will present our investigations related to contextual modeling for HMM-based handwritten Arabic text recognition. We will, first, discuss the justifications and the need for contextual modeling for handwritten Arabic text recognition. Next, we will discuss the issues related to contextual modeling for Arabic text recognition. Finally, we will present our novel class-based contextual...
A frequency-count-based two-stage classification approach is proposed that combines generative and discriminative modeling principles for online handwritten character recognition. The first-stage classifier, based on a Hidden Markov Model (HMM), returns the top-K ranked characters out of the total N classes. In the second stage, pairwise classifiers for K(K − 1)/2 unique combinations of top-K characters using...
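The two-stage idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the abstract is truncated before it describes the second stage, so the vote counting, the tie-breaking rule, and the names `top_k`, `two_stage_classify` and `pairwise_winner` are all assumptions made for this example. The first-stage scores stand in for HMM log-likelihoods.

```python
from collections import Counter
from itertools import combinations


def top_k(scores, k):
    """Return the k class labels with the highest first-stage scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def two_stage_classify(scores, k, pairwise_winner):
    """Pick a label by counting pairwise wins among the top-k candidates.

    scores: dict mapping label -> first-stage score (assumed HMM log-likelihood).
    pairwise_winner: hypothetical callable (a, b) -> a or b, standing in for a
        discriminative classifier trained on that pair of classes.
    """
    candidates = top_k(scores, k)
    # K candidates give K(K - 1)/2 unique unordered pairs.
    pairs = list(combinations(candidates, 2))
    votes = Counter(pairwise_winner(a, b) for a, b in pairs)
    # Assumed tie-break: prefer the candidate ranked higher by the first stage.
    best = max(candidates, key=lambda c: (votes[c], -candidates.index(c)))
    return best, pairs


# Usage with made-up scores and a toy pairwise rule that always favours "c".
scores = {"a": -10.0, "b": -12.0, "c": -11.0, "d": -30.0}
label, pairs = two_stage_classify(
    scores, k=3, pairwise_winner=lambda a, b: "c" if "c" in (a, b) else a
)
```

With K = 3 the second stage evaluates 3 pairs, so its cost depends on K rather than on the total number of classes N.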
This paper deals with robust modelling of mouth shapes in the context of sign language recognition using deep convolutional neural networks. Sign language mouth shapes are difficult to annotate and thus hardly any publicly available annotations exist. As such, this work exploits related information sources as weak supervision. Humans mainly look at the face during sign language communication, where...
This paper proposes a non-Gaussian approach for biosignal classification based on the Johnson SU translation system. The Johnson system is a normalizing translation that transforms non-normal data to a normal distribution using four parameters, thereby enabling the representation of a wide range of marginal distribution shapes with skewness and kurtosis. In this study, a discriminative model...
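For reference, the standard Johnson SU translation mentioned in the abstract above maps a value x to z = gamma + delta * asinh((x - xi) / lambda), where z is standard normal when x follows the SU distribution. The sketch below only illustrates that mapping and its inverse. The function and parameter names are chosen for this example, and it does not cover parameter estimation or the discriminative model of the paper.

```python
import math


def johnson_su_to_normal(x, gamma, delta, xi, lam):
    """Map an SU-distributed value x to the normal scale.

    gamma and delta are shape parameters; xi is location and lam is scale.
    delta and lam must be positive.
    """
    return gamma + delta * math.asinh((x - xi) / lam)


def normal_to_johnson_su(z, gamma, delta, xi, lam):
    """Inverse mapping: recover x from the normal-scale value z."""
    return xi + lam * math.sinh((z - gamma) / delta)


# Round trip with arbitrary illustrative parameters.
params = dict(gamma=0.5, delta=1.5, xi=2.0, lam=3.0)
z = johnson_su_to_normal(4.2, **params)
x_back = normal_to_johnson_su(z, **params)
```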
This paper addresses the issue of automatic classification of the six universal emotional categories (joy, surprise, fear, anger, disgust, sadness) in the case of static images. Appearance parameters extracted by an active appearance model (AAM) serve as the input to the classification step. We show how Relevant Component Analysis (RCA) in combination with Fisher's Linear Discriminant (FLD)...
Recent developments in depth sensors open up challenging new tasks in computer vision research areas, including human-computer interaction, computer games and surveillance systems. This paper addresses a shape and motion feature approach to observe, track and recognize human silhouettes using a sequence of RGB-D images. Under our proposed activity recognition framework, the required procedure...
Hidden Markov Models (HMMs) are used in the handwritten stroke recognition task. The two design parameters of an HMM are the number of states and the number of mixtures in each state. There are two approaches for choosing the number of states: an equal number of states and a variable number of states. Since the shapes of strokes differ, the variable-number-of-states approach should be beneficial. This...
This paper proposes an improvement of context-dependent modeling for Arabic handwriting recognition. Since the number of parameters in context-dependent models is huge, CART trees are used for state tying. This work is based on a new set of questions for the CART tree construction based on a "lossy mapping" categorization of the Arabic shapes. The system used is a combination of Hidden...
Sub-character HMM models for Arabic text recognition allow sharing of common patterns between different position-dependent shape forms of an Arabic character as well as between different characters. The number of HMMs is reduced considerably while still capturing the variations in shape patterns. This results in a compact, efficient, and robust recognizer with a reduced model set. In the current paper...
In this paper, we propose a method for spotting keywords in images of handwritten text. Relying on an object detection system in real images, local contour features are extracted from segmented word images in order to obtain a representative shape of a word-class. Thus, word spotting is cast following a query-by-word-class scenario where class models are generated using a random subset of the images...
This paper presents a new symbol segmentation method based on AdaBoost with confidence weighted predictions for online handwritten mathematical expressions. The handwritten mathematical expression is preprocessed and rendered to an image. Then for each stroke, we compute three kinds of shape context features (stroke pair, local neighborhood and global shape contexts) with different scales, 21 stroke...
HMM-based analytical methods have been widely used for Arabic handwriting recognition. A key factor influencing the performance of HMM-based systems is the features extracted from a sliding window. In this paper, we propose a novel baseline-independent feature set extracted from a wider sliding window to directly capture the contextual information. This feature set is a combination of center of mass...
This paper proposes a novel way of controllable pitch re-estimation that can produce a better pitch contour or provide diverse speaking styles for text-to-speech (TTS) systems. The method is composed of a pitch re-estimation model and a set of control parameters. The pitch re-estimation model is employed to reduce the over-smoothing effects that are usually introduced by TTS training. The control parameters...
State estimation and control are intimately related processes in robot handling of flexible and articulated objects. For rigid objects, we can generate a CAD model beforehand, and state estimation boils down to estimating the pose or velocity of the object. In the case of flexible and articulated objects, such as a cloth, the representation of the object's state is heavily dependent on the task...
Named entity recognition (NER) is the task of segmenting and classifying occurrences of names in text. In NER, local contextual cues provide important evidence, but non-local information from the whole document could also prove useful: for example, it is useful to know that “Mary Kay Inc.” has been mentioned in a document to classify subsequent mentions of “Mary Kay” as an organization and not as...
This paper proposes a novel graph-based method for representing a human's shape during the performance of an action. Despite their strong representational power, graphs are computationally cumbersome for pattern analysis. One way of circumventing this problem is that of transforming the graphs into a vector space by means of graph embedding. Such an embedding can be conveniently obtained by way of...