We present an unsupervised method of learning action symbols from video data, which self-tunes the number of symbols to effectively build hierarchical activity grammars. A video stream is given as a sequence of unlabeled segments. Similar segments are incrementally grouped to form a hierarchical tree structure. The tree is cut into clusters where each cluster is used to train an action symbol. Our...
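The grouping-then-cutting idea in the abstract above can be illustrated with a small sketch. This is not the authors' method: it assumes each segment is summarized by a single hypothetical scalar feature, uses plain single-linkage agglomerative clustering, and stops merging at an assumed distance threshold `cut_distance`, so the number of clusters (candidate action symbols) falls out of the data rather than being fixed in advance.

```python
def agglomerate(features, cut_distance):
    """Single-linkage agglomerative clustering, cut at cut_distance.

    features: list of floats, one hypothetical descriptor per video segment.
    Returns a list of clusters, each a sorted list of segment indices.
    """
    clusters = [[i] for i in range(len(features))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(abs(features[i] - features[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > cut_distance:
            break  # cutting the tree here: remaining clusters stay separate
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


segments = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
symbols = agglomerate(segments, cut_distance=1.0)
```

With these illustrative values the six segments fall into three groups; raising `cut_distance` merges more groups and yields fewer symbols.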
Basic understanding and recognition of human actions can be accomplished by modeling the spatiotemporal relationship among major skeletal joints. In this work we present an approach that models human actions using temporal causal relations of joint movements. The relations form a graph with joints as nodes and edges induced by the Granger causality measure between pairs of joint point processes. Each...
Trajectory analysis is essential to human behavior analysis in video-based smart surveillance systems. The challenge is that human trajectories have no prior model and must be learned and updated online, while interaction between targets complicates the problem. This paper describes a novel integrated framework for multiple human trajectory detection, learning and analysis in complicated...
In this paper, we further develop research on recognizing activities in videos recorded with wearable cameras, using Hierarchical Hidden Markov Model classifiers. Because the visual scenes are highly complex in terms of motion and visual content, good performance has been obtained using multiple visual and audio cues. The adequate fusion of features from physically different description...
Early recognition of human activities is a highly desirable functionality for many visual intelligent systems. However, in computer vision, very little work has been devoted to this challenging and interesting task. In this paper, we address human activity early recognition as a pattern recognition problem of time series data. A new model called ARMA-HMM is introduced to integrate both the predictive...
Facial emotion recognition, the detection of emotion states from video of facial expressions, has applications in video games, medicine, and affective computing. While there have been many advances, no approach has yet been shown to perform well on the non-trivial Audio/Visual Emotion Challenge 2011 data set. A majority of approaches still employ single frame classification, or temporally aggregate...
Ground-penetrating radar systems are useful for a variety of scientific studies, including monitoring changes to the polar ice sheets that may give clues to climate change. A key step in analyzing radar echograms is to identify boundaries between layers of material (such as air, ice, rock, etc.). In this paper, we propose an automated technique for identifying these boundaries, posing this as an inference...
This paper examines a new problem in large-scale stream data: abnormality detection which is localized to a data segmentation process. Unlike traditional abnormality detection methods, which typically build one unified model across the data stream, we propose that building multiple detection models focused on different coherent sections of the video stream would result in better detection performance....
We propose a novel method for automatic detection of the transport mode of a person carrying a smartphone. Existing approaches assume idealized positioning data with no GPS signal losses, require information from additional external sources such as real-time bus locations, or only allow for a coarse distinction between very few categories (e.g. ‘still’, ‘walk’, ‘motorized’). Our approach is designed...
The offline Arabic handwritten text recognition task exhibits high variation in observed variables such as size, loops, slant and continuity. A learning algorithm tries to capture the statistical dependence between these variables but often fails to learn the complete distribution because of their large degrees of freedom. However, it is possible to output a good hypothesis if either data samples for training...
This paper presents a new unsupervised statistical model for human activity discovery and recognition in pervasive environments. The activities are encoded in sequences recorded by non-intrusive sensors disseminated in the environment. Our model studies the relationship between the activities and the sequential patterns from the sequence analysis perspective. Activity discovery is formulated as an...
In this paper, we propose a novel approach to recognizing human hand/arm actions in the context of gesture recognition. The main idea is to model the flow information through a mixture of Gaussians, perform skin-based Gaussian pruning, and compute interlevel linking of non-pruned Gaussians using Kullback-Leibler (KL) divergence. Next, we compute the temporal features from the matched...
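As a rough illustration of linking Gaussians across levels by KL divergence, the sketch below uses the closed-form KL divergence between two univariate Gaussians. It assumes one-dimensional components given as hypothetical `(mean, std)` pairs and a simple nearest-match rule; the abstract does not state the dimensionality or the exact linking rule, so `kl_gauss` and `link_levels` are illustrative names, not the authors' implementation.

```python
import math


def kl_gauss(m1, s1, m2, s2):
    """KL(N(m1, s1^2) || N(m2, s2^2)) for univariate Gaussians (s1, s2 > 0)."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5


def link_levels(level_a, level_b):
    """For each (mean, std) in level_a, return the index of the component
    in level_b with the smallest KL divergence (assumed matching rule)."""
    return [
        min(range(len(level_b)), key=lambda j: kl_gauss(g[0], g[1], *level_b[j]))
        for g in level_a
    ]


# Illustrative components only; not data from the paper.
links = link_levels([(0.0, 1.0), (5.0, 1.0)], [(4.5, 1.0), (0.2, 1.0)])
```

Note that KL divergence is asymmetric, so swapping the two levels can change which components are linked.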
This paper proposes a new Probabilistic Graphical Model (PGM) to incorporate the scene, event-object interaction, and event temporal contexts into Dynamic Bayesian Networks (DBNs) for event recognition in surveillance videos. We first construct the event DBNs for modeling the events from their own appearance and kinematic observations, and then extend the DBN to incorporate the contexts for boosting...
We investigate the application of structured output learning (SOL) in automatic annotation of court games. We formulate the problem of event classification in court games as one of learning a mapping from features to structured labels, and employ structured SVM to achieve a max-margin solution. We compare closely the more popular generative approach based on the hidden Markov model (HMM) with our...
In the gait recognition field, template-based approaches such as Gait Energy Image (GEI) and Chrono-Gait Image (CGI) can achieve good recognition performance with low computational cost. Meanwhile, CGI can preserve temporal information better than GEI. However, they pay less attention to the local shape features. To preserve temporal information and generate more abundant local shape features, we generate...
This paper proposes a novel accident prediction approach based on extracting the relation between vehicles of interest and increasing risk factor according to anomaly detection in real-time traffic videos. In the learning process of the traffic model at intersections, we detect all trajectories by tracking each vehicle and then group them considering the road model. All trajectories are clustered by Continuous...
This paper proposes an unsupervised method for learning speech features based on Dynamic Bayesian Networks (DBNs) that accounts for the spatiotemporal dependences in the speech signal. Although deep networks have been successfully applied to unsupervised feature learning, their structures are often fixed before learning and they fail to capture temporal representation. In this...
The security of web services is nowadays one of the major concerns for Internet users. Web services may manage confidential information, monetary transactions, or even health-critical systems, such as those employed in public airports or hospitals. A key problem of web services is that they should work as expected even in the presence of malicious inputs. Unfortunately, with the increasing complexity...
Video highlight recognition is the procedure in which a long video sequence is summarized into a shorter video clip that depicts the most “salient” parts of the sequence. It is an important technique for content delivery systems and search systems which create multimedia content tailored to their users' needs. This paper deals specifically with capturing highlights inherent to sports videos, especially...
This paper presents a real-time smoke detection algorithm. A modified Center Symmetric Local Ternary Pattern (CS-LTP) is proposed as the smoke texture descriptor. The change in background texture provides a means to differentiate between smoke and non-smoke regions. Combined with the color information of smoke, our method is able to achieve real-time performance at a minimum of 30 fps. The comparison...
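To make the texture descriptor more concrete, here is a minimal sketch of one common formulation of a center-symmetric local ternary pattern on a 3x3 patch. It does not reproduce the paper's modification, which the abstract does not describe. The neighbor ordering, the positive `threshold` parameter, and the function name `cs_ltp_code` are assumptions for illustration.

```python
def cs_ltp_code(patch, threshold):
    """Center-symmetric ternary code of a 3x3 grey-level patch.

    patch: 3x3 list of lists of numbers; threshold: positive number.
    Each of the 4 center-symmetric neighbor pairs yields a ternary digit
    (2: clearly brighter, 0: clearly darker, 1: within the threshold),
    giving a code in the range 0..80.
    """
    # Neighbors in clockwise order starting at the top-left corner,
    # so that n[i] and n[i + 4] are opposite each other across the center.
    n = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
         patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for i in range(4):
        diff = n[i] - n[i + 4]
        if diff >= threshold:
            digit = 2
        elif diff <= -threshold:
            digit = 0
        else:
            digit = 1
        code += digit * 3 ** i
    return code


flat = [[7, 7, 7], [7, 7, 7], [7, 7, 7]]
edge = [[10, 10, 10], [0, 5, 10], [0, 0, 0]]
codes = (cs_ltp_code(flat, 3), cs_ltp_code(edge, 3))
```

A uniform patch maps to the middle code (all digits 1), while strong gradients push digits toward 0 or 2. A histogram of such codes over a region is a typical way to use them as a texture feature.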