In this work, we propose a novel human activity recognition method from depth videos using robust spatiotemporal features with a convolutional neural network. From the depth images of activities, human body parts are segmented using random features in a random forest. From the segmented body parts in a depth image of an activity video, spatial features are extracted such as angles of the 3-D body...
What is the right way to reason about human activities? What directions forward are most promising? In this work, we analyze the current state of human activity understanding in videos. The goal of this paper is to examine datasets, evaluation metrics, algorithms, and potential future directions. We look at the qualitative attributes that define activities such as pose variability, brevity, and density...
The temporal component of videos provides an important clue for activity recognition, as a number of activities can be reliably recognized based on the motion information. In view of that, this work proposes a novel temporal stream for two-stream convolutional networks based on images computed from the optical flow magnitude and orientation, named Magnitude-Orientation Stream (MOS), to learn the motion...
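As a rough illustration of the quantities named in the abstract above, the sketch below converts a dense optical flow field into per-pixel magnitude and orientation maps. It is not the authors' MOS pipeline: the function name `magnitude_orientation`, the nested-list input layout, and the use of radians are assumptions made for this example, and the flow itself is assumed to have been computed elsewhere.

```python
import math


def magnitude_orientation(flow_u, flow_v):
    """Turn flow components into magnitude and orientation maps.

    flow_u, flow_v: 2-D lists of equal shape holding the horizontal and
    vertical displacement of each pixel (hypothetical input layout).
    Returns (magnitude, orientation), orientation in radians in [-pi, pi].
    """
    magnitude = [
        [math.hypot(u, v) for u, v in zip(row_u, row_v)]
        for row_u, row_v in zip(flow_u, flow_v)
    ]
    orientation = [
        [math.atan2(v, u) for u, v in zip(row_u, row_v)]
        for row_u, row_v in zip(flow_u, flow_v)
    ]
    return magnitude, orientation


# A 1x2 toy flow field: one pixel moving by (3, 4), one moving straight along +v.
mag, ori = magnitude_orientation([[3.0, 0.0]], [[4.0, 1.0]])
```

In practice such maps would typically be rescaled to an image range before being fed to a network; that step is omitted here because the abstract does not specify it.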
This paper addresses the problem of jointly recognizing object fluents and tasks in egocentric videos. Fluents are the changeable attributes of objects. Tasks are goal-oriented human activities which interact with objects and aim to change some attributes of the objects. The process of executing a task is a process to change the object fluents over time. We propose a hierarchical model to represent...
Most of the existing works on human activity analysis focus on recognition or early recognition of the activity labels from complete or partial observations. Predicting the labels of future unobserved activities where no frames of the predicted activities have been observed is a challenging problem, with important applications, which has not been explored much. Associated with the future label prediction...
We formulate a concept of a future smart environment for high quality of life (SEQUAL) that would empower humans to compensate for physical and cognitive disabilities associated with sickness and aging. In SEQUAL the assessment of the state of ‘well-being’ — from behaviors and biological signals — is holistic, meaning that the estimation of individual's health, emotional condition, activity and wishes,...
Activity recognition from first-person (ego-centric) videos has recently gained attention due to the increasing ubiquity of wearable cameras. There has been a surge of efforts adapting existing feature descriptors and designing new descriptors for first-person videos. An effective activity recognition system requires selection and use of complementary features and appropriate kernels for each...
We present a system for temporal detection of social interactions. Many works to date have succeeded in recognising activities from clipped videos in datasets, but for robotic applications, it is important to be able to move to more realistic data. For this reason, the proposed approach temporally detects intervals where individual or social activity is occurring. Recognition of human activities...
This paper presents a novel activity class representation using a single sequence for training. The contribution of this representation lies in the ability to train a one-shot learning recognition system, useful in new scenarios where capturing and labeling sequences is expensive or impractical. The method uses a universal background model of local descriptors obtained from source databases available...
We wish to perform joint scene understanding, object tracking, activity recognition, and other tasks in scenarios in which multiple people are wearing body-worn cameras while a third-person static camera also captures the scene. To do this, we need to establish person-level correspondences across first- and third-person videos, which is challenging because the camera...
This work is about recognizing human activities occurring in videos at distinct semantic levels, including individual actions, interactions, and group activities. The recognition is realized using a two-level hierarchy of Long Short-Term Memory (LSTM) networks, forming a feed-forward deep architecture, which can be trained end-to-end. In comparison with existing architectures of LSTMs, we make two...
We present an unsupervised representation learning approach that compactly encodes the motion dependencies in videos. Given a pair of images from a video clip, our framework learns to predict the long-term 3D motions. To reduce the complexity of the learning framework, we propose to describe the motion as a sequence of atomic 3D flows computed with RGB-D modality. We use a Recurrent Neural Network...
In computer vision, video-based approaches have been widely explored for the early classification and the prediction of actions or activities. However, it remains unclear whether this modality (as compared to 3D kinematics) can still be reliable for the prediction of human intentions, defined as the overarching goal embedded in an action sequence. Since the same action can be performed with different...
The work presented in this paper deals with the challenging task of learning an activity class representation using a single sequence for training. Recently, the Simplex-HMM framework has been shown to be an efficient representation for activity classes; however, it incurs high computational costs, making it impractical in several situations. A dimensionality reduction of the feature spaces based on...
We present a hierarchical recurrent network for understanding team sports activity in image and location sequences. In the hierarchical model, we integrate the proposed multiple person-centered features over a temporal sequence based on the LSTM's outputs. To achieve this scheme, we introduce the Keeping state in LSTM as one of the externally controllable states, and extend the Hierarchical LSTMs to include mechanism...
Sparse representation is widely used by different human activity recognition methods. Although many sparse feature extraction algorithms have been proposed in the literature, most of them focus on low-level features. This paper proposes a new method using trajectories, as mid-level features, for human activity recognition. Even though the use of trajectories is not new in this field, their potential...
Activity recognition applications are growing in importance due to two key factors: first, there is an increased need for human assistance and surveillance; and second, the increased availability of datasets and improved image recognition algorithms have allowed effective recognition of more sophisticated activities. In this paper we develop an activity recognition approach to support visually impaired...
Human activity detection is an important area of research in computer vision. This paper focuses on recognizing the activities of construction personnel at construction sites. The method uses a bag-of-features (BOF) approach to detect an activity. Here we consider five types of activities performed at construction sites, namely ladder climbing, brick laying, carpentry work, painting and plastering work...
A novel technique is proposed for categorizing sports events in videos by tracking the positional and angular displacements of the centroid of the moving object in between successive frames. The various sporting events contained in videos are distinguished either by the speed of motion, for instance walking, jogging and running, or by the trajectory made by the human body while in motion, for instance...
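The centroid-tracking idea in the abstract above can be sketched as follows, assuming the centroid of the moving object has already been extracted for each frame. The function name `centroid_displacements` is hypothetical, and reading "angular displacement" as the direction of each frame-to-frame step is an assumption; the truncated abstract does not define it.

```python
import math


def centroid_displacements(centroids):
    """Per-step motion of a tracked centroid.

    centroids: list of (x, y) positions, one per frame (assumed given).
    Returns one (distance, direction) pair per pair of successive frames,
    with direction in radians measured from the +x axis.
    """
    steps = []
    for (x0, y0), (x1, y1) in zip(centroids, centroids[1:]):
        dx, dy = x1 - x0, y1 - y0
        steps.append((math.hypot(dx, dy), math.atan2(dy, dx)))
    return steps


# Three frames: a step of length 5 followed by a step of length 1 along +y.
steps = centroid_displacements([(0, 0), (3, 4), (3, 5)])
```

The per-step distances give a simple proxy for speed (useful for separating walking, jogging and running), while the sequence of directions describes the trajectory shape.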
In this paper we introduce a general probabilistic graphical model for human everyday activity recognition. The proposed model is a discriminative graphical model with hidden variables for modeling body poses and their sequential order. We use a unified framework for the prediction task that is faster and more efficient than structured support vector machines and hidden conditional random fields. We have...