Single camera-based multiple-person tracking is often hindered by difficulties such as occlusion and changes in appearance. In this paper, we address such problems by proposing a robust part-based tracking-by-detection framework. Human detection using part models has become quite popular, yet its extension to tracking has not been fully explored. Our approach learns part-based person-specific SVM...
We propose an unsupervised image segmentation method based on texton similarity and mode seeking. The input image is first convolved with a filter-bank, followed by soft clustering on its filter response to generate textons. The input image is then superpixelized where each belonging pixel is regarded as a voter and a soft voting histogram is constructed for each superpixel by averaging its voters'...
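The soft-voting step in the abstract above can be illustrated roughly as follows. This is only a sketch under stated assumptions, not the authors' method: the abstract is truncated before it says what is averaged, so the code assumes each pixel's vote is a softmax over negative squared distances to the texton centres, and that filter responses, texton centres, and superpixel labels are already available. The names `soft_assign`, `superpixel_histograms`, and `temperature` are hypothetical.

```python
import math


def soft_assign(response, textons, temperature=1.0):
    """Soft vote of one pixel: softmax over negative squared distances.

    Assumption: the abstract does not specify the soft-clustering rule,
    so a distance-based softmax is used here purely for illustration.
    """
    dists = [sum((r - t) ** 2 for r, t in zip(response, texton))
             for texton in textons]
    low = min(dists)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - low) / temperature) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def superpixel_histograms(responses, labels, textons, temperature=1.0):
    """Average the soft votes of all pixels that share a superpixel label."""
    sums, counts = {}, {}
    for response, label in zip(responses, labels):
        vote = soft_assign(response, textons, temperature)
        acc = sums.setdefault(label, [0.0] * len(textons))
        for i, v in enumerate(vote):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


# Toy data: four pixels with 2-D filter responses, two textons,
# and two superpixels (labels 0 and 1).
textons = [[0.0, 0.0], [10.0, 10.0]]
responses = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [9.0, 10.0]]
labels = [0, 0, 1, 1]
histograms = superpixel_histograms(responses, labels, textons)
```

Each resulting histogram sums to one, so superpixels of different sizes remain directly comparable.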
Human action recognition is an important yet challenging task. The recently developed commodity depth sensors open up new possibilities of dealing with this problem but also present some unique challenges. The depth maps captured by the depth cameras are very noisy and the 3D positions of the tracked joints may be completely wrong if serious occlusions occur, which increases the intra-class variations...
Human action recognition in videos draws strong research interest in computer vision because of its promising applications for video surveillance, video annotation, interactive gaming, etc. However, the amount of video data containing human actions is increasing exponentially, which makes the management of these resources a challenging task. Given a database with huge volumes of unlabeled videos,...
In this paper, we address a practical problem of cross-scenario clothing retrieval: given a daily human photo captured in a general environment, e.g., on the street, find similar clothing in online shops, where the photos are captured more professionally and against a clean background. There are large discrepancies between the daily photo scenario and the online shopping scenario. We first propose to alleviate the...
Despite significant recent progress, the best available visual saliency models still lag behind human performance in predicting eye fixations in free-viewing of natural scenes. The majority of models are based on low-level visual features, and the importance of top-down factors has not yet been fully explored or modeled. Here, we combine low-level features such as orientation, color, intensity, saliency...
To fully utilize the social potential of virtual environments, support for seamless and automatic integration of nonverbal communication is essential. In this paper we propose a conceptual and architectural design for a framework that provides an integrated, flexible environment for the cooperation of heterogeneous modules specialized in different aspects of the acquisition, analysis and presentation...
This paper focuses on extracting the gait features of a walking human from sequences of silhouette images for recognition purposes. The Discrete Cosine Transform (DCT) was evaluated as a feature extractor, first on its own and then in combination with Principal Component Analysis (PCA) as feature selection. The extracted feature vectors are then used as input for classification using artificial...
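As a rough illustration of the DCT feature-extraction step mentioned above, the sketch below computes an orthonormal 2-D DCT-II of a small binary silhouette and keeps only the low-frequency corner as a feature vector. It covers the DCT step only; the PCA and classification stages are not shown. The function names, the 4×4 toy silhouette, and the choice of keeping a `keep` × `keep` block are assumptions made for illustration and are not taken from the paper.

```python
import math


def dct_1d(signal):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(signal)
    out = []
    for k in range(n):
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_2d(image):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(row) for row in image]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # cols is indexed [column][frequency]; transpose back to [row][column].
    return [list(r) for r in zip(*cols)]


def low_frequency_features(image, keep=2):
    """Flatten the top-left keep x keep block of DCT coefficients."""
    coeffs = dct_2d(image)
    return [coeffs[r][c] for r in range(keep) for c in range(keep)]


# Hypothetical 4x4 binary silhouette (1 = foreground pixel).
silhouette = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
]
features = low_frequency_features(silhouette, keep=2)
```

Because the transform is orthonormal, it preserves the total energy of the image, and discarding high-frequency coefficients is what gives a compact descriptor.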
Pedestrian detection from images is an important yet challenging task. Conventional methods usually identify human figures using image features inside local regions. In this paper we show that, besides the local features, context cues in the neighborhood provide important constraints that are not yet well utilized. We propose a framework to incorporate the context constraints for detection...
While activity recognition is a current focus of research, the challenging problem of fine-grained activity recognition is largely overlooked. We thus propose a novel database of 65 cooking activities, continuously recorded in a realistic setting. Activities are distinguished by fine-grained body motions that have low inter-class variability and high intra-class variability due to diverse subjects...
Human activity recognition has the potential to impact a wide range of applications, from surveillance to human-computer interfaces to content-based video retrieval. Recently, the rapid development of inexpensive depth sensors (e.g. Microsoft Kinect) has provided adequate accuracy for real-time full-body human tracking for activity recognition applications. In this paper, we create a complex human activity...
Much of the existing work on action recognition combines simple features (e.g., joint angle trajectories, optical flow, spatio-temporal video features) with somewhat complex classifiers or dynamical models (e.g., kernel SVMs, HMMs, LDSs, deep belief networks). Although successful, these approaches represent an action with a set of parameters that usually do not have any physical meaning. As a consequence,...
Understanding natural human activity involves not only identifying the action being performed, but also locating the semantic elements of the scene and describing the person's interaction with them. We present a system that is able to recognize complex, fine-grained human actions involving the manipulation of objects in realistic action sequences. Our method takes advantage of recent advances in sensors...
A driver assistance system realizes that the driver is distracted and that a potentially hazardous situation is emerging. Where should it guide the attention of the driver? Optimally to the spot that allows the driver to make the best decision. Pedestrian detectability has been proposed recently as a measure of the probability that a driver perceives pedestrians in an image [9]. Leveraging this information...
We present an approach to automatically learn the visual appearance of an environment in terms of object classes. The procedure is totally unsupervised, incremental, and can be executed in real time. The traversability property of an unseen object is also learnt without human supervision by the interaction between the robot and the environment. An incremental version of affinity propagation, a state-of-the-art...
Obtaining human hand information is crucial for hand gesture recognition tasks. However, it is still not possible to obtain a perfect hand segmentation or to localize the hand accurately, especially under complex conditions. Therefore, it is necessary to develop robust and effective methods for detecting the human hand accurately. In this paper, we propose a new method for hand detection. We present an extended...
In this paper, a novel sparse feature representation method for object tracking is proposed. The method is based on the observation that a tracked object can be dynamically and compactly represented by a few features (sparse representation) from a large feature set (the improved histogram of oriented gradient and color, HOGC). Based on the HOGC features, the sparse representation can be learned online from...
For robots of the future to interact seamlessly with humans, they must be able to reason about their surroundings and take actions that are appropriate to the situation. Such reasoning is only possible when the robot has knowledge of how the world functions, which must either be learned or hard-coded. In this paper, we propose an approach that exploits language as an important resource of high-level...
This article presents a robust, real-time background subtraction algorithm able to operate properly in complex dynamically changing visual conditions and indoor/outdoor environments, based on a single, cheap monocular camera, like a webcam. This algorithm uses an image grid and models each pixel of the grid as a mixture of adaptive Student-t distributions. This approach makes this algorithm robust...
Detecting pedestrians accurately in natural scenes has an important impact on intelligent video surveillance. In this paper, we combine motion information, human skin color information, human shape information and variation of ambient lighting to detect pedestrians for automated video surveillance. The moving objects in the video sequence images are extracted using the multi-frame...