A central challenge of markerless human motion tracking is the high dimensionality of the search space, so efficient exploration of that space is of great significance. In this paper, a motion capturing algorithm is proposed for upper body motion tracking. The proposed system tracks human motion based on monocular silhouette-matching, and it is built on top of a hierarchical particle filter,...
We propose a novel human-robot-interaction framework for robust visual scene understanding. Without any a-priori knowledge about the objects, the task of the robot is to correctly enumerate how many of them are in the scene and segment them from the background. Our approach builds on top of state-of-the-art computer vision methods, generating object hypotheses through segmentation. This process is...
Recognizing human activities from common color image sequences faces many challenges, such as complex backgrounds, camera motion, and illumination changes. In this paper, we propose a new 4-dimensional (4D) local spatio-temporal feature that combines both intensity and depth information. The feature detector applies separate filters along the 3D spatial dimensions and the 1D temporal dimension to...
We propose a method for interactive modeling of objects and object relations based on real-time segmentation of video sequences. In interaction with a human, the robot can perform multi-object segmentation through principled modeling of physical constraints. The key contribution is an efficient multi-labeling framework that allows object modeling and disambiguation in natural scenes. Object modeling...
In this paper we propose a new method for 3D facial expression recognition. We make use of the Zernike moments, which are calculated in the depth image of a 3D facial point cloud. By combining the Zernike moments with the 3D point clouds and the depth images, we succeed in tackling problems arising in facial expression recognition due to affine transformations of the data, such as translation,...
This paper proposes a two-level 3D human pose tracking method for a specific action captured by several cameras. The generation of pose estimates relies on fitting a 3D articulated model on a Visual Hull generated from the input images. First, an initial pose estimate is constrained by a low dimensional manifold learnt by Temporal Laplacian Eigenmaps. Then, an improved global pose is calculated by...
We present a Z-SIFT based 3D surface registration algorithm that uses depth-enhanced SIFT features for initial alignment and a 2D-feature-weighted Iterative Closest Point (ICP) algorithm for accurate registration. The combination of SIFT features and depth information extracts faithful corresponding points between the 2D images and provides good coarse alignment for...
We propose a novel and simple framework that solves two popular problems in digital photography: 2D face synthesis and 3D face modeling. 2D face synthesis aims at creating a new face, usually by mixing two or more portraits. We extend this notion to the combination of human and statue faces. The goal of 3D face modeling is to reconstruct a face in three dimensions from one or several images. These...
Given the ease that humans have with using a keyboard and mouse in typical, non-colocated computer interaction, many studies have investigated the value of co-locating the visual field and haptic workspaces using immersive virtual reality (VR) modalities. Significant understanding has been gained by previous work comparing physical tasks against VR tasks, visuo-haptic co-location versus non-colocation,...
This paper describes a method for reconstructing human shape from multiple cameras in a daily living environment, which leads to robust markerless motion capture. Due to continual illumination changes in daily living spaces, it has been difficult to obtain the human shape with background subtraction methods. Recent statistical foreground segmentation techniques based on graph-cuts, which combine background subtraction information...
This paper proposes a robot that acquires multimodal information, i.e. auditory, visual, and haptic information, in a fully autonomous way using its embodiment. We also propose an online algorithm for multimodal categorization based on the acquired multimodal information and words, which are partially given by human users. The proposed framework makes it possible for the robot to learn object concepts...
In this paper we propose a framework for activity recognition based on space-time interest points in video surveillance. A single type of interest point feature is not sufficient to identify the activity; therefore, we consider multi-class activities fused in three-dimensional (spatial and temporal) coordinates to achieve our objective with maximum accuracy. Our experiment shows that fusing multi class...
Mine vehicles are a leading cause of mining fatalities. A reliable anti-collision system is needed to prevent vehicle-personnel collisions. The proposed collision detection system fuses a three-dimensional (3D) sensor and a thermal infrared camera for human detection and tracking. In addition to a thermal camera, a distance sensor will provide depth information and allow the calculation...
Colour plus depth map based stereoscopic video has attracted significant attention in the last 10 years, as it can reduce storage and bandwidth requirements for the transmission of stereoscopic content over wireless channels such as mobile networks. However, quality assessment of coded 3D video sequences can currently be performed reliably only by using expensive and inconvenient subjective tests [1]. The...
We propose a human motion tracking method for fast motion clips using multiple synchronized cameras. Our method is capable of extracting 3D articulated postures with 42 degrees of freedom through a sequence of visual hulls. We seek the globally optimal solutions of the likelihood with local memorization of the “fitness” of each body segment. Our method avoids the local minimum problem efficiently...
We present a high-accuracy shape acquisition approach that uses a combined Gray code and phase-shifting structured light projection. The Gray code pattern coarsely divides the projected plane into local regions, whereas phase-shifting precisely determines the pixel-accurate index for each point on the plane. The combination of the two codifications greatly decreases the number of patterns as compared...
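The coarse-plus-fine idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the standard N-step phase-shifting formula, and the assumption that each Gray-code region spans exactly one fringe period are all illustrative choices made here.

```python
import math


def gray_encode(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_decode(g: int) -> int:
    """Invert gray_encode by XOR-folding the shifted code words."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def wrapped_phase(intensities):
    """Wrapped phase in [0, 2*pi) from N >= 3 equally shifted fringe samples.

    Assumes sample k was captured as A + B * cos(phi + 2*pi*k/N).
    """
    n = len(intensities)
    s = sum(v * math.sin(2 * math.pi * k / n) for k, v in enumerate(intensities))
    c = sum(v * math.cos(2 * math.pi * k / n) for k, v in enumerate(intensities))
    return math.atan2(-s, c) % (2 * math.pi)


def absolute_position(gray_bits: int, intensities, period: float) -> float:
    """Combine the coarse Gray-code region with the fine phase offset.

    Assumption (hypothetical layout): one fringe period per Gray-code region,
    with the fringe phase starting at zero on each region boundary.
    """
    region = gray_decode(gray_bits)
    return (region + wrapped_phase(intensities) / (2 * math.pi)) * period
```

Under these assumptions the Gray code resolves which fringe period a pixel belongs to, and the phase resolves the position inside that period, so neither pattern family has to carry the full resolution on its own.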
In this paper, we propose a novel gait recognition framework, the Spherical Space Model with Human Point Clouds (SSM-HPC). A new gait representation is also introduced, called Marching in Place (MIP) gait, which preserves the spatiotemporal characteristics of an individual's gait. Various gait recognition studies have used human silhouette images from video. This research...
Color and depth play important roles in natural scenes and in vision, and their perception is related. Extensive work has been conducted on studying the luminance statistics of natural scenes; however, there is very little work done on analyzing the statistics between luminance and range in natural scenes, not to mention color and range. In this paper, we present the LIVE Color+3D Database, which...
This paper addresses the problem of evaluating virtual view synthesized images in the multi-view video context. View synthesis introduces new types of distortion. The question is whether the traditionally used objective metrics can assess the quality of synthesized views, given these new types of artifacts. The experiments conducted to determine their reliability consist in...
In this paper, we propose a 3D template-based human action detection method based on volume pattern matching. A volume pattern is obtained by detecting the principal plane from a space-time patch using the 3D moment-preserving technique. Instead of segmentation and detailed shape representation, the objective of this research is to develop and apply computer vision methods that explore the structure of a...