Gesture recognition has attracted attention in computer vision owing to its many applications. However, video-based large-scale gesture recognition still faces many challenges, since many factors, such as the background, can degrade accuracy. To achieve gesture recognition with large-scale videos, we propose a method based on RGB-D data. To learn gesture details better, the inputs are expanded into 32-frame...
Although emotional state recognition from voice has been extensively studied, little effort has focused on online emotion recognition. Since the duration and intensity of emotional experiences change over time, it is hard to employ existing static transition models when monitoring emotional states, especially in an online setting. To overcome this difficulty, we introduce a method which incorporates...
3D models of outdoor environments have been used for several applications such as a virtual earth system and a vision-based vehicle safety system. 3D data for constructing such 3D models are often measured by an on-vehicle system equipped with laser rangefinders, cameras, and GPS/IMU. However, 3D data of moving objects on streets lead to inaccurate 3D models when modeling outdoor environments. To...
In this paper, we propose an automatic urban building extraction method for oblique aerial images. The method comprises five steps: point cloud generation, grid partition, feature extraction, building detection, and building reconstruction. Taking advantage of recent progress in large-scale Structure from Motion (SfM) and Multiple View Stereo (MVS), a dense point cloud is generated first. Then,...
We propose a novel method for the recognition of objects that match a given 3D model in large-scale scene point clouds captured in indoor environments with a laser range finder. Since large-scale indoor point clouds are heavily corrupted by noise such as clutter, occlusion, holes, and measurement errors, it is difficult to exactly identify local correspondences between points in a target model point...
Inferring scene depth from a single monocular image is an essential component in several computer vision applications such as 3D modeling and robotics. This process is an ill-posed problem. To tackle this challenging problem, previous efforts have focused on exploiting only global or local depth-aware properties. We propose a model that incorporates both of them to obtain significantly more...
In this paper, we propose a new and effective frontalization algorithm for frontal rendering of unconstrained face images, and evaluate it on face recognition. Initially, a 3DMM is fit to the image, and an interpolating function maps each pixel inside the face region of the image to the 3D model. Thus, we can render a frontal view without introducing artifacts in the final image thanks to the...
In this paper, we address the problem of online RGB-D tracking where the target object undergoes significant appearance changes. To sufficiently exploit the color and depth cues, we propose a novel RGB-D tracking framework (DLS) that simultaneously builds the target 2D appearance model and 3D distribution model. The framework decomposes the tracking task into detection, learning and segmentation....
This paper presents a photometric stereo method with nonisotropic point light sources. Under the non-uniform lighting conditions produced by nonisotropic point sources, each incident light ray must be precisely determined in order to compute surface normals accurately. In the proposed method, a radiance model of the light source is first introduced to the classical photometric...
We present a Bayesian framework for estimating 3D human pose and camera from a single RGB image. We develop a generative model where a 3D pose is rendered onto an image (via the camera), which then generates a detection probability map for each body part. We represent a human pose with a set of 3D cylinders in space, one for each body part, and we place kinematic and self-intersection priors on the...
In this paper, we propose a robust method for face reconstruction using a single color image. A 3D morphable model is used to reconstruct a smooth 3D face shape. To find the correspondence between model vertices and image pixels, landmarks are updated using SIFT flow, which is illumination- and rotation-invariant. To reconstruct more detailed information, depth values are refined using a shape from...
This paper addresses the problem of modeling long-range motion patterns of a 3D human skeleton performing an activity. This problem is important, as such a model can be used in many applications, including person tracking via 3D pose estimation, and probabilistic sampling of realistic 3D skeleton sequences conducting different activities with different motion styles. To this end, we formulate a new...
We introduce a novel, robust hybrid 3D face tracking framework for RGBD video streams, which is capable of tracking head pose and facial actions without pre-calibration or intervention from a user. In particular, we emphasize improving the tracking performance in instances where the tracked subject is at a large distance from the cameras and the quality of the point cloud deteriorates severely. This...
In this paper, we propose a saliency detection model for RGB-D images based on the contrasting features of color and depth within a Bayesian framework. The depth feature map is extracted based on superpixel contrast computation with spatial priors. We model the depth saliency map by approximating the density of depth-based contrast features using a Gaussian distribution. Similar to the depth saliency...
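The Gaussian density approximation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the depth-based contrast features are scalar values (for example, one per superpixel). The function names `fit_gaussian` and `gaussian_pdf` and the sample values are hypothetical, and how the paper turns this density into a saliency map is not stated in the excerpt, so this shows only a generic maximum-likelihood Gaussian fit.

```python
import math


def fit_gaussian(values):
    """Maximum-likelihood mean and variance of a list of scalar features."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


def gaussian_pdf(x, mean, var, eps=1e-8):
    """Density of a 1-D Gaussian at x; eps guards against zero variance."""
    var = max(var, eps)
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


# Hypothetical depth-contrast values, one per superpixel (illustrative only).
contrast = [0.10, 0.15, 0.12, 0.60, 0.11]
mean, var = fit_gaussian(contrast)
densities = [gaussian_pdf(c, mean, var) for c in contrast]
```

In this toy input the value 0.60 lies far from the fitted mean and so receives the lowest density of the five.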
Visual texture modeling based on multidimensional mathematical models is a prerequisite for robust material recognition as well as for image restoration, compression, and numerous physically correct virtual reality applications. A novel multispectral visual texture modeling method based on a descriptive, unusually complex, three-dimensional, spatial Gaussian mixture model is presented. Texture...
This paper proposes a new method for mapping volume models of human organs onto a target volume with simple shapes. The proposed method is based on our modified Self-organizing Deformable Model (mSDM) which finds the one-to-one mapping with no foldovers between an arbitrary object surface model and a target surface. By extending mSDM to apply to organ volume models, the proposed method, called volumetric...
In this paper, we propose a new camera model for reconstructing 3D objects under light ray distortion caused by refractive media. The proposed method can reconstruct a 3D scene even if the light rays projected into the cameras are refracted by refractive media such as glass and raindrops. For this objective, we represent light ray projection of multiple cameras by using a pair of planes shared...
We aim to reconstruct an accurate neutral 3D face model from an RGB-D video in the presence of extreme expression changes. Since each depth frame, taken by a low-cost sensor, is noisy, point clouds from multiple frames can be registered and aggregated to build an accurate 3D model. However, direct aggregation of multiple data produces erroneous results in natural interaction (e.g., talking and showing...
A workflow is proposed for Cultural Heritage applications in which the fusion of 3D and 2D visual data is required. Using data acquired by cheap, standard devices, such as a 3D scanner with a built-in low-quality 2D camera and a high-resolution DSLR camera, one can produce a high-quality, color-calibrated 3D model for documentation purposes. The proposed processing workflow combines a novel region-based calibration...
An automatic lookup tool that matches and retrieves similar floorplans from a large repository of digitized architectural floorplans can be of immense help to architects designing new projects. In this paper, we propose a framework for the matching and retrieval of similar architectural floorplans under the query-by-example paradigm. We propose a room layout segmentation...