Human gesture recognition is one of the central research fields of computer vision, and effective gesture recognition remains challenging. In this paper, we present a pyramidal 3D convolutional network framework for large-scale isolated human gesture recognition. 3D convolutional networks are utilized to learn spatiotemporal features from gesture video files. Pyramid input is proposed...
Gesture recognition has attracted attention in computer vision owing to its many applications. However, video-based large-scale gesture recognition still faces many challenges, since factors such as the background can degrade accuracy. To achieve gesture recognition with large-scale videos, we propose a method based on RGB-D data. To learn gesture details better, the inputs are expanded into 32-frame...
In this paper, we propose using 3D Convolutional Neural Networks for large-scale, user-independent continuous gesture recognition. We have trained an end-to-end deep network for continuous gesture recognition, jointly learning both the feature representation and the classifier. The network performs three-dimensional (i.e. space-time) convolutions to extract features related to both the appearance...
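As an illustration of the space-time convolution mentioned in the abstract above, here is a minimal sketch in plain Python. It is not the authors' network: it applies a single hypothetical 2x2x2 kernel to a synthetic single-channel clip, using the "valid" cross-correlation that deep-learning libraries usually call convolution. The function name `conv3d_valid`, the clip, and the kernel values are all assumptions made for the example.

```python
def conv3d_valid(volume, kernel):
    """Slide a 3D kernel over a (time, height, width) volume of nested lists."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    t, h, w = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for i in range(T - t + 1):          # position along time
        plane = []
        for j in range(H - h + 1):      # position along height
            row = []
            for k in range(W - w + 1):  # position along width
                acc = 0.0
                for a in range(t):
                    for b in range(h):
                        for c in range(w):
                            acc += volume[i + a][j + b][k + c] * kernel[a][b][c]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Synthetic clip (assumption): 5 frames of 4x4 pixels, brightness equals the frame index.
clip = [[[float(f)] * 4 for _ in range(4)] for f in range(5)]

# Hypothetical kernel: mean of a 2x2 patch in the next frame minus the same
# patch in the current frame, so it responds to change over time.
kernel = [
    [[-0.25, -0.25], [-0.25, -0.25]],
    [[0.25, 0.25], [0.25, 0.25]],
]

response = conv3d_valid(clip, kernel)
```

Because the kernel spans two frames, each output value depends on both spatial neighbourhood and temporal change, which is the property that lets 3D convolutions capture motion as well as appearance.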
Active one-shot scanning techniques have been widely used for various applications. Stereo-based active one-shot scanning embeds positional information regarding the image plane of a projector onto a projected pattern to retrieve correspondences entirely from a captured image. Many combinations of patterns and decoding algorithms for active one-shot scanning have been proposed. If the capturing...
We present a method to reconstruct the three-dimensional shape of a moving instance of a known object category in video data. We exploit state-of-the-art semantic segmentation techniques to extract the object's two-dimensional shape in each frame. Therefore, our method is robust to occlusion, handles stationary objects and extends naturally to multiple video sequences. We apply Structure from Motion...
3D-point set registration is an active area of research in computer vision. In recent years, probabilistic registration approaches have demonstrated superior performance for many challenging applications. Generally, these probabilistic approaches rely on the spatial distribution of the 3D-points, and only recently has color information been integrated into such a framework, significantly improving...
Sleep position is an important feature used to assess the quality and quantity of an individual's sleep. Furthermore, it is related to sleep disorders like sleep apnoea and snoring, and needs to be tracked in nursing homes to avoid pressure ulcers. Therefore, a gravity sensor attached to the chest is generally used to register body position during sleep studies. We suggest a non-intrusive and cost-efficient...
In this paper we aim to increase the descriptive power of the covariance matrix, which is limited to capturing only linear mutual dependencies between variables. We present a rigorous and principled mathematical pipeline to recover the kernel trick for computing the covariance matrix, enhancing it to model more complex, non-linear relationships conveyed by the raw data. To this end, we propose Kernelized-COV,...
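The limitation named in the abstract above, that covariance only registers linear dependence, can be seen in a small worked example. This is a sketch with made-up data, not the Kernelized-COV method: the helper `covariance` and the sample values are assumptions chosen so the arithmetic is easy to follow.

```python
def covariance(xs, ys):
    """Unbiased sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Symmetric sample points (assumption made for the illustration).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]

linear = [3.0 * x for x in xs]     # y depends linearly on x
quadratic = [x * x for x in xs]    # y is fully determined by x, but non-linearly

cov_linear = covariance(xs, linear)        # large: linear dependence is detected
cov_quadratic = covariance(xs, quadratic)  # zero: the dependence goes unnoticed
```

Here `quadratic` is a deterministic function of `xs`, yet its covariance with `xs` vanishes, which is the kind of relationship a kernelized covariance is meant to expose.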
3D models of outdoor environments have been used for several applications such as a virtual earth system and a vision-based vehicle safety system. 3D data for constructing such 3D models are often measured by an on-vehicle system equipped with laser rangefinders, cameras, and GPS/IMU. However, 3D data of moving objects on streets lead to inaccurate 3D models when modeling outdoor environments. To...
Efficient detection of three-dimensional (3D) objects in point clouds is a challenging problem. Both 3D descriptor matching and 3D scanning-window search with a detector are time-consuming due to the three-dimensional complexity. One solution is to project the 3D point cloud into 2D images and thus transform the 3D detection problem into 2D space, but projection at multiple viewpoints and rotations...
Detection of vehicles in remote sensing data is a captivating and challenging task that has been studied for many years. State-of-the-art detection tools can be subdivided into implicit and explicit methods; the latter provide detection results by means of explicitly characterizing features. Mostly, these methods rely on optical aerial images in which vehicles appear distorted...
In this paper we propose an automatic urban building extraction method for oblique aerial images. The method comprises five steps: point cloud generation, grid partition, feature extraction, building detection and building reconstruction. Taking advantage of recent progress in large-scale Structure from Motion (SfM) and Multiple View Stereo (MVS), a dense point cloud is generated first. Then,...
We present a novel similarity-covariant feature detector that extracts points whose neighborhoods, when treated as a 3D intensity surface, have a saddle-like intensity profile. The saddle condition is verified efficiently by intensity comparisons on two concentric rings that must have exactly two dark-to-bright and two bright-to-dark transitions satisfying certain geometric constraints. Experiments show that...
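The transition count behind the saddle condition in the abstract above can be sketched as follows. This is a simplified illustration, not the published detector: it checks a single ring rather than two concentric ones, ignores the geometric constraints, and labels each sample bright or dark by comparing it with a centre intensity. The function names, the threshold rule, and the sample rings are all assumptions.

```python
def count_ring_transitions(ring, center):
    """Count dark-to-bright and bright-to-dark transitions around a closed ring.

    A sample is labelled bright if it exceeds `center`, dark otherwise
    (an assumed labelling rule for this sketch).
    """
    labels = [value > center for value in ring]
    dark_to_bright = 0
    bright_to_dark = 0
    n = len(labels)
    for i in range(n):
        current, following = labels[i], labels[(i + 1) % n]  # wrap around the ring
        if not current and following:
            dark_to_bright += 1
        elif current and not following:
            bright_to_dark += 1
    return dark_to_bright, bright_to_dark


def looks_like_saddle(ring, center):
    """True when the ring shows exactly two transitions of each kind."""
    return count_ring_transitions(ring, center) == (2, 2)


# Hypothetical ring samples listed in circular order.
saddle_ring = [200, 200, 50, 50, 200, 200, 50, 50]  # alternating bright and dark arcs
edge_ring = [200, 200, 200, 200, 50, 50, 50, 50]    # one bright half, one dark half
blob_ring = [200] * 8                               # uniformly bright

is_saddle = looks_like_saddle(saddle_ring, center=120)
is_edge_saddle = looks_like_saddle(edge_ring, center=120)
is_blob_saddle = looks_like_saddle(blob_ring, center=120)
```

An edge produces only one transition of each kind and a blob produces none, so under these assumptions only the alternating pattern passes the test.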
We propose a novel method for the recognition of objects that match a given 3D model in large-scale scene point clouds captured in indoor environments with a laser range finder. Since large-scale indoor point clouds are heavily corrupted by noise such as clutter, occlusions, holes, and measurement errors, it is difficult to exactly identify local correspondences between points in a target model point...
Inferring scene depth from a single monocular image is an essential component in several computer vision applications such as 3D modeling and robotics. This process is an ill-posed problem. To tackle this challenging problem, previous efforts have focused on exploiting only global or only local depth-aware properties. We propose a model that incorporates both of them to obtain significantly more...
In this paper we propose a multi-modal object recognition system that uses a two-step hypothesis verification approach to improve runtime efficiency. The system uses local and global appearance and shape features, generating many possibly competing hypotheses, which are then verified such that the scene can be optimally explained in terms of recognized object models. The introduced modification in...
Human action recognition from videos has wide applicability and has attracted significant interest. In this work, to better identify spatio-temporal characteristics, we propose a novel 3D extension of Gradient Location and Orientation Histograms, which provides discriminative local features representing not only the gradient orientations but also their relative locations. We further propose a human action...
In this paper, we propose a new local descriptor for action recognition in depth images. The proposed descriptor relies on surface normals in 4D space of depth, time, spatial coordinates and higher-order partial derivatives of depth values along spatial coordinates. In order to classify actions, we follow the traditional Bag-of-words (BoW) approach, and propose two encoding methods termed Multi-Scale...
The analysis of human gait is increasingly investigated owing to its wide range of potential applications in various domains, such as rehabilitation, deficiency diagnosis, surveillance and movement optimization. In addition, the release of depth sensors offers new opportunities to achieve gait analysis in a non-intrusive context. In this paper, we propose a gait analysis method from depth sequences by...
Hand gestures are ubiquitous and play an important role in natural human-machine interaction (HMI). Recently, consumer color and depth cameras have been used to estimate hand shapes and postures for mid-air HMI. Observing that 3D hand contours carry much information about hand postures, we estimate 3D hand contours from infrared images with limited computational complexity for the...