This paper proposes an annealed particle swarm optimization based particle filter algorithm for articulated 3D human body tracking. In our algorithm, a sampling covariance and an annealing factor are incorporated into the velocity updating equation of particle swarm optimization (PSO). The sampling covariance and the annealing factor are initiated with appropriate values at the beginning of the PSO...
Natural human-robot interaction requires leveraging viewing direction information in order to recognize, respond to, and even emulate human behavior. Knowledge of the eye gaze and point of regard gives us insight into what the subject is interested in and/or who the subject is addressing. In this paper, we present a novel eye gaze estimation approach for point-of-regard (PoG) tracking. To allow for...
The computational capability of mobile phones has been rapidly increasing, to the point where augmented reality has become feasible on cell phones. We present an approach to indoor localization and pose estimation in order to support augmented reality applications on a mobile phone platform. Using the embedded camera, the application localizes the device in a familiar environment and determines its...
This paper presents a 3D computer vision method that assists the tedious procedure of manually reconstructing ceramic vessels from fragments unearthed in an archaeological excavation. This computational method relies on vessel surface markings combined with expert feedback (via the archaeologist) to form a generic model of a vessel that the excavated fragments might have originated from. Prior expert...
We propose a real-time method for counting pedestrians and bicyclists by classifying bulks of asynchronous events generated upon scene activities by an event-based 3D dynamic vision system. The inherent detection of moving objects offered by the 3D dynamic vision system comprising a pair of dynamic vision sensors allows event-based stereo vision in real-time and a 3D representation of moving objects...
Human-robot interaction necessitates more than robust people detection and tracking. It relies on the integration of disparate scene information from tracking and recognition systems, combined and infused with current and prior knowledge, to facilitate robotic understanding and interaction with humans and the environment. In this work we will discuss our efforts in the development and integration of...
Recent studies have shown that 3D imaging provides some unique advantages over traditional 2D imaging for minimally invasive surgery. However, most existing endoscopes still use single-lens cameras, and the use of dual-lens 3D imaging techniques is still limited. This paper proposes an approach to enabling 3D imaging from a single-lens endoscope by automatically synthesizing stereoscopic views from...
We estimate and track articulated human poses in sequences from a single-view, real-time range sensor. We use a data-driven MCMC approach to find an optimal pose based on a likelihood that compares synthesized depth images to the observed depth image. To speed up convergence of this search, we make use of bottom-up detectors that generate candidate head, hand and forearm locations. Our Markov chain...
Current experiments with HCIs have shown a high demand for more natural interaction paradigms. Gestures are thereby considered the most important cue besides speech. In order to recognize gestures, it is necessary to extract meaningful motion features from the body. Until now, mostly marker-based tracking systems have been used in virtual reality environments, since these were traditionally more reliable...
Generating statistically significant datasets for face matching system evaluation is a laborious and expensive process. Capturing variables such as atmospheric turbulence and other weather conditions, especially with respect to face recognition at a distance, exacerbates the problem further. It is even more difficult to work on system issues for long-range systems that impact the collection phase such...
We present a novel viewpoint which approaches the structural correspondence across an image stack in the 3D space as solving a contour grouping problem. Finding 3D cellular tubes becomes finding closed contours. We derive grouping cues between cells in adjacent slices based on their ability to relate in the 3D space. Those that form a long 3D tube in the space become the most salient contour, while...
Two-dimensional shape models have been successfully applied to solve many problems in computer vision such as object tracking, recognition and segmentation. Typically, 2D shape models (e.g. Point Distribution Models, Active Shape Models) are learned from a discrete set of image landmarks once the rigid transformations are removed by applying Procrustes Analysis (PA). However, the standard PA process...
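As a generic illustration of the Procrustes step mentioned in the abstract above, the following is a minimal sketch of classical 2D Procrustes alignment of one landmark set to another. It is not the method proposed in the paper (the abstract is truncated before that method is described). The function name `procrustes_align_2d` and the sample landmarks are hypothetical, and representing points as complex numbers is a convenience chosen for this sketch.

```python
import cmath


def procrustes_align_2d(shape, reference):
    """Align `shape` to `reference` by removing translation, scale and rotation.

    Both arguments are equal-length lists of (x, y) landmarks in corresponding
    order. Returns (aligned_shape, normalised_reference) as lists of (x, y).
    Hypothetical helper for illustration only.
    """
    def normalise(points):
        z = [complex(x, y) for x, y in points]
        centroid = sum(z) / len(z)
        z = [p - centroid for p in z]                 # remove translation
        size = sum(abs(p) ** 2 for p in z) ** 0.5
        return [p / size for p in z]                  # remove scale

    a = normalise(shape)
    b = normalise(reference)

    # The unit complex number r minimising sum |r*a_k - b_k|^2 is
    # r = S / |S| with S = sum conj(a_k) * b_k.
    s = sum(p.conjugate() * q for p, q in zip(a, b))
    r = s / abs(s)

    aligned = [r * p for p in a]                      # remove rotation
    return ([(p.real, p.imag) for p in aligned],
            [(q.real, q.imag) for q in b])


# Usage: a square, and the same square rotated, scaled and shifted.
reference = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
transform = 2.0 * cmath.exp(1j * cmath.pi / 6)        # scale 2, rotate 30 degrees
shape = []
for x, y in reference:
    w = transform * complex(x, y) + complex(3.0, -1.0)
    shape.append((w.real, w.imag))

aligned, ref_norm = procrustes_align_2d(shape, reference)
max_error = max(abs(ax - bx) + abs(ay - by)
                for (ax, ay), (bx, by) in zip(aligned, ref_norm))
```

After alignment, `max_error` should be at the level of floating-point round-off, since the two landmark sets differ only by a similarity transformation.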
In human facial behavioral analysis, Action Unit (AU) coding is a powerful instrument to cope with the diversity of facial expressions. Almost all of the work in the literature for facial action recognition is based on 2D camera images. Given the performance limitations in AU detection with 2D data, 3D facial surface information appears as a viable alternative. 3D systems capture true facial surface...
This paper presents a method to assist in the tedious procedure of reconstructing ceramic vessels from unearthed archaeological shards or fragments using 3D computer vision-enabling technologies. The method uses vessel surface markings combined with a generic model to produce a representation of what the original vessel may have looked like. Generic vessel models used are based on a host of factors...
Multiple camera views of a scene are utilized to detect and reconstruct object surfaces in three dimensions. Special attention is paid to the reconstruction of occluded objects which are only partially visible. Input images can be obtained from either an array of cameras or a single moving camera. The formulation is based on a capture and display technique developed in the optics community. Various...
The assembly of fragments into vessels is a significant task in the analysis of archaeological finds. The current method of reconstruction, which relies on experts, is time-consuming and laborious, and yields only a fraction of the possible reconstructions. Automated tools have been able to assemble at most two or three dozen fragments, while in practice, archaeologists deal with hundreds and thousands...
This paper proposes a new method for estimating and maintaining over time the pose of a single Pan-Tilt-Zoom camera (PTZ). This is achieved firstly by building offline a keypoints database of the scene; then, in the online step, a coarse localization is obtained from camera odometry and finally refined by visual landmarks matching. A maintenance step is also performed at runtime to keep updated the...
Numerical methods used for solving differential equations should be chosen with great care. Neglecting numerical aspects such as stability, consistency and well-posedness results in erroneous solutions, which in turn lead to incorrect judgments. One of the most important aspects that should be considered is the stability of the numerical method. In this paper, we discuss stability problems...
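To make the stability concern in the abstract above concrete, here is a minimal sketch using the standard test equation y' = λy. It is a textbook illustration, not the analysis from the paper (whose details are truncated here). The function names and the values λ = -50 and the step sizes are assumptions chosen for the example. Forward (explicit) Euler multiplies the solution by (1 + hλ) each step, so it decays only when |1 + hλ| < 1; backward (implicit) Euler multiplies by 1 / (1 - hλ), which decays for any h > 0 when λ < 0.

```python
def explicit_euler(lam, h, steps, y0=1.0):
    """Integrate y' = lam * y with forward Euler; return the final value."""
    y = y0
    for _ in range(steps):
        y = y + h * lam * y          # y_{n+1} = (1 + h*lam) * y_n
    return y


def implicit_euler(lam, h, steps, y0=1.0):
    """Integrate y' = lam * y with backward Euler; return the final value."""
    y = y0
    for _ in range(steps):
        y = y / (1.0 - h * lam)      # y_{n+1} = y_n / (1 - h*lam)
    return y


lam = -50.0                          # assumed decay rate; exact solution tends to 0

# |1 + h*lam| = 0.5 < 1: the explicit scheme decays like the true solution.
stable = explicit_euler(lam, h=0.01, steps=100)

# |1 + h*lam| = 1.5 > 1: the explicit scheme blows up although the ODE decays.
unstable = explicit_euler(lam, h=0.05, steps=100)

# The implicit scheme decays even at the larger step size.
backward = implicit_euler(lam, h=0.05, steps=100)
```

With these settings, `stable` and `backward` shrink toward zero while `unstable` grows in magnitude without bound as the number of steps increases, even though the exact solution decays in all three cases.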
This paper presents a view-invariant approach to gait recognition in multi-camera scenarios exploiting a joint spatio-temporal data representation and analysis. First, multi-view information is employed to generate a 3D voxel reconstruction of the scene under study. The analyzed subject is tracked and its centroid and orientation allow recentering and aligning the volume associated to it, thus obtaining...
We recognize actions and activities in video sequences as distinguishing patterns in the 3D spatiotemporal volume of motion energy. Local motion descriptors, which capture highly discriminative invariant motion characteristics in a spherical neighborhood, are computed in the 3D volume at points of salient motion to represent actions or activities in video sequences. Two actions are then matched based...