We consider fully automated behavior understanding through visual cues in industrial environments. In contrast to most existing work, which relies on domain knowledge to construct complex handcrafted features from inputs, we exploit a Convolutional Neural Network (CNN), a type of deep model that can act directly on the raw inputs, to automate the process of feature construction. Although...
Mapping an ever changing urban environment is a challenging task as we are generally interested in mapping the static scene and not the dynamic objects, such as cars and people. We propose a novel approach to the problem of dynamic object removal within stereo based scene mapping that is both independent of the underlying stereo approach in use and applicable to varying object and camera motion. By...
We propose optimal rate-allocation, using viewer attention information among viewpoints, for depth map cameras within a free-viewpoint television broadcast system. An attention-weighted rate-allocation framework enables bit-rate, or quality, to be distributed across the multiple cameras in accordance with viewer interest, minimizing total observed distortions perceived among all viewers. Prior work...
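As a rough illustration of attention-weighted rate allocation, the sketch below splits a bit budget across cameras so that a weighted sum of distortions is minimized. It is not the method of the paper above: the distortion model D_i(R_i) = variance_i * 2^(-2 R_i), the function name `allocate_rates`, and the example weights are all assumptions made for illustration.

```python
import math

def allocate_rates(weights, variances, total_rate):
    """Split total_rate across cameras to minimize sum(w_i * D_i(R_i)).

    Assumes the textbook model D_i(R_i) = variance_i * 2**(-2 * R_i).
    The Lagrangian solution equalizes w_i * D_i across cameras, giving
    R_i = total_rate / N + 0.5 * (log2(w_i * var_i) - mean_j log2(w_j * var_j)).
    Non-negativity of the rates is not enforced in this sketch.
    """
    logs = [math.log2(w * s) for w, s in zip(weights, variances)]
    mean_log = sum(logs) / len(logs)
    share = total_rate / len(logs)
    return [share + 0.5 * (value - mean_log) for value in logs]

# Hypothetical example: camera 0 draws four times the viewer attention.
rates = allocate_rates([4.0, 1.0, 1.0], [1.0, 1.0, 1.0], 6.0)
```

The rates always sum to the budget, and a camera with higher attention weight receives more bits. A practical allocator would also clamp negative rates to zero and redistribute the remainder.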
We propose a new compression method devoted to large structured hexahedral meshes having discontinuities. It is dedicated to applications such as visualization or physical simulations whose management by any workstation or mobile device with limited memory and bandwidth is critical. Our method relies on a multiresolution analysis that generates a hierarchy of meshes at increasing resolutions. Our...
We propose a software solution which allows the user to design a realistic illumination for a given 2D image of a face. The user paints a few strokes on the image to give clues of desired novel lighting effects. The algorithm produces an image of the face under the best possible realistic illumination, accordingly. It takes advantage of a 3D Morphable Model framework and a state of the art inverse...
A discriminative dictionary learning algorithm is proposed to find sparse signal representations using relative attributes as the available semantic information. In contrast, existing (discriminative) dictionary learning (DDL) approaches mostly utilize binary label information to enhance the discriminative property of the signal reconstruction residual, the sparse coding vectors or both. Compared...
This paper proposes a framework for tracking multiple fluorescent objects in 2D + time video-microscopy. We present a novel batch-processing track-before-detect multiple object tracking approach based on a spatio-temporal marked point process model of ellipses. Our approach takes into account events such as births, deaths, splits and merges of objects which are motivated by the biological and physical...
Regular omnidirectional video encoding techniques use map projection to flatten a scene from a spherical shape into one or several 2D shapes. Common projection methods, including equirectangular and cubic projection, have varying levels of interpolation that create a large number of non-information-carrying pixels, which leads to wasted bitrate. In this paper, we propose a tile based omnidirectional video...
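To make the redundancy of equirectangular projection concrete, the sketch below maps a viewing direction to pixel coordinates and estimates how much a row is horizontally oversampled relative to the equator. The function names `equirect_pixel` and `row_redundancy` are hypothetical, and the sketch covers only the standard equirectangular mapping, not the tile-based scheme proposed in the paper above.

```python
import math

def equirect_pixel(lon, lat, width, height):
    """Map longitude in [-pi, pi) and latitude in [-pi/2, pi/2] to (u, v) pixels."""
    u = (lon + math.pi) / (2 * math.pi) * width
    v = (math.pi / 2 - lat) / math.pi * height
    return u, v

def row_redundancy(lat):
    """Horizontal oversampling factor of the row at latitude lat.

    Every image row has the same number of pixels, but the circle of
    latitude it covers shrinks with cos(lat), so the factor is 1 / cos(lat).
    The cosine is floored to avoid division by zero at the poles.
    """
    return 1.0 / max(math.cos(lat), 1e-9)

center = equirect_pixel(0.0, 0.0, 4096, 2048)
factor_at_60_degrees = row_redundancy(math.pi / 3)
```

At 60 degrees of latitude each row carries roughly twice as many pixels as the sphere needs there, and the excess grows without bound towards the poles.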
We develop an unsupervised graph clustering and image segmentation algorithm based on non-negative matrix factorization. We consider arbitrarily represented visual signals (in 2D or 3D) and use a graph embedding approach for image or point cloud segmentation. We extend a Projective Non-negative Matrix Factorization variant to include local spatial relationships over the image graph. By using properly...
In image compression, block-based transforms tend to be inefficient when blocks contain arbitrarily shaped discontinuities. For this reason, transforms incorporating directional information are an appealing alternative. Starting from the graph Fourier transform, in this paper we present a new transform, called Subspace-Sparsifying Steerable DCT, that can be obtained by rotating the basis vectors...
In this paper, we propose a novel, full-body, real-time 3D reconstruction framework that makes use of pre-scanned body parts (more precisely pre-scanned 3D heads) so as to provide a more detailed 3D reconstruction mainly in the semantically important head area. Our framework deals with 3 major challenges: a) multiple depth sensors collaboration, b) pre-scanned head positioning and c) reconstruction...
This paper proposes a new temporal consistency measure for quality assessment of synthesized video. Disocclusion regions appear as hole regions in the synthesized video at virtual viewpoints. Filling hole regions could be problematic when the synthesized video is perceived through multi-view displays. In particular, the temporal inconsistency caused by the hole-filling process in view synthesis could affect...
Document is unavailable: This DOI was registered to an article that was not presented by the author(s) at this conference. As per section 8.2.1.B.13 of IEEE's "Publication Services and Products Board Operations Manual," IEEE has chosen to exclude this article from distribution. We regret any inconvenience.
3D reconstruction is an essential step in measuring craniofacial morphological changes from a historical growth database that contains only 2D cephalograms. In this paper, we propose a novel regression-forest-based method to estimate the volumetric intensity images from a lateral cephalogram. The regression forest can produce a prediction of the volumetric craniofacial structure as a mixture of Gaussian...
A cryo-electron microscopy dataset is composed of tomographic projections of an object (e.g. a macromolecule). The projection orientation information is unknown. The scope of this paper is the projection parameterization in the case of a deformable object. An overview of the parameterization methods is presented. Then a new approach based on manifold learning is detailed. Finally, an evaluation method...
For human identification, facial motion is useful in representing a specific dynamic signature. In this paper, we present an effective spatio-temporal representation from facial motion as well as appearance by devising a 3D convolutional neural network (CNN). To maintain the intra-class invariance with a limited number of training samples, a multi-task learning approach with human attributes, which are...
This paper investigates the properties of the common self-polar triangle of separate coplanar circles and applies them to camera calibration. We find that any two separate circles have a unique common self-polar triangle. In particular, we show that one vertex of the common self-polar triangle lies on the line at infinity. Given three separate circles, the line at infinity can be recovered using the...
We propose a nuclear-norm regularized two-dimensional neighborhood preserving projection (2DNPP) for extracting representative 2D image features. Note that 2DNPP extracts neighborhood preserving features through minimizing the reconstruction error, but the Frobenius norm based metric is sensitive to noise and outliers. To make the distance metric more reliable and model the neighborhood reconstruction...
Segmentation of moving objects in a scene is difficult for non-stationary cameras, and especially challenging in the presence of fast and unstable egomotion, e.g., as encountered with car-mounted cameras or wearable devices. Based on an analysis of motion vanishing points of the scene and estimated depth, a geometric model that relates extracted 2D motion to a 3D motion field relative to the camera...
Scale invariance has proven a crucial concept in texture modeling and analysis. Isotropic and self-similar fractional Brownian fields (2D-fBf) are often used as the natural reference process to model scale-free textures. Their analysis is typically conducted using the 2D discrete wavelet transform. Generalizations of 2D-fBf were considered independently in two respects: Anisotropy in the texture is...