The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
We address the problem of determining correspondences between two images in agreement with a geometric model such as an affine or thin-plate spline transformation, and estimating its parameters. The contributions of this work are three-fold. First, we propose a convolutional neural network architecture for geometric matching. The architecture is based on three main components that mimic the standard...
This paper focuses on semantic scene completion, a task for producing a complete 3D voxel representation of volumetric occupancy and semantic labels for a scene from a single-view depth map observation. Previous work has considered scene completion and semantic labeling of depth maps separately. However, we observe that these two problems are tightly intertwined. To leverage the coupled nature of...
Matching local geometric features on real-world depth images is a challenging task due to the noisy, low-resolution, and incomplete nature of 3D scan data. These difficulties limit the performance of current state-of-art methods, which are typically based on histograms over geometric properties. In this paper, we present 3DMatch, a data-driven model that learns a local volumetric patch descriptor...
This paper proposes a new extrinsic calibration of kaleidoscopic imaging system by estimating normals and distances of the mirrors. The problem to be solved in this paper is a simultaneous estimation of all mirror parameters consistent throughout multiple reflections. Unlike conventional methods utilizing a pair of direct and mirrored images of a reference 3D object to estimate the parameters on a...
3D shape models are naturally parameterized using vertices and faces, i.e., composed of polygons forming a surface. However, current 3D learning paradigms for predictive and generative tasks using convolutional neural networks focus on a voxelized representation of the object. Lifting convolution operators from the traditional 2D to 3D results in high computational overhead with little additional...
This paper presents a novel algorithm that utilizes a 2D floorplan to align panorama RGBD scans. While effective panorama RGBD alignment techniques exist, such a system requires extremely dense RGBD image sampling. Our approach can significantly reduce the number of necessary scans with the aid of a floorplan image. We formulate a novel Markov Random Field inference problem as a scan placement over...
We propose a novel, practical solution for high quality reconstruction of axially-symmetric transparent objects. While a special case, such transparent objects are ubiquitous in the real world. Common examples of these are glasses, goblets, tumblers, carafes, etc., that can have very unique and visually appealing forms making their reconstruction interesting for vision and graphics applications. Our...
In this paper, we establish a theoretical connection between the classical Lucas & Kanade (LK) algorithm and the emerging topic of Spatial Transformer Networks (STNs). STNs are of interest to the vision and learning communities due to their natural ability to combine alignment and classification within the same theoretical framework. Inspired by the Inverse Compositional (IC) variant of the LK...
In this paper we study the problem of automatically generating polynomial solvers for minimal problems. The main contribution is a new method for finding small elimination templates by making use of the syzygies (i.e. the polynomial relations) that exist between the original equations. Using these syzygies we can essentially parameterize the set of possible elimination templates. We evaluate our method...
Structure-from-Motion (SfM) methods can be broadly categorized as incremental or global according to their ways to estimate initial camera poses. While incremental system has advanced in robustness and accuracy, the efficiency remains its key challenge. To solve this problem, global reconstruction system simultaneously estimates all camera poses from the epipolar geometry graph, but it is usually...
Generation of 3D data by deep neural network has been attracting increasing attention in the research community. The majority of extant works resort to regular representations such as volumetric grids or collection of images, however, these representations obscure the natural invariance of 3D shapes under geometric transformations, and also suffer from a number of other issues. In this paper we address...
The rational camera model recently introduced in [18] provides a general methodology for studying abstract nonlinear imaging systems and their multi-view geometry. This paper builds on this framework to study physical realizations of rational cameras. More precisely, we give an explicit account of the mapping between between physical visual rays and image points (missing in the original description),...
A minimal solution using two affine correspondences is presented to estimate the common focal length and the fundamental matrix between two semi-calibrated cameras – known intrinsic parameters except a common focal length. To the best of our knowledge, this problem is unsolved. The proposed approach extends point correspondence-based techniques with linear constraints derived from local...
Previous approaches for scene text detection have already achieved promising performances across various benchmarks. However, they usually fall short when dealing with challenging scenarios, even when equipped with deep neural network models, because the overall performance is determined by the interplay of multiple stages and components in the pipelines. In this work, we propose a simple yet powerful...
Despite significant progress of deep learning in recent years, state-of-the-art semantic matching methods still rely on legacy features such as SIFT or HoG. We argue that the strong invariance properties that are key to the success of recent deep architectures on the classification task make them unfit for dense correspondence tasks, unless a large amount of supervision is used. In this work, we propose...
Spectral Clustering is one of pioneered clustering methods in machine learning and pattern recognition field. It relies on the spectral decomposition criterion to learn a low-dimensonal embedding of data for a basic clustering algorithm such as the k-means. The recent sparse Spectral clustering (SSC) introduces the sparsity for the similarity in low-dimensional space by enforcing a sparsity-induced...
We present a method for synthesizing a frontal, neutral-expression image of a persons face, given an input face photograph. This is achieved by learning to generate facial landmarks and textures from features extracted from a facial-recognition network. Unlike previous generative approaches, our encoding feature vector is largely invariant to lighting, pose, and facial expression. Exploiting this...
We present a new insight into the systematic generation of minimal solvers in computer vision, which leads to smaller and faster solvers. Many minimal problem formulations are coupled sets of linear and polynomial equations where image measurements enter the linear equations only. We show that it is useful to solve such systems by first eliminating all the unknowns that do not appear in the linear...
Blind deconvolution is the problem of recovering a convolutional kernel and an activation signal from their convolution y = a0 * x0. This problem is ill-posed without further constraints or priors. This paper studies the situation where the nonzero entries in the activation signal are sparsely and randomly populated. We normalize the convolution kernel to have unit Frobenius norm and cast the sparse...
3D Reconstruction from shading information through Photometric Stereo is considered a very challenging problem in Computer Vision. Although this technique can potentially provide highly detailed shape recovery, its accuracy is critically dependent on a numerous set of factors among them the reliability of the light sources in emitting a constant amount of light. In this work, we propose a novel variational...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.