This paper presents a novel self-similarity based approach for the problem of vanishing point estimation in man-made scenes. A vanishing point (VP) is the convergence point of a pencil (a concurrent line set), that is a perspective projection of a corresponding parallel line set in the scene. Unlike traditional VP detection that relies on extraction and grouping of individual straight lines, our approach...
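To illustrate the definition above (a vanishing point as the common point of a pencil of image lines), here is a minimal sketch of the traditional line-based estimate, not the self-similarity approach this abstract proposes. It assumes line segments are already given as endpoint pairs, that the vanishing point is finite, and it uses NumPy; the helper names `line_through` and `vanishing_point` are hypothetical.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line (a, b, c) through two image points p and q."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def vanishing_point(lines):
    """Least-squares common point of a pencil of homogeneous lines.

    Assumption: the vanishing point is finite (third coordinate non-zero).
    """
    A = np.asarray(lines, dtype=float)
    # Scale each line so that a*x + b*y + c is a point-to-line distance.
    A = A / np.linalg.norm(A[:, :2], axis=1, keepdims=True)
    # The right-singular vector with the smallest singular value minimizes ||A v||.
    _, _, vt = np.linalg.svd(A)
    v = vt[-1]
    return v[:2] / v[2]

# Three segments whose supporting lines all pass through (100, 50).
segments = [((0, 0), (50, 25)), ((0, 100), (50, 75)), ((0, 50), (10, 50))]
vp = vanishing_point([line_through(p, q) for p, q in segments])
print(vp)  # approximately (100, 50)
```

With noisy segments the same SVD step returns the point that minimizes the sum of squared distances to the lines, which is why it is a common baseline for line-based methods.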
The scarcity and infeasibility of human supervision for large-scale multi-class classification problems necessitate active learning. Unfortunately, existing active learning methods for multi-class problems are inherently binary methods and do not scale up to a large number of classes. In this paper, we introduce a probabilistic variant of the K-nearest neighbor method for classification that can be seamlessly...
This paper discusses the question: Can we improve the recognition of objects by using their spatial context? We start from Bag-of-Words models and use the Pascal 2007 dataset. We use the rough object bounding boxes that come with this dataset to investigate the fundamental gain context can bring. Our main contributions are: (I) The result of Zhang et al. in CVPR07 that context is superfluous derived...
We propose an approach to overcome the two main challenges of 3D multiview object detection and localization: The variation of object features due to changes in the viewpoint and the variation in the size and aspect ratio of the object. Our approach proceeds in three steps. Given an initial bounding box of fixed size, we first refine its aspect ratio and size. We can then predict the viewing angle,...
In this work, we present a non-rigid approach to jointly solve the tasks of 2D-3D pose estimation and 2D image segmentation. In general, most frameworks that couple pose estimation and segmentation assume that one has exact knowledge of the 3D object. However, in non-ideal conditions, this assumption may be violated if only the general class to which a given shape belongs is known (e.g...
Various powerful people detection methods exist. Surprisingly, most approaches rely on static image features only despite the obvious potential of motion information for people detection. This paper systematically evaluates different features and classifiers in a sliding-window framework. First, our experiments indicate that incorporating motion information improves detection performance significantly...
In this paper we consider the problem of object parsing, namely detecting an object and its components by composing them from image observations. Apart from object localization, this involves the question of combining top-down (model-based) with bottom-up (image-based) information. We use a hierarchical object model that recursively decomposes an object into simple structures. Our first contribution...
In this work we propose a convex relaxation approach for computing minimal partitions. Our approach is based on rewriting the minimal partition problem (also known as Potts model) in terms of a primal dual Total Variation functional. We show that the Potts prior can be incorporated by means of convex constraints on the dual variables. For minimization we propose an efficient primal dual projected...
Markov random field (MRF, CRF) models are popular in computer vision. However, in order to be computationally tractable they are limited to incorporating only local interactions and cannot model global properties, such as connectedness, which is a potentially useful high-level prior for object segmentation. In this work, we overcome this limitation by deriving a potential function that enforces the...
Symmetry is an important cue for machine perception that involves high-level knowledge of image components. Unlike most of the previous research that only computes symmetry in an image, this paper integrates symmetry with image segmentation to improve the segmentation performance. The symmetry integration is used to optimize both the segmentation and the symmetry of regions simultaneously. Interesting...
We consider regions of images that exhibit smooth statistics, and pose the question of characterizing the "essence" of these regions that matters for recognition. Ideally, this would be a statistic (a function of the image) that does not depend on viewpoint and illumination, and yet is sufficient for the task. In this manuscript, we show that such statistics exist. That is, one can compute...
Accurately identifying corresponding landmarks across a population of shape instances is the major challenge in constructing statistical shape models. In general, shape-correspondence methods can be grouped into one of two categories: global methods and pair-wise methods. In this paper, we develop a new method that attempts to address the limitations of both the global and pair-wise methods. In particular,...
The aim of this work is to learn a shape prior model for an object class and to improve shape matching with the learned shape prior. Given images of example instances, we can learn a mean shape of the object class as well as the variations of non-affine and affine transformations separately based on the thin plate spline (TPS) parameterization. Unlike previous methods, for learning, we represent shapes...
In this paper, we introduce a novel algorithm to solve global shape registration problems. We use gray-scale "images" to represent source shapes, and propose a novel two-component Gaussian Mixtures (GM) distance map representation for target shapes. Based on this flexible asymmetric image-based representation, a new energy function is defined. It proves to be a more robust shape dissimilarity...
Graph matching is an important problem in computer vision. It is used in 2D and 3D object matching and recognition. Despite its importance, there is little literature on learning the parameters that control the graph matching problem, even though learning is important for improving the matching rate, as shown by this and other work. In this paper we show for the first time how to perform parameter...
We present a new method for classification with structured latent variables. Our model is formulated using the max-margin formalism in the discriminative learning literature. We propose an efficient learning algorithm based on the cutting plane method and decomposed dual optimization. We apply our model to the problem of recognizing human actions from video sequences, where we model a human action...
Discovering the underlying low-dimensional latent structure in high-dimensional perceptual observations (e.g., images, video) can, in many cases, greatly improve performance in recognition and tracking. However, non-linear dimensionality reduction methods are often susceptible to local minima and perform poorly when initialized far from the global optimum, even when the intrinsic dimensionality is...
Feature selection plays a fundamental role in many pattern recognition problems. However, most efforts have focused on the supervised scenario, while unsupervised feature selection remains a rarely explored research topic. In this paper, we propose manifold-based maximum margin feature selection (M3FS) to select the most discriminative features for clustering. M3FS aims to find those features...
Multi-instance multi-label learning (MIML) refers to the learning problems where each example is represented by a bag/collection of instances and is labeled by multiple labels. An example application of MIML is visual object recognition in which each image is represented by multiple key points (i.e., instances) and is assigned to multiple object categories. In this paper, we study the problem of learning...
Many problems in computer vision can be modeled using conditional Markov random fields (CRF). Since finding the maximum a posteriori (MAP) solution in such models is NP-hard, much attention in recent years has been placed on finding good approximate solutions. In particular, graph-cut based algorithms, such as α-expansion, are tremendously successful at solving problems with regular potentials. However,...