The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
User feedback helps an image search system refine its relevance predictions, tailoring the search towards the user's preferences. Existing methods simply take feedback at face value: clicking on an image means the user wants things like it, commenting that an image lacks a specific attribute means the user wants things that have it. However, we expect there is actually more information behind the...
Many object detectors, such as AdaBoost, SVM and deformable part-based models (DPM), compute additive scoring functions at a large number of windows scanned over image pyramid, thus computational efficiency is an important consideration beside accuracy performance. In this paper, we present a framework of learning cost-sensitive decision policy which is a sequence of two-sided thresholds to execute...
Geometry and geography can play an important role in recognition tasks in computer vision. To aid in studying connections between geometry and recognition, we introduce NYC3DCars, a rich dataset for vehicle detection in urban scenes built from Internet photos drawn from the wild, focused on densely trafficked areas of New York City. Our dataset is augmented with detailed geometric and geographic information,...
Domain-invariant representations are key to addressing the domain shift problem where the training and test examples follow different distributions. Existing techniques that have attempted to match the distributions of the source and target domains typically compare these distributions in the original feature space. This space, however, may not be directly suitable for such a comparison, since some...
Computing distances between large sets of SIFT descriptors is a basic step in numerous algorithms in computer vision. When the number of descriptors is large, as is often the case, computing these distances can be extremely time consuming. In this paper we propose the SIFT pack: a compact way of storing SIFT descriptors, which enables significantly faster calculations between sets of SIFTs than the...
We describe Photo OCR, a system for text extraction from images. Our particular focus is reliable text extraction from smartphone imagery, with the goal of text recognition as a user input modality similar to speech recognition. Commercially available OCR performs poorly on this task. Recent progress in machine learning has substantially improved isolated character classification, we build on this...
We propose an unsupervised detector adaptation algorithm to adapt any offline trained face detector to a specific collection of images, and hence achieve better accuracy. The core of our detector adaptation algorithm is a probabilistic elastic part (PEP) model, which is offline trained with a set of face examples. It produces a statistically aligned part based face representation, namely the PEP representation...
In multi-label image annotations, because each image is associated to multiple categories, the semantic terms (label classes) are not mutually exclusive. Previous research showed that such label correlations can largely boost the annotation accuracy. However, all existing methods only directly apply the label correlation matrix to enhance the label inference and assignment without further learning...
Capturing the essential characteristics of visual objects by considering how their features are inter-related is a recent philosophy of object classification. In this paper, we embed this principle in a novel image descriptor, dubbed Heterogeneous Auto-Similarities of Characteristics (HASC). HASC is applied to heterogeneous dense features maps, encoding linear relations by co variances and nonlinear...
This paper presents a hierarchical method for finding correspondences in non-rigid shapes. We propose a new representation for 3D meshes: the decomposition tree. This structure characterizes the recursive decomposition process of a mesh into regions of interest and key points. The internal nodes contain regions of interest (which may be recursively decomposed) and the leaf nodes contain the key points...
Many computer vision problems have an asymmetric distribution of information between training and test time. In this work, we study the case where we are given additional information about the training data, which however will not be available at test time. This situation is called learning using privileged information (LUPI). We introduce two maximum-margin techniques that are able to make use of...
We propose an on-line algorithm to extract a human by foreground/background segmentation and estimate pose of the human from the videos captured by moving cameras. We claim that a virtuous cycle can be created by appropriate interactions between the two modules to solve individual problems. This joint estimation problem is divided into two sub problems, foreground/background segmentation and pose...
In this paper, we develop a generative model to describe the layouts of outdoor scenes - the spatial configuration of regions. Specifically, the layout of an image is represented as a composite of regions, each associated with a semantic topic. At the heart of this model is a novel stochastic process called Spatial Topic Process, which generates a spatial map of topics from a set of coupled Gaussian...
Joint segmentation of image sets has great importance for object recognition, image classification, and image retrieval. In this paper, we aim to jointly segment a set of images starting from a small number of labeled images or none at all. To allow the images to share segmentation information with each other, we build a network that contains segmented as well as unsegmented images, and extract functional...
We present a hybrid parametric and nonparametric algorithm, exemplar cut, for generating class-specific object segmentation hypotheses. For the parametric part, we train a pylon model on a hierarchical region tree as the energy function for segmentation. For the nonparametric part, we match the input image with each exemplar by using regions to obtain a score which augments the energy function from...
Statistical shape analysis develops methods for comparisons, deformations, summarizations, and modeling of shapes in given data sets. These tasks require a fundamental tool called parallel transport of tangent vectors along arbitrary paths. This tool is essential for: (1) computation of geodesic paths using either shooting or path-straightening method, (2) transferring deformations across objects,...
In this paper, we propose a novel perception-based shape decomposition method which aims to decompose a shape into semantically meaningful parts. In addition to three popular perception rules (the Minima rule, the Short-cut rule and the Convexity rule) in shape decomposition, we propose a new rule named part-similarity rule to encourage consistent partition of similar parts. The problem is formulated...
One fundamental assumption in object recognition as well as in other computer vision and pattern recognition problems is that the data generation process lies on a manifold and that it respects the intrinsic geometry of the manifold. This assumption is held in several successful algorithms for diffusion and regularization, in particular, in graph-Laplacian-based algorithms. We claim that the performance...
Dimensionality reduction for vectors in sequences is challenging since labels are attached to sequences as a whole. This paper presents a model-based dimensionality reduction method for vector sequences, namely linear sequence discriminant analysis (LSDA), which attempts to find a subspace in which sequences of the same class are projected together while those of different classes are projected as...
Over the last years, several authors have signaled that state of the art categorization methods fail to perform well when trained and tested on data from different databases. The general consensus in the literature is that this issue, known as domain adaptation and/or dataset bias, is due to a distribution mismatch between data collections. Methods addressing it go from max-margin classifiers to learning...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.