Stereo matching commonly requires rectified images that are computed from calibrated cameras. Since all underlying parametric camera models are only approximations, calibration and rectification will never be perfect. Additionally, it is very hard to keep the calibration perfectly stable in application scenarios with large temperature changes and vibrations. We show that even small calibration errors...
Radiometric variations between input images can seriously degrade the performance of stereo matching algorithms. In this situation, mutual information is a very popular and powerful measure which can find any global relationship of intensities between two input images taken from unknown sources. The mutual information-based method, however, is still ambiguous or erroneous as regards local radiometric...
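To illustrate why mutual information tolerates a global change of intensities, here is a minimal sketch that computes it from a joint histogram of corresponding pixel values. It is not the method of the abstract above: the function name `mutual_information` and the toy intensity lists are assumptions made for illustration only.

```python
import math
from collections import Counter


def mutual_information(left, right):
    """Mutual information (in bits) between two equal-length intensity lists.

    Illustrative helper, not taken from the paper: each index is treated as
    one pair of corresponding pixels.
    """
    if len(left) != len(right) or not left:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(left)
    joint = Counter(zip(left, right))   # joint histogram of intensity pairs
    hist_left = Counter(left)           # marginal histogram, left image
    hist_right = Counter(right)         # marginal histogram, right image
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p(a,b) / (p(a) p(b)) written with raw counts
        ratio = count * n / (hist_left[a] * hist_right[b])
        mi += p_ab * math.log2(ratio)
    return mi


# Toy data (assumed): four intensity levels, two pixels each.
left = [0, 0, 1, 1, 2, 2, 3, 3]
# A global one-to-one intensity remapping, e.g. a different camera response.
right_remapped = [255 - 40 * v for v in left]
# Intensities that are statistically independent of `left`.
right_unrelated = [7, 9, 7, 9, 7, 9, 7, 9]

mi_remapped = mutual_information(left, right_remapped)
mi_unrelated = mutual_information(left, right_unrelated)
```

With this toy data the remapped pair keeps the full 2 bits of information, while the unrelated pair scores 0. Because the remapping here is global, the sketch does not cover the local radiometric variations that the abstract identifies as the remaining difficulty.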
We present a new approach to iteratively estimate both a high-quality depth map and an alpha matte from a single image or a video sequence. Scene depth, which is invariant to illumination changes, color similarity and motion ambiguity, provides a natural and robust cue for foreground/background segmentation, a prerequisite for matting. The image mattes, on the other hand, encode rich information near...
In this paper, we propose a novel approach for learning generic visual vocabulary. We use diffusion maps to automatically learn a semantic visual vocabulary from abundant quantized midlevel features. Each midlevel feature is represented by the vector of pointwise mutual information (PMI). In this midlevel feature space, we believe the features produced by similar sources must lie on a certain manifold...
This paper presents a novel approach to skim and describe 3D videos. 3D video is an imaging technology which consists of a stream of 3D models in motion captured by a synchronized set of video cameras. Each frame is composed of one or several 3D models, and therefore the acquisition of long sequences at video rate requires massive storage devices. In order to reduce the storage cost while keeping...
We present a new approach for the discriminative training of continuous-valued Markov Random Field (MRF) model parameters. In our approach we train the MRF model by optimizing the parameters so that the minimum energy solution of the model is as similar as possible to the ground-truth. This leads to parameters which are directly optimized to increase the quality of the MAP estimates during inference...
We propose a novel formulation of stereo matching that considers each pixel as a feature vector. Under this view, matching two or more images can be cast as matching point clouds in feature space. We build a nonparametric depth smoothness model in this space that correlates the image features and depth values. This model induces a sparse graph that links pixels with similar features, thereby converting...
Spatiotemporal stereo is concerned with the recovery of the 3D structure of a dynamic scene from a temporal sequence of multiview images. This paper presents a novel method for computing temporally coherent disparity maps from a sequence of binocular images through an integrated consideration of image spacetime structure and without explicit recovery of motion. The approach is based on matching spatiotemporal...
We propose an algorithm that simultaneously extracts disparities and alpha matting information given a stereo image pair. Our method divides the reference image into a set of overlapping, partially transparent color segments. Each segment pixel is assigned an alpha value and color. The disparity inside the segment is modeled via a plane. The goodness of alphas, colors and disparity planes is measured...
This paper presents a technique for reconstructing a high-quality high dynamic range (HDR) image from a set of differently exposed and possibly blurred images taken with a hand-held camera. Recovering an HDR image from differently exposed photographs has become very popular. However, it often requires a tripod to keep the camera still when taking photographs of different exposures. To ease the process,...
We propose an approach to restore severely degraded document images using a probabilistic context model. Unlike traditional approaches that use previously learned prior models to restore an image, we are able to learn the text model from the degraded document itself, making the approach independent of script, font, style, etc. We model the contextual relationship using an MRF. The ability to work...
When imaging in scattering media there is a need to enhance visibility. Some approaches have used polarized images in this context with apparent success. These methods take advantage of the fact that the path radiance (air light) is partially polarized. However, mounting a polarizer attenuates the signal associated with the object. This attenuation degrades the image quality. Thus, a question arises:...
This paper describes a novel graphical model approach to seamlessly coupling and simultaneously analyzing facial emotions and the action units. Our method is based on the hidden conditional random fields (HCRFs) where we link the output class label to the underlying emotion of a facial expression sequence, and connect the hidden variables to the image frame-wise action units. As HCRFs are formulated...
We present a new multi-stage algorithm for car and truck detection from a moving vehicle. The algorithm performs a search for pertinent features in three dimensions, guided by a ground plane and lane boundary estimation sub-system, and assembles these features into vehicle hypotheses. A number of classifiers are applied to the hypotheses in order to remove false detections. Quantitative analysis on...
Current work in object categorization discriminates among objects that typically possess gross differences which are readily apparent. However, many applications require making much finer distinctions. We address an insect categorization problem that is so challenging that even trained human experts cannot readily categorize images of insects considered in this paper. The state of the art that uses...
We propose a new bilateral filtering algorithm with computational complexity invariant to filter kernel size, so-called O(1) or constant time in the literature. By showing that a bilateral filter can be decomposed into a number of constant time spatial filters, our method yields a new class of constant time bilateral filters that can have arbitrary spatial and arbitrary range kernels. In contrast,...
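For reference, the sketch below shows the brute-force bilateral filter that constant-time methods aim to accelerate. It is not the decomposition described in the abstract above. Its per-sample cost grows with the kernel radius, which is exactly the dependence an O(1) method removes. The 1D setting, the Gaussian kernels, and all names and parameter values are assumptions chosen for brevity.

```python
import math


def bilateral_filter_1d(signal, radius, sigma_s, sigma_r):
    """Brute-force 1D bilateral filter with Gaussian spatial and range kernels.

    Illustrative baseline only: the cost per sample is proportional to
    `radius`, unlike the constant-time filters discussed in the abstract.
    """
    n = len(signal)
    out = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Spatial kernel: penalizes distance between sample positions.
            w_spatial = math.exp(-((j - i) ** 2) / (2 * sigma_s ** 2))
            # Range kernel: penalizes difference between sample values.
            w_range = math.exp(-((signal[j] - signal[i]) ** 2) / (2 * sigma_r ** 2))
            w = w_spatial * w_range
            weighted_sum += w * signal[j]
            weight_total += w
        # weight_total >= 1 because the center sample always has weight 1.
        out.append(weighted_sum / weight_total)
    return out


# Toy step edge (assumed data): the range kernel keeps the two sides apart.
step = [0.0] * 5 + [100.0] * 5
smoothed = bilateral_filter_1d(step, radius=2, sigma_s=1.0, sigma_r=10.0)
```

On this step signal the samples on either side of the edge stay close to 0 and 100 respectively, because the range kernel gives almost no weight to values across the edge.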
Color is a powerful visual cue for many computer vision applications such as image segmentation and object recognition. However, most of the existing color models depend on the imaging conditions, which negatively affects the performance of the task at hand. Often, a reflection model (e.g., Lambertian or dichromatic reflectance) is used to derive color invariant models. However, those reflection models...
A texture descriptor is proposed, which combines local highly discriminative features with the global statistics of fractal geometry to achieve both high descriptive power and invariance to geometric and illumination transformations. As local measurements, SIFT features are estimated densely at multiple window sizes and discretized. On each of the discretized measurements the fractal dimension is...
Edge-based color constancy makes use of image derivatives to estimate the illuminant. However, different edge types exist in real-world images such as shadow, geometry, material and highlight edges. These different edge types may have a distinctive influence on the performance of the illuminant estimation.
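As a rough illustration of estimating the illuminant from image derivatives, the sketch below averages absolute first-order differences per color channel and normalizes the result. This is the simplest gray-edge style estimate, with no smoothing and no weighting by edge type. The function name, the tiny synthetic image, and the illuminant values are assumptions for illustration and do not come from the abstract above.

```python
import math


def gray_edge_illuminant(image):
    """Estimate the illuminant color from absolute image derivatives.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Illustrative simplification: forward differences, all edges weighted
    equally. Returns a unit-length [r, g, b] estimate.
    """
    height = len(image)
    width = len(image[0])
    sums = [0.0, 0.0, 0.0]
    for y in range(height):
        for x in range(width):
            for c in range(3):
                if x + 1 < width:   # horizontal forward difference
                    sums[c] += abs(image[y][x + 1][c] - image[y][x][c])
                if y + 1 < height:  # vertical forward difference
                    sums[c] += abs(image[y + 1][x][c] - image[y][x][c])
    norm = math.sqrt(sum(s * s for s in sums))
    if norm == 0.0:
        raise ValueError("image has no edges to estimate from")
    return [s / norm for s in sums]


# Synthetic scene (assumed): gray reflectances lit by a reddish illuminant.
reflectance = [[0.2, 0.8], [0.6, 0.4]]
illuminant = (1.0, 0.5, 0.5)
image = [[tuple(r * l for l in illuminant) for r in row] for row in reflectance]

estimate = gray_edge_illuminant(image)
```

Because every edge in this synthetic scene is a pure reflectance change under one illuminant, the estimate recovers the illuminant's channel ratios. In real images, shadow, geometry, material and highlight edges would contribute differently, which is the issue the abstract raises.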
Boosted one-versus-all (OVA) classifiers are commonly used in multiclass problems, such as generic object recognition, biometrics-based identification, or gesture recognition. JointBoost is a recently proposed method where OVA classifiers are trained jointly and are forced to share features. JointBoost has been demonstrated to lead to both higher accuracy and shorter classification time, compared...