We address the problem of joint detection and segmentation of multiple object instances in an image, a key step towards scene understanding. Inspired by data-driven methods, we propose an exemplar-based approach to the task of multi-instance segmentation using a small set of annotated reference images. We design a novel CRF model that jointly models object appearance, shape deformation, and object...
This paper introduces a novel method for categorical image labeling, where each pixel is uniquely assigned to one of a set of unordered, discrete labels. Starting from provided label-dependent pixel likelihoods, we (a) exploit a segment hierarchy as spatial support to define powerful priors and (b) introduce an efficient and effective inference method that can be implemented in a few lines of code...
In this paper, we give a convex optimization approach for scene understanding. Since segmentation, object recognition, and scene labeling strongly benefit from each other, we propose to solve these tasks within a single convex optimization problem. In contrast to previous approaches, we do not rely on pre-processing techniques such as object detectors or superpixels. The central idea is to integrate...
In this paper, we introduce the concept of proximity priors into semantic segmentation in order to discourage the presence of certain object classes (such as 'sheep' and 'wolf') 'in the vicinity' of each other. 'Vicinity' encompasses spatial distance as well as specific spatial directions simultaneously, e.g. 'plates' are found directly above 'tables', but do not fly over them. In this sense, our...
In this paper, we propose a novel saliency-aware stereo image segmentation approach using high-order energy terms, which utilizes the disparity map and statistical information of stereo images to enrich the high-order potentials. To the best of our knowledge, our approach is the first to formulate automatic stereo cut as a high-order energy optimization problem, which simultaneously segments...
This paper describes a method for accurately interpolating a low-resolution depth image using a high-resolution color image. In our method, tangent planes on each superpixel are first estimated from the sparse depth information and dense color information. Then, neighboring superpixels whose tangent planes can be smoothly connected are joined, and the image segmentation to smooth surfaces are...
We present a method to convert a digital single-lens-reflex (DSLR) camera into a high-resolution consumer depth and light field camera by affixing an external aperture mask to the main lens. Compared to existing consumer depth and light field cameras, our camera is easy to construct at minimal additional cost, and our design is camera- and lens-agnostic. The main advantage of our design is the...
This paper presents an adaptive cooperative approach to 3D reconstruction tailored for a bio-inspired depth camera: the stereo dynamic vision sensor (DVS). The DVS consists of self-spiking pixels that asynchronously generate events upon relative light intensity changes. These sensors have the advantage of simultaneously offering high temporal resolution (better than 10 µs) and wide dynamic range...
Planes are dominant in most indoor and outdoor scenes and the development of a hybrid algorithm that incorporates both point and plane features provides numerous advantages. In this regard, we present a tracking algorithm for RGB-D cameras using both points and planes as primitives. We show how to extend the standard prediction-and-correction framework to include planes in addition to points. By fitting...
Reliable and timely detection of abandoned items in public places remains an unsolved problem for automated visual surveillance. Typical surveilled scenarios are associated with high visual ambiguity, such as shadows, occlusions, illumination changes, and substantial clutter consisting of a mixture of dynamic and stationary objects. Motivated by these challenges, we propose a reliable left item...
We present a method for producing an accurate and compact 3-D face model in real time using a low-cost RGB-D sensor such as the Kinect camera. We extend and use Bump Images for highly accurate, memory-efficient 3-D reconstruction of the human face. Bump Images are generated by representing the Cartesian coordinates of points on the face in the spherical coordinate system whose origin is the...
Detection of positive and negative emotions can provide insight into a person's level of satisfaction and social responsiveness, as well as clues such as the need for help. Automatic perception of affect valence is therefore key to novel human-computer interaction applications. However, robust recognition with conventional 2D cameras is still not possible in realistic conditions, in the presence of high...