We advocate an approach to activity recognition based on modeling contextual interactions between postured human bodies and nearby objects. We focus on the difficult task of recognizing actions from static images and formulate the problem as a latent structured labeling problem. We develop a unified, discriminative model for such context-based action recognition building on recent techniques for learning...
This paper describes an algorithm enabling a human supervisor to convey task-level information to a robot by using stylus gestures to circle one or more objects within the field of view of a robot-mounted camera. These gestures serve to segment the unknown objects from the environment. Our method's main novelty lies in its use of appearance-based object “reacquisition” to reconstitute the supervisory...
In this article, we introduce a new method for augmenting a real object with virtual content under the influence of ambient light. For this purpose, we use a physically-based simulation and spectral data which is recorded separately for each of the components: projector, ambient light and real object. Then the simulation computes the influence of every component on the scene and uses the...
This paper presents a new online multi-classifier boosting algorithm for learning object appearance models. In many cases the appearance model is multi-modal, which we capture by training and updating multiple strong classifiers. The proposed algorithm jointly learns the classifiers and a soft partitioning of the input space, defining an area of expertise for each classifier. We show how this formulation...
Several recent works have explored the benefits of providing more detailed annotations for object recognition. These annotations provide information beyond object names, and allow a detector to reason about and describe individual instances in plain English. However, by demanding more specific details from annotators, new difficulties arise, such as stronger language dependencies and limited annotator attention...
Computer vision techniques have been widely applied to immersive and perceptual human-computer interaction for applications like computer gaming, education, and entertainment. In this paper, relevant techniques are surveyed in terms of image capturing, normalization, motion detection, tracking, feature representation and recognition. In addition, applications of vision techniques in HCI in computer...
This paper reviews the concept of straight skeletons, which is well known in computational geometry, and applies it to binary shapes that are used in vision-based shape and object recognition. We devise a novel algorithm for computing discrete straight skeletons from binary input images, which is based on a polygonal approximation of the input shape and a hybrid method that combines continuous and...
Object detection and recognition algorithms are an integral part of the architecture of many modern image processing systems employing Computer Vision (CV) techniques. In this paper we describe our work in the area of segmentation and recognition of simple objects in mobile phone imagery. Given an image of several objects on a structured background, we show how these objects can be segmented efficiently...
In this paper we present a system for mobile augmented reality (AR) based on visual recognition. We split the tasks of recognizing an object and tracking it on the user's screen into a server-side and a client-side task, respectively. The capabilities of this hybrid client-server approach are demonstrated with a prototype application on the Android platform, which is able to augment both stationary...
We address the problem of estimating pose in a static image of a human performing an action that may involve interaction with scene objects. In such scenarios, pose can be estimated more accurately using the knowledge of scene objects. Previous approaches do not make use of such contextual information. We propose Pose Context trees to jointly model human pose and object which allows both accurate...
Human action understanding and analysis for various applications are still in their infancy due to various factors. In this paper, for recognizing various complex activities, a combined cue for motion representation and later recognition is demonstrated, based on optical flow-based four-directional motion history and basic energy images. Optical flow between consecutive frames is computed to create...
Image encoding using interest points is a common technique in computer vision. In this paper we present a scale and rotation invariant shape centered interest point (SCIP) detector. By means of detecting singularities in Gradient Vector Flow (GVF) fields we find points of high symmetry in the image. Due to the nature of the underlying GVF field we can employ our features to group together edge-based...
Local invariant features have been widely used as fundamental elements for image matching and object recognition. Although dense sampling of local features is useful in achieving an improved performance in image matching and object recognition, it results in increased computational costs for feature extraction. The purpose of this paper is to develop fast computational techniques for extracting local...
We present an extensible platform that integrates state of the art computer vision techniques with mobile communications to deliver a portable visual assistance tool. Live input video from a mobile smartphone is streamed over a 3G or wireless connection while an object recognition engine on a desktop processes the data stream. Recognition results are returned in real-time to the mobile device and...
Visual impairment is a common problem for people worldwide. The projector-based AR technique has the ability to change the appearance of real objects, and it can help to improve visibility for the visually impaired. We propose a new framework for appearance enhancement with a projector camera system that employs a model predictive controller. This framework enables arbitrary image processing such...
In this paper we present several information-theoretic similarity measures for shape retrieval in combination with non-rigid registration processes. The challenging property of these measures is that they are bypass divergences, that is, they do not require the estimation of the probability density function for each shape. After presenting the dissimilarities and proposing some new ones, we analyze their...
Feature matching is a key, underlying component in many approaches to object detection, localization, and recognition. In many cases, feature matching is accomplished by nearest neighbor methods on extracted feature descriptors. This methodology works well for clean, out-of-water images; however, when imaging underwater, even an image of the same object can be drastically different due to varying...
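As a rough illustration of the nearest neighbor matching on feature descriptors mentioned in the abstract above, the sketch below pairs each query descriptor with its closest reference descriptor by Euclidean distance. The function name `match_nearest` and the toy two-dimensional descriptors are assumptions made for illustration only; they are not taken from the paper, which does not specify its descriptors or matching procedure in the text shown here.

```python
import math


def match_nearest(query, reference):
    """Brute-force nearest neighbor matching (hypothetical helper).

    Returns one (query_index, reference_index, distance) tuple per
    query descriptor, using Euclidean distance.
    """
    matches = []
    for qi, q in enumerate(query):
        best_j, best_d = None, math.inf
        for rj, r in enumerate(reference):
            d = math.dist(q, r)  # Euclidean distance, Python 3.8+
            if d < best_d:
                best_j, best_d = rj, d
        matches.append((qi, best_j, best_d))
    return matches


# Toy descriptors, invented for this example.
query = [(0.0, 0.0), (5.0, 5.0)]
reference = [(5.0, 4.0), (0.0, 1.0), (9.0, 9.0)]
matches = match_nearest(query, reference)
```

A brute-force scan is chosen here only for clarity; it compares every query descriptor with every reference descriptor, so larger descriptor sets would typically call for an index structure instead.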
We explore using online learning for selecting the best parameters of Bag of Words systems when searching large scale image collections. We study two algorithms for no regret online learning: Hedge algorithm that works in the full information setting, and Exp3 that works in the bandit setting. We use these algorithms for parameter selection in two scenarios: (a) using a training set to obtain weights...
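The Hedge algorithm named in the abstract above can be sketched in its standard textbook form: each candidate (here, a hypothetical parameter setting) keeps a weight that is multiplied by exp(-eta * loss) after every round, where the loss of every candidate is observed. The function names, the learning rate `eta`, and the toy losses are assumptions for illustration; the abstract does not state how the authors define losses or choose parameters.

```python
import math


def hedge_update(weights, losses, eta=0.5):
    """One Hedge step: scale each weight by exp(-eta * loss), then normalize.

    Losses are assumed to lie in [0, 1]; eta is an assumed learning rate.
    """
    updated = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(updated)
    return [w / total for w in updated]


def run_hedge(loss_rounds, eta=0.5):
    """Run Hedge over a sequence of full-information loss vectors."""
    n = len(loss_rounds[0])
    weights = [1.0 / n] * n  # start from a uniform distribution
    for losses in loss_rounds:
        weights = hedge_update(weights, losses, eta)
    return weights


# Toy losses for three hypothetical parameter settings over ten rounds.
loss_rounds = [[0.9, 0.1, 0.5]] * 10
final_weights = run_hedge(loss_rounds)
best_setting = max(range(len(final_weights)), key=final_weights.__getitem__)
```

The sketch covers only the full-information setting. Exp3, which the abstract pairs with the bandit setting, observes the loss of the chosen candidate alone and is not shown.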
Empirical evaluation of salient object segmentation methods requires i) a dataset of ground truth object segmentations and ii) a performance measure to compare the output of the algorithm with the ground truth. In this paper, we provide such a dataset, and evaluate, both practically and psychophysically, 5 distinct performance measures that have been used in the literature. Our results suggest that a measure...