For robots of the future to interact seamlessly with humans, they must be able to reason about their surroundings and take actions that are appropriate to the situation. Such reasoning is only possible when the robot has knowledge of how the world functions, which must either be learned or hard-coded. In this paper, we propose an approach that exploits language as an important resource of high-level...
Robot vision has become increasingly important in micro aerial vehicle robotics with the availability of small, lightweight hardware. While most approaches rely on external ground stations because of the need for high computational power, we present a fully autonomous setup using only on-board hardware. Our work is based on the continuous homography constraint to recover ego-motion from optical...
In this paper we investigate the effectiveness of SURF features for visual terrain classification for outdoor flying robots. A quadrocopter fitted with a single camera is flown over different terrains to take images of the ground below. Each image is divided into a grid and SURF features are calculated at grid intersections. A classifier is then used to learn to differentiate between different terrain...
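The pipeline in the abstract above (grid the image, describe each grid point, classify by terrain) can be sketched in plain numpy. Everything here is an illustrative assumption: a simple patch statistic (mean, standard deviation, gradient energy) stands in for the SURF descriptor, and a nearest-centroid vote stands in for the paper's classifier.

```python
import numpy as np

def grid_descriptors(image, step=8, patch=8):
    """Compute a simple descriptor (mean, std, gradient energy) at each
    grid intersection -- a hypothetical stand-in for SURF descriptors."""
    descs = []
    h, w = image.shape
    for y in range(patch, h - patch, step):
        for x in range(patch, w - patch, step):
            p = image[y - patch // 2:y + patch // 2,
                      x - patch // 2:x + patch // 2].astype(float)
            gy, gx = np.gradient(p)
            descs.append([p.mean(), p.std(), (gx**2 + gy**2).mean()])
    return np.array(descs)

def train_centroids(images, labels):
    """One descriptor centroid per terrain class over all grid points."""
    feats = {}
    for img, lab in zip(images, labels):
        feats.setdefault(lab, []).append(grid_descriptors(img))
    return {c: np.vstack(d).mean(axis=0) for c, d in feats.items()}

def classify(image, centroids):
    """Label an image by voting its grid descriptors to nearest centroid."""
    votes = {c: 0 for c in centroids}
    for row in grid_descriptors(image):
        best = min(centroids, key=lambda c: np.linalg.norm(row - centroids[c]))
        votes[best] += 1
    return max(votes, key=votes.get)
```

On synthetic textures (e.g. a low-noise "smooth" terrain versus a high-noise "rough" one), the gradient-energy feature alone separates the classes; real terrain imagery is exactly why the paper needs a richer descriptor such as SURF.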
Here we present an approach to estimate the global pose of a vehicle in the face of two distinct problems: first, when using stereo visual odometry for relative motion estimation, a lack of features at close range biases the motion estimate; second, localizing in the global coordinate frame using very infrequent GPS measurements. Solving these problems, we demonstrate a method...
Our work presents solutions to two related, vexing problems in feature-based localization of ground targets in Unmanned Aerial Vehicle (UAV) images: (i) a good initial guess at the pose estimate that would speed up convergence to the final pose estimate for each image frame in a video sequence; and (ii) time-bounded estimation of the position of the ground target. We address both these problems...
This paper presents a hierarchical, two-layer, connectionist-based human-action recognition system (CHARS) as a first step towards developing socially intelligent robots. The first layer is a K-nearest neighbor (K-NN) classifier that categorizes human actions into two classes based on the existence of locomotion, and the second layer consists of two multi-layer recurrent neural networks that distinguish...
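The two-layer structure described above can be sketched as follows. This is a hedged illustration, not the paper's system: the first layer is a genuine K-NN vote, but a nearest-centroid classifier stands in for the paper's recurrent neural networks in the second layer, and the feature vectors and action names are invented.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Layer 1: K-nearest-neighbour majority vote on a feature vector."""
    idx = np.argsort(np.linalg.norm(X_train - x, axis=1))[:k]
    votes = list(y_train[idx])
    return max(set(votes), key=votes.count)

class TwoLayerActionClassifier:
    """Layer 1 (K-NN) decides locomotion vs. non-locomotion; layer 2
    (nearest-centroid here, recurrent nets in the paper) picks the action."""
    def __init__(self, X, coarse, fine, k=3):
        self.X, self.coarse, self.k = X, np.asarray(coarse), k
        fine = np.asarray(fine)
        self.centroids = {}
        for branch in set(coarse):
            mask = self.coarse == branch
            for action in set(fine[mask]):
                sel = mask & (fine == action)
                self.centroids[(branch, action)] = X[sel].mean(axis=0)

    def predict(self, x):
        branch = knn_predict(self.X, self.coarse, x, self.k)
        cands = {a: mu for (b, a), mu in self.centroids.items() if b == branch}
        return min(cands, key=lambda a: np.linalg.norm(x - cands[a]))
```

The hierarchy's appeal is that the coarse locomotion split reduces confusion between visually similar actions that belong to different branches.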
In this paper, we introduce the concept of dense scene flow for visual SLAM applications. Traditional visual SLAM methods assume static features in the environment and that a dominant part of the scene changes only due to camera ego-motion. These assumptions make traditional visual SLAM methods prone to failure in crowded real-world dynamic environments with many independently moving objects, such...
In this work we address the problem of feature extraction for object recognition in the context of cameras providing RGB and depth information (RGB-D data). We consider this problem in a bag of features like setting and propose a new, learned, local feature descriptor for RGB-D images, the convolutional k-means descriptor. The descriptor is based on recent results from the machine learning community...
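The bag-of-features setting with a k-means-learned descriptor can be sketched in numpy. This is a minimal, assumption-laden illustration: plain Lloyd's k-means learns a dictionary from grayscale patches, and an image is encoded as a histogram of nearest-dictionary assignments; the paper's actual descriptor additionally handles RGB-D channels and convolutional pooling.

```python
import numpy as np

def extract_patches(image, size=4, stride=2):
    """Flattened, mean-normalized patches from a single-channel image."""
    h, w = image.shape
    P = np.array([image[y:y + size, x:x + size].ravel()
                  for y in range(0, h - size + 1, stride)
                  for x in range(0, w - size + 1, stride)], dtype=float)
    return P - P.mean(axis=1, keepdims=True)

def kmeans(P, k=8, iters=20, seed=0):
    """Plain Lloyd's k-means to learn a patch dictionary."""
    rng = np.random.default_rng(seed)
    C = P[rng.choice(len(P), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((P[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(assign == j):
                C[j] = P[assign == j].mean(axis=0)
    return C

def encode(image, C):
    """Bag-of-features encoding: normalized histogram of nearest centroids."""
    P = extract_patches(image)
    assign = np.argmin(((P[:, None] - C[None]) ** 2).sum(-1), axis=1)
    hist = np.bincount(assign, minlength=len(C)).astype(float)
    return hist / hist.sum()
```

The resulting fixed-length histogram is what a downstream classifier would consume in a bag-of-features object recognition pipeline.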
We propose a view-based approach for labeling objects in 3D scenes reconstructed from RGB-D (color+depth) videos. We utilize sliding window detectors trained from object views to assign class probabilities to pixels in every RGB-D frame. These probabilities are projected into the reconstructed 3D scene and integrated using a voxel representation. We perform efficient inference on a Markov Random Field...
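The projection-and-integration step above can be sketched as follows, under stated assumptions: each frame already provides 3D points with per-point class probability vectors (the sliding-window detector's output back-projected through depth), and integration is a simple per-voxel product of probabilities (sum of logs) without the paper's MRF smoothing.

```python
import numpy as np
from collections import defaultdict

def integrate(frames, voxel=0.05):
    """Accumulate per-point class log-probabilities into a voxel grid.
    Each frame is (points Nx3, probs NxC); the label per voxel is the
    class with the highest product of probabilities across frames."""
    grid = defaultdict(lambda: None)
    for pts, probs in frames:
        keys = np.floor(pts / voxel).astype(int)
        logp = np.log(np.clip(probs, 1e-9, 1.0))
        for key, lp in zip(map(tuple, keys), logp):
            grid[key] = lp if grid[key] is None else grid[key] + lp
    return {k: int(np.argmax(v)) for k, v in grid.items()}
```

In the paper this per-voxel evidence would then feed a Markov Random Field so that neighbouring voxels agree; the sketch stops at the independent-voxel decision.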
Humans can perform fast and skillful manipulations using various parts of the body by effectively utilizing the dynamics of the targets. Visual sensation is the most important human sense used for such manipulations. Juggling is one such example involving skillful and dynamic manipulations, and visual information is essential for it to be successful. Previously, there have been several studies about...
Autonomous vehicles must be capable of localizing even in GPS denied situations. In this paper, we propose a real-time method to localize a vehicle along a route using visual imagery or range information. Our approach is an implementation of topometric localization, which combines the robustness of topological localization with the geometric accuracy of metric methods. We construct a map by navigating...
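The topological-plus-metric idea can be sketched in a few lines. This is an illustrative assumption, not the paper's estimator: the map is a sequence of appearance descriptors with a stored metric position per node, and a penalty on jumping far from the previous matched node supplies the topological continuity.

```python
import numpy as np

def topometric_localize(map_desc, map_pos, obs, prev_idx=None, alpha=0.1):
    """Match an observation to the closest map node along the route,
    penalizing jumps far from the previous match (topological
    continuity); return the metric position stored at that node."""
    cost = np.linalg.norm(map_desc - obs, axis=1)
    if prev_idx is not None:
        cost = cost + alpha * np.abs(np.arange(len(map_desc)) - prev_idx)
    idx = int(np.argmin(cost))
    return idx, map_pos[idx]
```

The continuity penalty is what resolves perceptual aliasing: when two map nodes look equally similar to the observation, the node nearer the previous estimate wins.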
Most of the existing appearance based topological mapping algorithms produce dense topological maps in which each image stands as a node in the topological graph. Sparser maps can be built by representing groups of visually similar images as nodes of a topological graph. In this paper, we present a sparse topological mapping framework which uses Image Sequence Partitioning (ISP) techniques to group...
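The grouping step that Image Sequence Partitioning performs can be sketched minimally: a new topological node starts whenever the current image's descriptor drifts past a threshold from the node's representative image. The descriptor, threshold, and representative choice here are all illustrative assumptions, not the paper's specific ISP techniques.

```python
import numpy as np

def partition_sequence(descriptors, threshold=0.5):
    """Group consecutive, visually similar images into one topological
    node; start a new node when the descriptor drifts past `threshold`
    from the node's first (representative) image."""
    nodes, current, rep = [], [0], descriptors[0]
    for i in range(1, len(descriptors)):
        if np.linalg.norm(descriptors[i] - rep) <= threshold:
            current.append(i)
        else:
            nodes.append(current)
            current, rep = [i], descriptors[i]
    nodes.append(current)
    return nodes
```

Each returned group becomes one node of the sparse topological graph, instead of one node per image as in dense maps.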
Appearance based maps are emerging as an important class of spatial representations for mobile robots. In this paper we tackle the problem of merging together two or more appearance based maps independently built by robots operating in the same environment. Noticing the lack of well accepted metrics to measure the performance of map merging algorithms, we propose to use algebraic connectivity as a...
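Algebraic connectivity, the metric proposed above, is the second-smallest eigenvalue of the graph Laplacian L = D - A; it is zero exactly when the graph is disconnected, and grows as the graph becomes better connected. A small numpy sketch:

```python
import numpy as np

def algebraic_connectivity(adj):
    """Second-smallest eigenvalue of the graph Laplacian L = D - A.
    Zero iff the (merged) map graph is disconnected; larger values
    indicate a better-connected merge."""
    A = np.asarray(adj, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    return np.sort(np.linalg.eigvalsh(L))[1]
```

For example, a graph of two disjoint edges scores 0, the 3-node path graph scores 1, and the complete 3-node graph scores 3, matching the intuition that a merge producing a more tightly linked map should score higher.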
An RGB-D camera is a sensor which outputs range and color information about objects. Recent technological advances in this area have introduced affordable RGB-D devices in the robotics community. In this paper, we present a real-time technique for 6-DoF camera pose estimation through the incremental registration of RGB-D images. First, a set of edge features are computed from the depth and color images...
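The core of incremental registration between frames is estimating the rigid transform that aligns matched 3D feature points. A hedged sketch of that single step, using the standard closed-form Kabsch/SVD solution (the paper's full pipeline additionally extracts and matches the edge features, which is assumed done here):

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rotation R and translation t with dst ~= R @ src + t
    (Kabsch/SVD), the closed-form step used to register matched 3D
    feature points between consecutive RGB-D frames."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t
```

Chaining these per-frame transforms yields the incremental 6-DoF camera pose; drift accumulation is why full systems also add loop closure or pose-graph optimization.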
We present an approach to simultaneous localization and mapping (SLAM) for RGB-D cameras like the Microsoft Kinect. Our system concurrently estimates the trajectory of a hand-held Kinect and generates a dense 3D model of the environment. We present the key features of our approach and evaluate its performance thoroughly on a recently published dataset, including a large set of sequences of different...
Learning about new objects that a robot sees for the first time is a difficult problem because it is not clear how to define the concept of object in general terms. In this paper we consider as objects those physical entities that are composed of features which move consistently when the robot acts upon them. Among the possible actions that a robot could apply to a hypothetical object, pushing seems...
Place categorization and object recognition are competencies needed by robots to perform a variety of service tasks in the home, such as fetch-and-carry, retrieval, cleaning, meal preparation, and companionship. Context is a powerful cue for place categorization and object recognition; rooms are laid out in a specific fashion to enable comfortable and efficient living, and objects are used within...
Detecting objects in shadows is a challenging task in computer vision. For example, in clear-path detection applications, strong shadows on the road confound the detection of the boundary between the clear path and obstacles, making clear path detection algorithms less robust. Shadow removal relies on the classification of edges as shadow edges or non-shadow edges. We present an algorithm to detect strong...
Early fire detection is crucial to minimise damage and save lives. Video surveillance smoke detectors do not suffer from transport delays and can cover large areas. Smoke detection in images is, however, a difficult problem due to the variability of smoke density, lighting conditions, background clutter, and unstable patterns. In order to solve this problem, we propose a novel unsupervised object...
This article addresses the problem of image-based localization in indoor environments. The localization is achieved by querying a database of omnidirectional images that constitutes a detailed visual map of the building where the robot operates. Omnidirectional cameras have the advantage, when compared to standard perspective cameras, of capturing in a single frame the entire visual content of a room. This,...