Loop closure detection is important in simultaneous localization and mapping (SLAM) systems. In this paper, a Generative Adversarial Network (GAN), an unsupervised deep architecture, is employed to detect loop closures for vision-based SLAM systems. Instead of extracting handcrafted features such as SIFT, SURF or ORB, the Generative Adversarial Network is based on image features. Similar to the task about...
An Autonomous Underwater Vehicle (AUV) has limited energy capacity because it is an embedded system. To overcome this limitation, the AUV can home in on a docking station to recharge its battery. Several studies have been conducted on vision-based AUV docking. In some of the literature, docking fails if the target placed at the docking station is missing from, or disoriented in, the camera view. This...
In this paper, we investigate deep neural networks for blind motion deblurring. Instead of regressing for the motion blur kernel and performing non-blind deblurring outside of the network (as most methods do), we propose a compact and elegant end-to-end deblurring network. Inspired by the data-driven sparse-coding approaches that are capable of capturing linear dependencies in data, we generalize...
This paper presents a real-time vision-based robot teleoperation system that consists of a three-dimensional (3D) vision subsystem and a slave robot, connected over a LAN. The vision subsystem uses an Asus Xtion Pro Live camera to capture 3D data of the operation scene and to determine the position and orientation of a four-ball feature frame held by the operator. Then...
Progress in Multiple Object Tracking (MOT) has been historically limited by the size of the available datasets. We present an efficient framework to annotate trajectories and use it to produce a MOT dataset of unprecedented size. In our novel path supervision the annotator loosely follows the object with the cursor while watching the video, providing a path annotation for each object in the sequence...
As handheld video cameras are now commonplace and available in every smartphone, images and videos can be recorded almost anywhere at any time. However, taking a quick shot frequently yields a blurry result due to unwanted camera shake during recording or moving objects in the scene. Removing these artifacts from the blurry recordings is a highly ill-posed problem as neither the sharp image nor the...
A popular approach to training classifiers of new image classes is to use lower levels of a pre-trained feed-forward neural network and retrain only the top. Thus, most layers simply serve as highly nonlinear feature extractors. While these features were found useful for classifying a variety of scenes and objects, previous work also demonstrated unusual levels of sensitivity to the input especially...
Flow pattern is one of the most important parameters for gas-liquid two-phase flow. In this work, a new flow pattern identification method based on a Convolutional Neural Network (CNN) is presented. A 7-layer CNN structure is chosen, and the parameters of this network are determined from a training set. To verify the feasibility of the method, experiments were carried out in a horizontal pipe with the inner diameter...
Many existing person re-identification (PRID) methods typically attempt to train a faithful global metric offline to cover the enormous visual appearance variations, so as to directly use it online on various probes for identity matching. However, their need for a huge set of positive training pairs is very demanding in practice. In contrast to these methods, this paper advocates a different paradigm:...
Scale recovery is one of the central problems for monocular visual odometry. Normally, the road plane and the camera height are used as references to recover the scale. The performance of these methods depends on plane recognition and on the measurement of the camera height. In this work, we propose a novel method to recover the scale by incorporating the depths estimated from images using deep convolutional...
Neuro-endoscopy is a challenging form of minimally invasive neurosurgery that requires surgical skills to be acquired using training methods different from the existing apprenticeship model. Various training systems have been developed for imparting fundamental technical skills in laparoscopy, whereas few systems exist for neuro-endoscopy. Neuro-Endo-Trainer was a box-trainer developed for endo-nasal transsphenoidal...
Objective: Most trainees begin learning robotic minimally invasive surgery by performing inanimate practice tasks with clinical robots such as the Intuitive Surgical da Vinci. Expert surgeons are commonly asked to evaluate these performances using standardized five-point rating scales, but producing such ratings is time-consuming, tedious, and somewhat subjective. This paper presents an automatic skill...
Person re-identification is a topic which has potential to be used for applications within forensics, flow analysis and queue monitoring. It is the process of matching persons across two or more camera views, most often by extracting colour and texture based hand-crafted features, to identify similar persons. Because of challenges regarding changes in lighting between views, occlusion or even privacy...
With the advent of low-cost RGBD sensors, many solutions have been proposed for the extraction and fusion of colour and depth information. In this paper, we propose several new approaches to fusing these multimodal sources for people detection. We are especially concerned with a scenario where a robot operates in a changing environment. We extend the use of the Faster RCNN framework proposed by Girshick...
When assessing and designing a complex physical protection system for a facility, it is important to consider not only the characteristics inside the fence; the surrounding objects must also be examined and classified. In other words, the risk factors lying outside the fence must be evaluated accurately as part of the risk analysis. The security leader broadens the knowledge...
We propose a neural network architecture for depth map inference from stabilized monocular videos, with application to UAV videos in rigid scenes. Training is based on a novel synthetic dataset for navigation that mimics aerial footage from a gimbal-stabilized monocular camera in rigid scenes. Based on this network, we propose a multi-range architecture for unconstrained UAV flight, leveraging flight...
This paper proposes a monocular vehicle detection method for a forward collision warning system. We use an active-learning framework to train a cascade classifier and apply a two-step vehicle detection. We use five sets of test data to quantify detection performance, analyzing the improvement from the two-stage vehicle detection, the overall detection rate, and the false detection rate. In good lighting conditions, the...
The paper presents an approach to localize human body joints in 3D coordinates based on a single low-resolution depth image. First, a framework to generate a database of 80k realistic depth images from a 3D body model is described. Then the data preprocessing and normalization procedure, and the architectures and training of the DNN and MLP artificial neural networks, are presented. The robustness against camera distance...
To quickly and efficiently analyze a large-scale environment using a camera with a limited field of view, intelligent systems should sequentially select the optimal field of view to observe an important and informative patch of the area. Especially in the image retrieval task, small observations should be sequentially selected to increase the performance of image retrieval and the updated performance can...
The major challenges for optical-based tracking are the lighting conditions, the similarity of the scene, and the position of the camera. This paper demonstrates that under such conditions, the positioning accuracy of Google's Tango platform may deteriorate from fine-grained centimetre level to metre level. The paper proposes a particle-filter-based approach to fuse the WiFi signal and the magnetic...