This paper proposes and implements a convolutional neural network (CNN) that maps images from a camera to an error signal to guide and control an autonomous underwater vehicle into the entrance of a docking station. The paper proposes to use an external positioning system synchronized with the vehicle to obtain a dataset of images matched with the position and orientation of the vehicle. By using...
In this paper, a unified deep convolutional architecture is proposed to address the problems in the person re-identification task. The proposed method adaptively learns the discriminative deep mid-level features of a person and constructs the correspondence features between an image pair in a data-driven manner. Previous Siamese-structure deep learning approaches focus only on pair-wise matching...
Welding is known for its laborious work and the hazardous environment in which it takes place, but it is an important process in many industrial settings, such as the shipbuilding industry. The use of robots has been increasing in recent years, reducing the human intervention necessary for the process. This paper proposes a system for automated seam tracking and a geometric welding bead analysis...
Periodontal diseases are the largest cause of tooth loss among people of all ages and are also correlated with systemic diseases such as endocarditis. Advanced periodontal disease comprises degradation of surrounding tooth structures, severe inflammation and gingival bleeding. Inflammation is an early indicator of periodontal disease. Early detection and preventive measures can help prevent serious...
In supervised machine learning applications, a data set of training and validation features and labels is required to train a neural network. In this paper, we present a remote-controlled, mobile robot and describe software used to generate a data set for vision-based, supervised machine learning applications. We present results from an experiment, which validates the developed platform, and also...
Viewpoint variation is a major challenge in video-based human action recognition. We exploit the simultaneous RGB and Depth sensing of RGB-D cameras to address this problem. Our technique capitalizes on the complementary spatio-temporal information in RGB and Depth frames of the RGB-D videos to achieve viewpoint-invariant action recognition. We extract view-invariant features from the dense trajectories...
To assist the social interaction of deaf and hearing-impaired people, efficient interactive communication tools are needed. With the growing research interest in action and gesture recognition in recent years, many successful applications for sign language recognition combine new types of sensors, including low-cost depth cameras, with advanced machine learning technologies. In this paper, we present...
In recent years, remarkable breakthroughs have been achieved in person re-identification (Re-ID). However, most methods are only tested in the closed-world setting, where the probe person is assumed to be one of the gallery people. In this paper, we tackle a more realistic problem, open-world Re-ID, which requires determining whether the probe person is in the gallery and, if so, who that person is....
Identifying objects in a dynamic scene is one of the main problems in computer vision. It is directly related to solving the recognition problem for dynamic textures. Recognizing dynamic textures has become fundamental to understanding natural video content, and it is a powerful technique for recognizing natural scenes such as fire, waves and smoke. Existing methods suffer from various problems...
This paper proposes an end-to-end learning framework for multiview stereopsis. We term the network SurfaceNet. It takes a set of images and their corresponding camera parameters as input and directly infers the 3D model. The key advantage of the framework is that both photo-consistency and geometric relations of the surface structure can be directly learned for the purpose of multiview stereopsis...
The intensive annotation cost and the rich but unlabeled data contained in videos motivate us to propose an unsupervised video-based person re-identification (re-ID) method. We start from two assumptions: 1) different video tracklets typically contain different persons, given that the tracklets are taken at distinct places or with long intervals; 2) within each tracklet, the frames are mostly of the...
In this work we propose a new CNN+LSTM architecture for camera pose regression for indoor and outdoor scenes. CNNs allow us to learn suitable feature representations for localization that are robust against motion blur and illumination changes. We make use of LSTM units on the CNN output, which play the role of a structured dimensionality reduction on the feature vector, leading to drastic improvements...
While metric learning is important for person re-identification (RE-ID), a significant problem in visual surveillance for cross-view pedestrian matching, existing metric models for RE-ID are mostly based on supervised learning that requires large quantities of labeled samples in all pairs of camera views for training. However, this limits their scalability to realistic applications, in which a large amount...
Low-cost consumer depth cameras and deep learning have enabled reasonable 3D hand pose estimation from single depth images. In this paper, we present an approach that estimates 3D hand pose from regular RGB images. This task has far more ambiguities due to the missing depth information. To this end, we propose a deep network that learns a network-implicit 3D articulation prior. Together with detected...
In this study, a visible-light gaze tracking scheme based on fast iris ellipse fitting is developed for wearable eye trackers. First, after image enhancement pre-processing of eye images, a two-level binarization identifies the iris contour, and the candidate points for ellipse fitting are selected from the binarized iris profile. Next, by fast Random Sample Consensus (RANSAC) ellipse fitting,...
Hand gesture recognition is performed on top-view hand images observed by a Time-of-Flight (ToF) camera in a car. The work attempts to solve two important problems of touchless interaction inside a car: first, low-latency identification of gestures that are unobtrusive for the driver; second, reducing the labelled data required to train learning-based solutions, which is particularly important...
To robustly estimate the pose, classical methods (SfM: Structure from Motion, SLAM: Simultaneous Localization and Mapping) rely on geometrical and temporal assumptions. These approaches take a pair of images as input and establish correspondences based on a global strategy (using the whole image information) or a sparse strategy (using key-point features). These correspondences allow solving a set...
Automatic License Plate Recognition (ALPR) is an important task with many applications in Intelligent Transportation and Surveillance systems. As in other computer vision tasks, Deep Learning (DL) methods have been recently applied in the context of ALPR, focusing on country-specific plates, such as American or European, Chinese, Indian and Korean. However, either they are not a complete DL-ALPR pipeline,...
In this paper we present a skeleton-free Kinect system to estimate the body mass index (BMI) of human bodies. Unlike other systems in the literature, the proposed system does not require a scale to measure the weight. The weight of observed subjects is estimated using body surface area (BSA) regression. The proposed system employs the state-of-the-art deep residual network to extract meaningful features...
This paper presents a neural-network-based approach for the detection of misplaced and missing regions in images. The main objective of this project is to develop an intelligent system that can identify a misplaced or missing region of a tested image. The system can be used to detect misplaced and missing components of printed circuit boards during the manufacturing process. Jigsaw puzzle pieces can...