SIFT flow adopts SIFT descriptor to find correspondence between two images. However, SIFT flow is not robust to scale and rotation for dense corresponding matching. In this paper, we propose moment-based dense correspondence matching which is robust to image variation. First, we apply Zernike moments to SIFT descriptor, i.e. Moments of Gradients (MoG). Then, we combine SIFT flow with MoG for dense...
Robust visual object tracking against occlusions and deformations is still a very challenging task. Existing Convolutional Neural Network (CNN) based trackers either fail to handle these issues or run only at low speed. In this paper, we present a real-time tracker which is robust to occlusions and deformations based on a Region-based, Multi-Scale Fully Convolutional Siamese Network...
A near-light perspective shape from shading (SfS) technique applied to endoscopy for 3D visualization of gastrointestinal tract regions is presented. By utilizing an extensible reflectance model, we study a variational SfS model based on a robust Huber regularization function. A balancing parameter is used for weighting the irradiance and smoothness/regularization terms. Experimental results on different...
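For readers unfamiliar with the Huber function named in the abstract above, here is a minimal sketch of the standard Huber penalty and of how a balancing parameter could weight it against a data term. The `energy` function, its squared data term, and the names `lam` and `delta` are illustrative assumptions; the paper's actual variational model is not given in this excerpt.

```python
def huber(r, delta=1.0):
    """Standard Huber penalty: quadratic near zero, linear in the tails."""
    a = abs(r)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def energy(data_residuals, gradients, lam=0.5, delta=1.0):
    """Hypothetical energy (an assumption, not the paper's model):
    squared data term plus a lam-weighted Huber smoothness term."""
    data = sum(r * r for r in data_residuals)
    smooth = sum(huber(g, delta) for g in gradients)
    return data + lam * smooth
```

Because the penalty grows only linearly for large gradients, sharp depth discontinuities are penalized less than they would be under a purely quadratic regularizer.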
Visual tracking is a very challenging problem in computer vision, as the performance of a tracking algorithm may be degraded by many factors in the scene, such as illumination change, deformation, and background clutter. So far, no algorithm can handle all of these issues. Recently, it has been shown that correlation filters can be implemented efficiently and, with suitable...
Many of the existing methods for learning joint embeddings of images and text use only supervised information from paired images and their textual attributes. Taking advantage of the recent success of unsupervised learning in deep neural networks, we propose an end-to-end learning framework that is able to extract more robust multi-modal representations across domains. The proposed method combines representation...
The power of modern image matching approaches is still fundamentally limited by the abrupt scale changes in images. In this paper, we propose a scale-invariant image matching approach to tackling the very large scale variation of views. Drawing inspiration from the scale space theory, we start with encoding the image’s scale space into a compact multi-scale representation. Then, rather than trying...
Convolutional neural networks (CNNs) provide the current state of the art in visual object classification, but they are far less accurate when classifying partially occluded objects. A straightforward way to improve classification under occlusion conditions is to train the classifier using partially occluded object examples. However, training the network on many combinations of object instances and...
Mobile phones equipped with a monocular camera and an inertial measurement unit (IMU) are ideal platforms for augmented reality (AR) applications, but the lack of direct metric distance measurement and the existence of aggressive motions pose significant challenges on the localization of the AR device. In this work, we propose a tightly-coupled, optimization-based, monocular visual-inertial state...
Visual tracking is a challenging task due to a number of factors, such as occlusions, deformations, illumination variations and abrupt motion changes present in a video sequence. Generally, trackers are robust to some of these factors, but do not achieve satisfactory results when dealing with multiple factors at the same time. More robust results when multiple factors are present can be obtained by...
Invisibility and robustness are two important performance indicators of a watermarking algorithm. To improve the performance of watermarking algorithms, visual models have been introduced, the most classic being the Watson model. However, the original Watson model is weak at resisting amplitude scaling. In this paper, we propose an improvement of the Watson model to overcome its shortcomings...
An application of artificial vision and artificial neural network techniques to face recognition is presented. To that end, a set of images (frontal face photos) with different lighting conditions, gestures, accessories and distances is used. A stepwise algorithm achieves satisfactory results, correctly identifying images inside and outside the data set.
Visual Secret Sharing (VSS) is a type of cryptographic method used to secure digital media such as images by splitting them into n shares. With k or more shares, the secret media can be reconstructed. With fewer than the required number of shares, the shares are useless. The purpose of secret sharing methods is to reinforce the cryptographic approach from different points of failure as...
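The share-and-reconstruct idea described above can be sketched with a simple XOR scheme. Note that this is the n-of-n special case (all shares are required), swapped in for brevity; it is not the k-of-n threshold scheme the abstract refers to, and the function names `split` and `reconstruct` are illustrative.

```python
import secrets


def split(secret, n):
    """Split `secret` (bytes) into n XOR shares; all n are needed to recover it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are uniformly random byte strings.
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    # The last share is the secret XORed with every random share.
    last = secret
    for s in shares:
        last = bytes(a ^ b for a, b in zip(last, s))
    shares.append(last)
    return shares


def reconstruct(shares):
    """XOR all shares together to recover the secret."""
    out = bytes(len(shares[0]))
    for s in shares:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out
```

For example, `reconstruct(split(b"pixels", 4))` returns `b"pixels"`, while any three of the four shares are statistically independent of the secret.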
In this paper, a temporally iterative Gaussian Mixture Model (GMM) of Dynamic Texture (DT) for target detection using a moving PTZ camera is proposed. Camera movement in a PTZ sensor causes motion-based target detection techniques to fail for the periods affected by the scene change. This is because the whole scene is considered a representation of the target motion. When the camera is in motion,...
Human sketches are unique in being able to capture both the spatial topology of a visual object and its subtle appearance details. Fine-grained sketch-based image retrieval (FG-SBIR) leverages such fine-grained characteristics of sketches to conduct instance-level retrieval of photos. Nevertheless, human sketches are often highly abstract and iconic, resulting in severe misalignments...
Instead of using HOG features on cells or blocks, the extraction of HOG features at corner points is proposed for a multiple-object visual tracking system in which single or multiple moving objects can be classified. Background subtraction and corner feature extraction are applied to track and classify the moving objects. Firstly, moving objects will be detected in the form of regions from background...
Visual question answering (VQA) is challenging because it requires a simultaneous understanding of both the visual content of images and the textual content of questions. The approaches used to represent the images and questions in a fine-grained manner and to fuse these multimodal features play key roles in performance. Bilinear pooling based models have been shown to outperform traditional...
Textual-visual matching aims at measuring similarities between sentence descriptions and images. Most existing methods tackle this problem without effectively utilizing identity-level annotations. In this paper, we propose an identity-aware two-stage framework for the textual-visual matching problem. Our stage-1 CNN-LSTM network learns to embed cross-modal features with a novel Cross-Modal Cross-Entropy...
In this paper, we perform staircase detection with a stereo-vision-based algorithm on a NAO robot, using one of its cameras. Robot programming was implemented in Python using ROS. The detection algorithm is divided into two parts: line detection and depth perception. In the line detection process, we use the Hough transform and vanishing point criteria for line segmentation. Respecting depth...
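As a rough illustration of the Hough transform used in the line detection step above, the following pure-Python sketch votes over (theta, rho) line parameters for a set of edge points and returns the strongest line. It is a generic textbook version under simplifying assumptions (1-degree angle bins, integer rho bins, points given directly rather than extracted from an image); the function name `hough_peak` is illustrative and this is not the authors' implementation.

```python
import math


def hough_peak(points, n_theta=180):
    """Return (theta_degrees, rho, votes) for the line with the most votes.

    Each point (x, y) votes for every line rho = x*cos(theta) + y*sin(theta)
    that passes through it, with theta sampled over [0, 180) degrees.
    """
    votes = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(t, rho)] = votes.get((t, rho), 0) + 1
    (t, rho), count = max(votes.items(), key=lambda kv: kv[1])
    return t * 180 / n_theta, rho, count
```

Stair edges show up as several strong peaks in the accumulator; a full pipeline would keep all peaks above a vote threshold instead of only the maximum.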
In this paper, we present a novel approach for training a topological deep neural network with visual impression. We show that by combining a denoising auto-encoder model and a contractive auto-encoder with a Hessian regularization model, we can achieve a deterministic auto-encoder aiming for robustness to small variations of the input. We exploit the tangent propagation algorithm to show how our algorithm...
We explore a new sensor suite to provide precise and robust navigation information, primarily intended for pedestrian localisation. We use an IMU sensor augmented with an array of magnetometers, hereafter called MIMU (for Magneto-Inertial Measurement Unit), and a single central camera as the vision sensor. The MIMU sensor has been shown in previous work to significantly improve the inertial dead-reckoning...