Cast shadows make object detection more difficult because they locally modify image intensity and color. Shadows may appear or disappear in an image when the object, the camera, or both are free to move through a scene. This work evaluates the performance of an object detection method based on boosted HOG paired with three different image representations in outdoor video sequences. We...
This paper presents a novel approach to discovering particular objects from a set of unannotated images. We aim to find discriminative feature sets that can effectively represent particular object classes (as opposed to object categories). We achieve this by mining correlated visual word sets from the bag-of-features model. Specifically, we consider that a visual word set belongs to the same object...
Many previous image processing methods discard the low-frequency components of images to extract illumination invariants for face recognition. However, this approach may distort the processed images and perform poorly under normal lighting. In this paper, a new method is proposed to deal with the illumination problem in face recognition. First, we define a score to denote a relative difference of the...
Manifold models for nonlinear dimensionality reduction provide useful low-dimensional representations of high-dimensional data. Most manifold models are unsupervised algorithms and map the entire data onto a single manifold. Heterogeneous data with multiple classes are often better modeled by multiple manifolds rather than by a single global manifold, but there is no explicit way to compare instances...
In this paper, we derive a novel multilinear relationship for close light sources and cameras. In this multilinear relationship, image intensities and image point coordinates can be handled in a single framework. We first derive a linear representation of image intensity taken under a general close light source. We next analyze multiple view geometry among close light sources and cameras, and derive...
In this paper we address the problem of relighting faces in the presence of cast shadows and specularities. We present a solution to this problem by capturing the spatially varying Apparent Bidirectional Reflectance Functions (ABRDF) fields of human faces using Spline Modulated Spherical Harmonics and representing them using a few salient spherical functions called Eigenbubbles. Through extensive experiments...
In this paper, a diffusion-based iterative algorithm is proposed for illumination-invariant face representation using selective image smoothing in the DCT domain. Specifically, we split the image I into three parts, I = (R + w) + L: an illumination-invariant component R, an oscillating component w, and a smooth component L. At each iteration, the influence of different frequency sub-bands of the image is determined and the...
We developed a two-shot 6-band image capturing system consisting of a large-format camera, a customized interference filter, and a scanning digital back to capture 185-megapixel images. The interference filter is set in front of the camera lens to obtain a 6-band image, that is, two 3-band images, one taken with the filter and the other without it. After correction of optical aberrations caused by...
In this paper we show that the features generated by the recently presented Invariant Features of Local Textures (IFLT) technique can be used in a SIFT-like framework to deliver real-time point-wise image matching with performance comparable to existing state-of-the-art image matching systems. The proposed framework is also capable of saving a considerable amount of computation time.
In this work, we propose a framework for classifying structured human behavior in complex real environments, where problems such as frequent illumination changes and heavy occlusions are expected. Since target recognition and tracking can be very challenging, we bypass these problems by employing an approach similar to Motion History Images for feature extraction. Furthermore, to tackle outliers residing...
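As a rough illustration of the Motion History Image idea the abstract refers to, the sketch below applies the textbook update rule: a pixel with motion is set to a fixed duration, and every other pixel decays by one per frame. This is the standard formulation, not necessarily the variant used in the paper; the function name `update_mhi`, the duration value, and the toy motion masks are all assumptions made for illustration.

```python
def update_mhi(mhi, motion_mask, duration):
    """Standard Motion History Image update (illustrative sketch).

    Pixels with motion are reset to `duration`; all others decay by 1,
    clamped at 0. Images are plain lists of rows.
    """
    return [
        [duration if moved else max(0, history - 1)
         for history, moved in zip(history_row, mask_row)]
        for history_row, mask_row in zip(mhi, motion_mask)
    ]


# Hypothetical 1x4 scene: a single moving pixel travels left to right.
masks = [
    [[1, 0, 0, 0]],
    [[0, 1, 0, 0]],
    [[0, 0, 1, 0]],
]

mhi = [[0, 0, 0, 0]]
for mask in masks:
    mhi = update_mhi(mhi, mask, duration=3)

# Brighter values mark more recent motion, so the row encodes direction.
print(mhi)
```

The appeal of this representation is that a single image summarizes where and how recently motion occurred, without tracking any individual target.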
The sparse representation technique has provided a new way of looking at object recognition. As we demonstrate in this paper, however, the mean-squared error (MSE) measure, which is at the heart of this technique, is not a very robust measure when it comes to comparing facial images, which differ significantly in luminance values, as it only performs pixel-by-pixel comparisons. This requires a significantly...
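The sensitivity of MSE to luminance can be seen in a small sketch. The 3x3 "images" below are made-up toy data, and `mse` is a plain pixel-by-pixel implementation written for this example, not code from the paper.

```python
def mse(a, b):
    """Mean-squared error between two equally sized grayscale images."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += (pixel_a - pixel_b) ** 2
            count += 1
    return total / count


# Toy data (assumed for illustration only).
image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
# Same structure, uniformly brighter by 60 gray levels.
brighter = [[p + 60 for p in row] for row in image]
# Different structure (pixel order reversed), same mean luminance.
different = [[90, 80, 70], [60, 50, 40], [30, 20, 10]]

mse_brighter = mse(image, brighter)
mse_different = mse(image, different)
print(mse_brighter, mse_different)
```

In this toy case the brightened copy of the same image scores a larger error than the structurally different image, which is the kind of failure a pixel-by-pixel measure invites when luminance varies.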
Most existing performance evaluation methods concentrate on defining separate metrics over a wide range of conditions and generating standard benchmarking video sequences for examining the effectiveness of video tracking systems. In other words, these methods attempt to design a robustness margin or factor for the system. These methods are deterministic in which a robustness factor, for example, 2...
In this paper the problem of eye detection across three different bands, i.e., the visible, multispectral, and short wave infrared (SWIR), is studied in order to illustrate the advantages and limitations of multi-band eye localization. The contributions of this work are two-fold. First, a multi-band database of 30 subjects is assembled and used to illustrate the challenges associated with the problem...
An endoscope is a medical instrument that acquires images inside the human body. An endoscope carries its own light source. Classic shape-from-shading can be used to recover the 3-D shape of objects in view. Recent implementations have used the Fast Marching Method (FMM). Previous FMM approaches recover 3-D shape under assumptions of parallel light source illumination and orthographic projection....
The Local Binary Pattern (LBP) operator is a computationally efficient yet powerful feature for analyzing local texture structures. While the LBP operator has been successfully applied to tasks as diverse as texture classification, texture segmentation, face recognition and facial expression recognition, it has rarely been used in the domain of Visual Object Classes (VOC) recognition mainly...
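For readers unfamiliar with the operator, here is a minimal sketch of the basic 3x3 LBP: each of the eight neighbours is compared with the centre pixel, and the comparison results are packed into an 8-bit code. The bit ordering (clockwise from the top-left neighbour) and the `>=` comparison are conventions chosen for this example; the names `lbp_code` and `lbp_histogram` are hypothetical, and the paper's own variant may differ.

```python
# Neighbour offsets, clockwise from the top-left (an arbitrary convention).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic 3x3 LBP code of the pixel at (r, c); img is a list of rows."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# Toy patch (assumed data).
patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
# A monotonic gray-level change keeps every comparison, hence the code.
rescaled = [[2 * p + 3 for p in row] for row in patch]

print(lbp_code(patch, 1, 1), lbp_code(rescaled, 1, 1))
```

Because only the sign of each comparison matters, the code is unchanged by any monotonically increasing gray-level transformation, which is one reason the operator is both cheap and robust.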
Video scene text contains semantic information and thus can contribute significantly to video indexing and summarization. However, most previous approaches to detecting scene text in videos have difficulty handling text with varying character sizes and text alignments. In this paper, we propose a novel algorithm for scene text detection and localization in video. Based on our observation...
This paper proposes an automatic and robust method to detect and recognize abandoned objects for video surveillance systems. Two Gaussian Mixture Models (long-term and short-term models) in the RGB color space are constructed to obtain two binary foreground masks. By refining the foreground masks with the Radial Reach Filter (RRF) method, the influence of illumination changes is greatly reduced...
Moving cast shadow removal is an important yet difficult problem in video analysis and applications. This paper presents a novel algorithm for the detection of moving cast shadows that is based on a local texture descriptor called Scale Invariant Local Ternary Pattern (SILTP). An assumption is made that the texture properties of cast shadows bear similar patterns to those of the background beneath them...
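The assumption that a shadow preserves the texture of the background beneath it can be illustrated with a sketch of a ternary pattern whose thresholds scale with the centre pixel. The encoding below follows a common description of SILTP (two bits per neighbour, tolerance band of plus or minus `tau` times the centre value), but the exact bit layout, the 8-neighbourhood, the value `tau=0.05`, and the toy patches are assumptions for illustration, not details taken from this abstract.

```python
# Neighbour offsets, clockwise from the top-left (an arbitrary convention).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def siltp_code(img, r, c, tau=0.05):
    """Scale-invariant ternary code of pixel (r, c), two bits per neighbour.

    01: neighbour clearly brighter than the centre,
    10: neighbour clearly darker,
    00: neighbour within the tolerance band around the centre.
    """
    center = img[r][c]
    upper = (1 + tau) * center
    lower = (1 - tau) * center
    code = 0
    for dr, dc in OFFSETS:
        p = img[r + dr][c + dc]
        if p > upper:
            bits = 0b01
        elif p < lower:
            bits = 0b10
        else:
            bits = 0b00
        code = (code << 2) | bits
    return code


# Toy background patch (assumed data).
background = [[100, 120, 100],
              [90, 100, 104],
              [100, 80, 100]]
# A cast shadow modelled crudely as a uniform darkening by a factor of 0.5.
shadowed = [[0.5 * p for p in row] for row in background]
# A foreground object with a different local texture.
foreground = [[40, 40, 200],
              [40, 100, 200],
              [40, 40, 200]]

print(siltp_code(background, 1, 1),
      siltp_code(shadowed, 1, 1),
      siltp_code(foreground, 1, 1))
```

Since both the pixel values and the thresholds scale together, a uniformly darkened patch yields the same code as the background, while a genuinely different texture does not. Real shadows are not perfectly uniform, so this is only the idealized case.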
We propose a region-based foreground object segmentation method capable of dealing with image sequences containing noise, illumination variations and dynamic backgrounds (as often present in outdoor environments). The method utilises contextual spatial information through analysing each frame on an overlapping block-by-block basis and obtaining a low-dimensional texture descriptor for each block....
A common computer vision task is navigation and mapping. Many indoor navigation tasks require depth knowledge of flat, unstructured surfaces (walls, floor, ceiling). With passive illumination only, this is an ill-posed problem. Inspired by small children using a torchlight, we use a spotlight for active illumination. Using our torchlight approach, depth and orientation estimation of unstructured,...