Object recognition based on local features computed at multiple locations is robust to occlusions, strong viewpoint changes and object deformations. These features should be repeatable, precise and distinctive. We present an operator for repeatable feature detection on depth images (relative to 3D models) as well as 2D intensity images. The proposed detector is based on estimating the curviness saliency...
The filters learned by Convolutional Neural Networks (CNNs) from different image datasets often appear similar. This similarity is frequently exploited for transfer learning. It is also used as an initialization technique for different tasks on the same dataset or for the same task on similar datasets. Off-the-shelf CNN features have capitalized on this idea to promote...
Human detection in RGB-D images is an important yet very challenging task in computer vision. In this paper, we propose a novel human detection approach in RGB-D images, which integrates ROI (region-of-interest) generation, depth-size relationship estimation and a human detector. Our approach has the following advantages: 1) ROI generation and depth-size relationship estimation take full advantage...
In this paper, we propose an accelerated local feature extraction method based on a reuse scheme for action recognition. Most local features of the previous frame can be reused because successive frames are highly correlated, so feature extraction needs to be applied to only part of the current frame. The full-frame features of the current frame are assembled from features extracted at different times...
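The reuse idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the recomputed part of the frame is selected, so everything here is an assumption made for illustration: the frame is split into square blocks, a block is recomputed only when its mean absolute pixel difference from the previous frame exceeds a threshold, and `update_features`, `block_pixels`, `size`, and `threshold` are hypothetical names.

```python
from typing import Callable, Dict, List, Optional, Tuple

Frame = List[List[float]]   # grayscale frame as rows of pixel values
Block = Tuple[int, int]     # (row, col) of a block's top-left corner


def block_pixels(frame: Frame, r0: int, c0: int, size: int) -> List[float]:
    """Flatten the size x size block whose top-left corner is (r0, c0)."""
    return [v for row in frame[r0:r0 + size] for v in row[c0:c0 + size]]


def update_features(
    prev: Optional[Frame],
    curr: Frame,
    cache: Dict[Block, float],
    extract: Callable[[List[float]], float],
    size: int = 2,
    threshold: float = 1.0,
) -> Tuple[Dict[Block, float], List[Block]]:
    """Build full-frame features, recomputing only blocks that changed.

    Assumption (not from the abstract): a block counts as changed when its
    mean absolute difference from the previous frame exceeds `threshold`.
    """
    features: Dict[Block, float] = {}
    recomputed: List[Block] = []
    for r0 in range(0, len(curr), size):
        for c0 in range(0, len(curr[0]), size):
            key = (r0, c0)
            now = block_pixels(curr, r0, c0, size)
            changed = True  # no history for this block: must extract
            if prev is not None and key in cache:
                before = block_pixels(prev, r0, c0, size)
                diff = sum(abs(a - b) for a, b in zip(now, before)) / len(now)
                changed = diff > threshold
            if changed:
                features[key] = extract(now)
                recomputed.append(key)
            else:
                features[key] = cache[key]  # reuse the earlier feature
    return features, recomputed


# Usage: the first frame is fully extracted; the second touches one block.
frame0 = [[0.0] * 4 for _ in range(4)]
frame1 = [row[:] for row in frame0]
for r in range(2):
    for c in range(2):
        frame1[r][c] = 10.0

feats0, done0 = update_features(None, frame0, {}, sum)
feats1, done1 = update_features(frame0, frame1, feats0, sum)
```

In this toy run only the top-left block is re-extracted for the second frame; the other three block features come from the cache, so the resulting feature map mixes values computed at different times.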
We present a novel descriptor algorithm (DUDE) using line/point duality and a randomization strategy that provides simple but robust, consistent feature extraction and correspondence. Using duality enables us to effectively capture a distribution of line segments, and the proposed randomization strategy improves repeatability over existing techniques by generating more line features in common between...
In this paper, we present a method for affine invariant feature description. Based on the gradient distribution of an image region we calculate two basis vectors defining an affine invariant coordinate system, used to normalize the image region. The estimated basis vectors are non-orthogonal and allow for a precise representation of the gradient distribution. The proposed method can be combined with...
This paper introduces a method to guide the visual search towards a searched object, analogously to what is performed by the top-down visual attention mechanism. This is done by prioritizing scene descriptors based on their Hamming distance to the descriptors of the target. The proposal has constant space and time complexity in relation to the number of descriptors of the searched object. Moreover,...
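As a rough illustration of ranking scene descriptors by Hamming distance to a target, the sketch below uses a naive baseline: each scene descriptor is scored by its minimum distance to any target descriptor. This is not the paper's method, and it does not have the constant space and time complexity the abstract claims, since its cost grows with the number of target descriptors. Binary descriptors are assumed to be stored as Python integers, and `hamming` and `prioritize` are hypothetical names.

```python
from typing import List, Sequence


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def prioritize(scene: Sequence[int], target: Sequence[int]) -> List[int]:
    """Return scene descriptor indices, most target-like first.

    Naive baseline: score = minimum Hamming distance to any target
    descriptor. Ties keep their original order (sorted() is stable).
    """
    scores = [min(hamming(s, t) for t in target) for s in scene]
    return sorted(range(len(scene)), key=lambda i: scores[i])


# Usage: three 8-bit scene descriptors and one target descriptor.
scene = [0b11110000, 0b00000001, 0b00000011]
target = [0b00000011]
order = prioritize(scene, target)
```

A search procedure could then visit scene locations in the returned order, so that regions whose descriptors resemble the target are examined first.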
In this work, we present a new multiple-channel feature called Deep Compact Channel Feature (DCCF), which generates a compact, discriminative feature representation using a pre-trained deep encoder-decoder. By combining DCCF with boosted decision trees, we propose a new object detector that achieves outstanding performance on the standard pedestrian datasets INRIA and Caltech. Furthermore, a large...
Localizing heavily occluded human faces is a challenging problem in face detection. Previous methods mainly employ sliding windows and determine whether each window contains a human face. In this paper, we provide a novel segmentation-based perspective for heavily occluded face localization with deep convolutional neural networks (CNNs). Our model takes an image as input without complicated pre-processing...
Detecting eyes in images is fundamental for many computer vision applications including face detection, face recognition, and human-computer interaction. Most existing methods are designed and tested on datasets acquired under controlled lab settings (e.g., fixed scale, known poses, clean background, etc.), leaving their performance to be further examined on real-world, uncontrolled images, such as...
In still images, multi-scale regions contain rich information at different granularities. However, only semantically meaningful regions provide auxiliary cues for action recognition. Moreover, regions at different scales contribute differently. Motivated by these two observations, we propose an approach that is composed of three components: 1) detecting semantic region candidates at multiple scales, 2)...
Mammogram images are now increasingly acquired with full-field digital mammography (FFDM) systems in clinics. Traditionally, the “for-processing” format of FFDM images is used in computer-aided diagnosis (CAD) of breast cancer. In this study, we investigate the feasibility of using the “for-presentation” format of FFDM images (which are more readily available) in the development of CAD algorithms for microcalcification...
In computer-aided diagnosis of clustered microcalcifications (MCs), the individual MCs in a lesion need to be first detected prior to subsequent classification as being benign or malignant. However, owing to noise characteristics and patient variability, the detection accuracy is often adversely compromised by the occurrence of false-positives (FPs) or missed MCs in detection. To deal with difficulty,...
For tracking objects in videos, a recent and prominent approach that combines tracking and detection methods is the TLD framework. The detector identifies the object by its supposedly confirmed appearances. The tracker inserts new appearances into the model using apparent motion. Their outcomes are integrated by using the same similarity metric of the detector which, in our point of...
This paper presents an approach for visual tracking that consists of two combined modules: a global detector and local image patch matching. The former gives the classification response for each object candidate specified by the sliding window in the search region. The classification can be performed by any global detector, which is based on the feature from the local patch in the object...
In computer vision, object detection is regarded as one of the most challenging problems because it is prone to localization and classification errors. The current best-performing detectors are based on finding region proposals in order to localize objects. Despite their very good performance, these techniques are computationally expensive due to the large number of proposed regions...
A two-stage car detection method using deformable part models with composite feature sets (DPM/CF) is proposed to recognize cars of various types and from multiple viewing angles. In the first stage, a HOG template is matched to detect the bounding box of the entire car of a certain type and viewed from a certain angle (called a t/a pair), which yields a region of interest (ROI). In the second stage,...
This paper tackles the problem of bird detection in large landscape images for applications in the wind energy industry. While significant progress in image recognition has been made by deep convolutional neural networks (CNNs), small object detection remains a problem. To solve it, we follow the idea that a detector can be tuned to small objects of interest and semantic segmentation methods can be...
In this paper, we propose a principled framework for pornographic image recognition. Specifically, we present our definition of pornographic images, which characterizes the pornographic contents in images as the exposure of private body parts. As the private body parts often lie in local image regions, we model each image as a bag of local image patches (instances), and assume that for each pornographic...