Existing maximum-margin support vector machines (SVMs) generate a hyperplane which produces the clearest separation between positive and negative feature vectors. These SVMs are effective when datasets are large. However, when few training samples are available, the hyperplane is easily influenced by outliers that are geometrically located in the opposite class. We propose a modified SVM which weights...
This work proposes a novel person re-identification method based on Hierarchical Bipartite Graph Matching. Because the human eye first observes a person's appearance roughly and then gradually attends to the details, our method abstracts a person image from coarse to fine granularity, and finally into a three-layer tree structure. Then, three bipartite graph matching methods are proposed for the matching...
Matching specific persons across scenes, known as person re-identification, is an important yet unsolved computer vision problem. Feature representation and metric learning are two fundamental factors in person re-identification. However, current person re-identification methods, which use a single handcrafted feature with a corresponding metric, may not be powerful enough when facing illumination,...
Associating groups of people across non-overlapping camera views is an important but unsolved problem. Compared with the similar person re-identification task, group re-identification introduces new challenges, such as significant deformation in uncontrolled directions and severe intra-group occlusions. In this paper, we propose a novel patch-matching-based framework for group re-identification...
This paper presents a method for dataset manipulation based on Mixed Integer Linear Programming (MILP). The proposed optimization can narrow down a dataset to a particular size, while enforcing specific distributions across different dimensions. It essentially leverages the redundancies of an initial dataset in order to generate more compact versions of it, with a specific target distribution across...
Traditional methods for motion estimation estimate the motion field F between a pair of images as the one that minimizes a predesigned cost function. In this paper, we propose a direct method: we train a Convolutional Neural Network (CNN) that, given a pair of images as input at test time, produces a dense motion field F at its output layer. In the absence of large datasets with ground...
The main purpose of transfer learning is to resolve the problem of differing data distributions, which generally arises when the training samples of the source domain differ from the training samples of the target domain. Prediction of salient areas in natural video suffers from the lack of large video benchmarks with human gaze fixations. Different databases provide only from dozens up to one or two hundred of...
Document is unavailable: This DOI was registered to an article that was not presented by the author(s) at this conference. As per section 8.2.1.B.13 of IEEE's "Publication Services and Products Board Operations Manual," IEEE has chosen to exclude this article from distribution. We regret any inconvenience.
We consider the use of transfer learning, via deep Convolutional Neural Networks (CNN), for the image classification problem posed within the context of X-ray baggage security screening. A deep multi-layer CNN approach traditionally requires large amounts of training data in order to facilitate construction of a complex complete end-to-end feature extraction, representation...
Research in neuroscience and biological vision has shown that bio-inspired methods, such as salient detection, artificial neural networks, and ganglion-cell-inspired image features, achieve excellent recognition performance. In this paper, we introduce a novel framework for scene classification using category-specific salient regions (CSSR) with deep CNN features, called Deep-CSSR. Firstly,...
We propose a supervised approach to the classification and segmentation of material regions in hyperspectral imagery. Our algorithm is a two-stage process, combining a pixelwise classification step with a segmentation step aiming to minimise the total perimeters of the resulting regions. Our algorithm is distinctive in its ability to ensure label consistency within local homogeneous areas and to generate...
Distributed object recognition is a fast-growing research area, mainly motivated by the emergence of high-performance cameras and their integration with modern wireless sensor network technologies. In wireless distributed object recognition, the bandwidth is limited and it is desirable to avoid transmitting redundant visual features from multiple cameras to the base station. In this...
This method introduces an efficient way of learning action categories without the need for feature estimation. The approach starts from low-level values, in a similar style to the successful CNN methods. However, rather than extracting general image features, we learn to predict specific video representations from raw video data. The benefit of such an approach is that at the same computational...
In this paper we propose an online multi-task learning algorithm for video concept detection. In particular, we extend the Efficient Lifelong Learning Algorithm (ELLA) in the following ways: a) we solve the objective function of ELLA using quadratic programming instead of solving the Lasso problem, b) we add a new label-based constraint that considers concept correlations, c) we use linear SVMs as...
Density-estimation-based visual object counting (DE-VOC) methods estimate the number of objects in an image by integrating over its predicted density map. They are effective but computationally inefficient. This paper proposes a fast DE-VOC method that maintains this effectiveness. Essentially, the feature space of image patches from VOC can be clustered into subspaces, and the examples of each subspace can be collected...
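The counting step described in this abstract, integrating over a predicted density map, can be illustrated with a minimal sketch. It assumes the density map is already available as a 2D grid of non-negative values in which each object contributes a total mass of 1; the function name `count_from_density` and the toy data are hypothetical and not taken from the paper, and the sketch does not cover how the map is predicted.

```python
def count_from_density(density_map):
    """Estimate the object count as the discrete integral (sum) of a density map.

    density_map: list of rows, each a list of non-negative floats.
    Assumption: every object contributes a total mass of 1.0 to the map.
    """
    return sum(sum(row) for row in density_map)


# Toy 3x4 density map (hypothetical values): one object spread over four
# cells in the upper middle, a second object spread over two cells below.
density = [
    [0.0, 0.25, 0.25, 0.0],
    [0.0, 0.25, 0.25, 0.0],
    [0.5, 0.5, 0.0, 0.0],
]

estimated_count = count_from_density(density)
print(estimated_count)
```

Because the count is a plain sum, the cost of this step is linear in the number of pixels; the expensive part of such methods is typically the density prediction itself.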
For tracking objects in videos, the TLD framework is a recent and notable approach that combines tracking and detection methods. The detector identifies the object by its supposedly confirmed appearances. The tracker inserts new appearances into the model using apparent motion. Their outcomes are integrated using the same similarity metric as the detector, which, in our point of...
Computationally transcribing historical document images to digital text often requires an initial, labor-intensive recording of ground truth by language experts to provide the OCR system with training text. This paper presents a framework for the automatic generation of training data, provided only with labeled character images and a digital font, thus removing the need for manually generated text...
Color constancy is the ability of the human visual system to perceive constant colors for a surface despite changes in the spectrum of the illumination. In computer vision, the main approach consists of estimating the illuminant color and then removing its impact on the color of the objects. Many image processing algorithms have been proposed to tackle this problem automatically. However, most of...
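The two-step approach mentioned in this abstract (estimate the illuminant color, then remove its effect) can be sketched with the classic gray-world assumption followed by a per-channel (von Kries style) gain correction. This is a generic textbook illustration, not the method of the paper; the function names and the toy pixel values are hypothetical.

```python
def gray_world_illuminant(pixels):
    """Estimate the illuminant as the per-channel mean of all pixels.

    Gray-world assumption: the average reflectance of a scene is achromatic,
    so any color cast in the mean is attributed to the illuminant.
    """
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def remove_illuminant(pixels, illuminant):
    """Scale each channel so that the estimated illuminant becomes neutral gray."""
    gray = sum(illuminant) / 3.0
    # Guard against division by zero for an all-black channel.
    gains = tuple(gray / max(ch, 1e-12) for ch in illuminant)
    return [tuple(p[c] * gains[c] for c in range(3)) for p in pixels]


# Toy RGB pixels in [0, 1] with a reddish cast (hypothetical values).
pixels = [(0.8, 0.4, 0.2), (0.4, 0.2, 0.1), (0.6, 0.3, 0.15)]

illuminant = gray_world_illuminant(pixels)
corrected = remove_illuminant(pixels, illuminant)

# After correction the three channel means coincide (the cast is removed).
corrected_means = gray_world_illuminant(corrected)
print(illuminant, corrected_means)
```

Gray-world is only one of many possible illuminant estimators; it fails on scenes dominated by a single color, which is one reason more elaborate methods exist.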
Estimating the number of vehicles present in traffic video sequences is a common task in applications such as active traffic management and automated route planning. There exist several vehicle counting methods such as Particle Filtering or Headlight Detection, among others. Although Principal Component Pursuit (PCP) is considered to be the state-of-the-art for video background modeling, it has not...
A two-stage car detection method using deformable part models with composite feature sets (DPM/CF) is proposed to recognize cars of various types and from multiple viewing angles. In the first stage, a HOG template is matched to detect the bounding box of the entire car of a certain type and viewed from a certain angle (called a t/a pair), which yields a region of interest (ROI). In the second stage,...