Most learning-based video semantic analysis methods require a large training set to achieve good performance. However, annotating a large video is labor-intensive. This paper introduces how to construct the training set and reduce user involvement. Four selection schemes are proposed: clustering-based, spatial dispersiveness, temporal dispersiveness, and sample-based, which can be used to construct...
We present a methodology for learning novel human activities incrementally. In many real-world scenarios (e.g. YouTube), new videos of novel activities are provided additively, and the system must incrementally adjust its activity models rather than retraining the entire system after each addition. We introduce our incremental codebook learning algorithm for an efficient mining of important visual...
Modern machine learning techniques provide robust approaches for data-driven modeling and critical information extraction, while human experts hold the advantage of possessing high-level intelligence and domain-specific expertise. We combine the power of the two for anomaly detection in GPS data by integrating them through a visualization and human-computer interaction interface. In this paper we...
Randomized learning methods (i.e., Forests or Ferns) have shown excellent capabilities for various computer vision applications. However, it was shown that the tree structure in Forests can be replaced by even simpler structures, e.g., Random Naive Bayes classifiers, yielding similar performance. The goal of this paper is to benefit from these findings to develop an efficient on-line learner. Based...
In image categorization the goal is to decide if an image belongs to a certain category or not. A binary classifier can be learned from manually labeled images; while using more labeled examples improves performance, obtaining the image labels is a time consuming process. We are interested in how other sources of information can aid the learning process given a fixed amount of labeled images. In particular,...
We explore using online learning for selecting the best parameters of Bag of Words systems when searching large-scale image collections. We study two algorithms for no-regret online learning: the Hedge algorithm, which works in the full-information setting, and Exp3, which works in the bandit setting. We use these algorithms for parameter selection in two scenarios: (a) using a training set to obtain weights...
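As a rough illustration of the full-information setting mentioned above, the following is a minimal sketch of the standard Hedge (multiplicative weights) update. It is not taken from the paper: the function name `hedge`, the learning rate `eta`, and the toy loss values are assumptions chosen for the example, and each "option" stands in for one candidate parameter setting whose loss is observed every round.

```python
import math


def hedge(loss_rounds, eta=0.5):
    """Run the standard Hedge update over a sequence of loss vectors.

    loss_rounds: list of rounds; each round is a list with one loss in [0, 1]
                 per option (full information: every option's loss is seen).
    eta:         learning rate (hypothetical default, not from the paper).
    Returns the final probability distribution over the options.
    """
    n_options = len(loss_rounds[0])
    weights = [1.0] * n_options  # start from uniform weights
    for losses in loss_rounds:
        # Multiplicative update: options with higher loss are down-weighted.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(weights)
    return [w / total for w in weights]


# Toy usage: option 1 never incurs loss, so it should end up preferred.
probs = hedge([[1.0, 0.0, 1.0]] * 10)
best_option = probs.index(max(probs))
```

In the bandit setting handled by Exp3, only the loss of the option actually played would be observed each round, so the update above would need an importance-weighted loss estimate instead.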
Video data is becoming increasingly important in many commercial and scientific areas with the advent of applications such as digital broadcasting, video-conferencing and multimedia processing tools, and with the development of the hardware and communications infrastructure necessary to support visual applications. The objective of this work is to propose a method for event detection in a video stream...
Many state-of-the-art object recognition systems rely on identifying the location of objects in images, in order to better learn their visual attributes. In this paper, we propose four simple yet powerful hybrid ROI detection methods (combining both local and global features), based on frequently occurring keypoints. We show that our methods demonstrate competitive performance in two different types...
This paper proposes an efficient approach for object classification. The method is based on the bag-of-features classification framework and extends its limits. It applies a modified spatial PACT as the local feature descriptor, which can efficiently capture the characteristics of image patches. To address the speed bottleneck of codebook creation, an extremely randomized clustering forest is used to create...
Our objective is to obtain a state-of-the-art object category detector by employing a state-of-the-art image classifier to search for the object in all possible image sub-windows. We use the multiple kernel learning of Varma and Ray (ICCV 2007) to learn an optimal combination of exponential χ² kernels, each of which captures a different feature channel. Our features include the distribution of edges,...
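To make the idea of combining per-channel exponential χ² kernels concrete, here is a small sketch under stated assumptions. It uses one common definition of the kernel, exp(-γ Σ (x_i - y_i)² / (x_i + y_i)), and a fixed non-negative weighted sum; the function names, the value of `gamma`, and the channel weights are hypothetical. In the paper the combination is learned by multiple kernel learning, which this sketch does not implement.

```python
import math


def chi2_distance(x, y):
    """Chi-squared distance between two non-negative histograms."""
    # Bins where both histograms are empty contribute nothing.
    return sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)


def exp_chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-squared kernel; gamma is an assumed bandwidth."""
    return math.exp(-gamma * chi2_distance(x, y))


def combined_kernel(channels_x, channels_y, weights, gamma=1.0):
    """Weighted sum of per-channel kernels (weights are fixed here,
    whereas multiple kernel learning would learn them from data)."""
    return sum(
        w * exp_chi2_kernel(x, y, gamma)
        for w, x, y in zip(weights, channels_x, channels_y)
    )


# Toy usage with two hypothetical feature channels per image.
image_a = [[0.5, 0.5], [0.2, 0.8]]
image_b = [[0.4, 0.6], [0.2, 0.8]]
similarity = combined_kernel(image_a, image_b, weights=[0.3, 0.7])
```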
Measuring image similarity is a central topic in computer vision. In this paper, we learn similarity from Flickr groups and use it to organize photos. Two images are similar if they are likely to belong to the same Flickr groups. Our approach is enabled by a fast Stochastic Intersection Kernel MAchine (SIKMA) training algorithm, which we propose. This proposed training method will be useful for many...
Combining multiple information sources can improve the accuracy of search in information retrieval. This paper presents a new image search strategy which combines image features together with implicit feedback from users' eye movements, using them to rank images. In order to better deal with larger data sets, we present a perceptron formulation of the Ranking Support Vector Machine algorithm. We present...
We describe an algorithm for similar-image search which is designed to be efficient for extremely large collections of images. For each query, a small response set is selected by a fast prefilter, after which a more accurate ranker may be applied to each image in the response set. We consider a class of prefilters comprising disjunctions of conjunctions ("ORs of ANDs") of Boolean features...
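The "ORs of ANDs" structure can be illustrated with a short sketch. It only shows how such a prefilter could be evaluated once it is given; how the paper derives the filter for a query is not covered by the abstract. The representation (each image as the set of Boolean feature ids that are true for it) and the names `passes_prefilter` and `prefilter` are assumptions made for this example.

```python
def passes_prefilter(features, disjuncts):
    """Return True if any conjunction is fully satisfied by the image.

    features:  set of feature ids that are true for one image.
    disjuncts: list of sets; each set is a conjunction (AND) of feature ids,
               and the list as a whole is their disjunction (OR).
    """
    return any(conjunction <= features for conjunction in disjuncts)


def prefilter(collection, disjuncts):
    """Select the response set: ids of images that pass the prefilter."""
    return [
        image_id
        for image_id, features in collection.items()
        if passes_prefilter(features, disjuncts)
    ]


# Toy usage: the filter is (f1 AND f2) OR (f4).
collection = {"a": {1, 2, 3}, "b": {2}, "c": {4, 5}}
response_set = prefilter(collection, [{1, 2}, {4}])
```

A more accurate (and slower) ranker would then be applied only to the images in `response_set`, which is what keeps the overall search cheap on very large collections.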
In this paper, we construct a neural-inspired computational model based on the representational capabilities of receptive fields. The proposed model, known as shape encoding receptive fields (SERF), is able to perform fast and accurate data classification and regression of multi-dimensional data. A SERF is a histogram structure that encodes the shape of multi-dimensional data relative to its center,...
The aim of this paper is to address recognition of natural human actions in diverse and realistic video settings. This challenging but important subject has mostly been ignored in the past due to several problems, one of which is the lack of realistic and annotated video datasets. Our first contribution is to address this limitation and to investigate the use of movie scripts for automatic annotation...