The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
We study the problem of cross-media retrieval, where the query and the returned results are of different modalities. A novel method is proposed to measure the similarity between heterogeneous media objects for cross-media retrieval. While existing methods only focus on the original low level feature spaces or the third common space, our proposed tri-space explores both of the two kinds of spaces....
Unsupervised image segmentation is an important and difficult technique in pattern recognition. In this paper, we propose an interesting region merging algorithm for segmentation of natural images. It consists of two steps: first forming initial over-segmentation by the Connected Coherence Tree Algorithm (CCTA), and then merging the primitive regions in terms of their similarity and feature in the...
This paper proposes a new manifold entropy function based on local tangent space (LTS). With this entropy function, we further propose a framework for image retrieval. The retrieval is treated as searching for ordered cycles by categories in image datasets. The optimal cycles can be found by minimizing our manifold entropy of images.
In text categorization, the dimensionality reduction methods, such as latent semantic indexing and nonnegative matrix factorization, commonly yield the dense representation that is not consistent with our common knowledge. On the other hand, the popular sparse coding methods are time-consuming and their dictionaries might contain negative entries, which is difficulty to interpret the semantic meaning...
Reranking is one of the commonly used methods to improve the initial ranking performance for content based object retrieval. In this paper, we propose a spatial consistency based selective reranking method to boost the performance of traditional reranking. After deriving the query's top results, we measure the spatial consistency degree of each query-result image pair via visual words spatial verification...
In this paper we present a multipage administrative document image retrieval system based on textual and visual representations of document pages. Individual pages are represented by textual or visual information using a bag-of-words framework. Different fusion strategies are evaluated which allow the system to perform multipage document retrieval on the basis of a single page retrieval system. Results...
We propose an approach to recognize group activities which involve several persons based on modeling the interactions between human bodies. Benefitted from the recent progress in pose estimation [1], we model the activities as the interactions between the parts belong to the same person (intra-person) and those between the parts of different persons (inter-person). Then a unified, discriminative model...
Semantic image segmentation assigns a predefined class label to each pixel. This paper proposes a unified framework by using region bank to solve this task. Images are hierarchically segmented leading to region banks. Local features and high-level descriptors are extracted on each region of the bank. Discriminative classifiers are learned based on the histograms of feature descriptors computed from...
With the explosive growth of Internet image data, labeling image data for image retrieval has become an increasingly onerous task. To that end, we proposed a novel multi-view learning with batch mode active learning framework, MV-BMAL, for improving the performance of image retrieval. Specifically, color, texture and shape features are extracted and considered as un-correlated and sufficient views...
Facial feature tracking and facial actions recognition from image sequence attracted great attention in computer vision field. Most current methods treat them as independent problems, hence ignore the interactions between facial feature points and facial actions. In this paper, we introduce a probabilistic framework based on the Dynamic Bayesian network (DBN) to simultaneously and coherently represent...
Holistic scene understanding is a major goal in recent research of computer vision. To deal with this task, reasoning the 3D relationship of components in a scene is identified as one of the key problems. We study this problem in terms of structural reconstruction of 3D scene from single view image. Our first step concentrates on geometrical layout analysis of scene using low-level features. We allocate...
Not only facial expressions but also body gestures and postures play an important role in non-verbal communication. Facial expressions are based on two factors: arousal and pleasant emotions, while it is not clear that body gestures and postures have the same structure as the facial expressions have. We indicate that (1) the sitting postures have the same emotion structure as facial expressions and...
The primary information units in a newspaper are the articles. Article reconstruction from newspapers including article aggregation and reading order recovery is known to be a quite challenging task due to the complexity of the multi-article page layout. In this paper, we propose a novel approach for article reconstruction using a bipartite graph framework, which models the complex relationships between...
In this work we present a novel approach which combines semantic information with low level features extracted from a complex video scene. The proposed method for video scene understanding relies on a bag-of-words approach, in which, typically, visual words contain information of local motion, but information regarding what generated such motion is discarded. Instead, in our framework, the semantic...
This paper focuses on tracking multiple vehicles in real-world traffic videos which is very challenging due to frequent interactions and occlusions between different vehicles. To address these problems, we fall back on superpixel which recently has received great attention in a wide range of vision problems, e.g. object segmentation, tracking and recognition, for its ability of capturing local appearance...
Non-negative matrix factorization [5](NMF) is a well known tool for unsupervised machine learning. It can be viewed as a generalization of the K-means clustering, Expectation Maximization based clustering and aspect modeling by Probabilistic Latent Semantic Analysis (PLSA). Specifically PLSA is related to NMF with KL-divergence objective function. Further it is shown that K-means clustering is a special...
There are many popular models available for classification of documents like Naïve Bayes Classifier, k-Nearest Neighbors and Support Vector Machine. In all these cases, the representation is based on the “Bag of words” model. This model doesn't capture the actual semantic meaning of a word in a particular document. Semantics are better captured by proximity of words and their occurrence in the document...
High-level visual recognition such as scene classification is a challenging task in computer vision. In this paper, we propose an image descriptor based on semantic cliques obtained by high-order pure dependence, and the image is represented by a vector whose element denotes the probability of containing each object cliques. Compared with using single objects as attributes, such representation carries...
Processing short texts is becoming a trend in information retrieval. Since the text has rarely external information, it is more challenging than document. In this paper, keyword clustering is studied for automatic categorization. To obtain semantic similarity of the keywords, a broad-coverage lexical resource WordNet is employed. We introduce a semantic hierarchical clustering. For automatic keyword...
Semantic concept detection is an important open problem in concept-based image understanding. In this paper, we develop a method inspired by social network analysis to solve the semantic concept detection problem. The novel idea proposed is the detection and utilization of concept co-occurrence patterns as contextual clues for improving individual concept detection. We detect the patterns as hierarchical...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.