This paper develops a new algorithm based on Bag-of-Words that reflects the spatial relationships of objects for visual object categorization. Beyond the existing spatial pyramid for image representation, our contributions are the following: 1) we propose a combined detector based on the Maximally Stable Extremal Regions detector and the Hessian-Laplacian detector to extract more discriminative features; 2) for object...
This paper proposes a novel image feature representation method, called the multi-BOF histogram, for ear recognition. Given an ear image, we first convolve it with J Gabor filters that share the same parameters except for orientation. The responses obtained at each pixel, at each scale and orientation, yield J features. Each pixel can then be assigned a unique feature vector, namely...
When imaging the heart with a 2D ultrasound probe, different views can appear depending on the location and angulation of the probe. Some of these views have been labeled as standard views because they present key cardiac structures clearly and make them easy to assess. We present an approach for automatic recognition and classification of these standard views, as a potential enabler for automated...
This paper presents a fast vehicle recognition and retrieval system based on “bag of words”. The input to the system is an image of a vehicle; the system identifies the vehicle automatically and can also retrieve images similar to the input image. A total of 3742 vehicle images covering 28 types of vehicles were collected as the image database. Features of these images are extracted and...
Mobile visual search systems compare images against a database for object recognition. If query data is transmitted over a slow network or processed on a congested server, the latency increases substantially. This article shows how on-device database matching guarantees fast recognition regardless of external conditions. The database signatures must be compact because of limited memory, capable of...
We propose an autocorrelation Cox process that extends the traditional bag-of-words representation to model the spatio-temporal context within a video sequence. Bag-of-words models are effective tools for representing a video by a histogram of visual words that describe local appearance and motion. A major limitation of this model is its inability to encode the spatio-temporal structure of visual...
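The plain bag-of-words histogram that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration, not the autocorrelation Cox process itself: it assumes descriptors are small numeric tuples and that a vocabulary of centroids is already available (both are hypothetical inputs, as are the function names).

```python
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary centroid closest to the descriptor."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(vocabulary)),
               key=lambda i: sq_dist(descriptor, vocabulary[i]))


def bow_histogram(descriptors, vocabulary):
    """Quantize each descriptor to a visual word and return normalized counts.

    The result discards where (and when) each word occurred, which is the
    limitation the abstract points out.
    """
    total = len(descriptors)
    if total == 0:
        return [0.0] * len(vocabulary)
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    return [counts[i] / total for i in range(len(vocabulary))]
```

For example, with the two-word vocabulary `[(0, 0), (10, 10)]`, three descriptors near the origin and one near `(10, 10)` give the histogram `[0.75, 0.25]`, whatever order or position they appeared in.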
This paper quantifies existing techniques for feature detection in human action recognition. Four different feature detection approaches are investigated using the Motion SIFT descriptor and a standard bag-of-features SVM classifier with a χ² kernel. Specifically, we used two popular feature detectors, Motion SIFT (MOSIFT) and Motion FAST (MOFAST), with and without static interest points. The system was tested...
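The kernel named in this abstract compares bag-of-features histograms. The sketch below uses one common exponential form, k(x, y) = exp(-gamma * sum((x_i - y_i)^2 / (x_i + y_i))). The abstract does not state which variant or which gamma the authors used, so both are assumptions here, as are the function names.

```python
import math


def chi2_distance(h1, h2):
    """Chi-squared distance between two non-negative histograms.

    Bins where both histograms are zero contribute nothing.
    """
    total = 0.0
    for a, b in zip(h1, h2):
        s = a + b
        if s > 0:
            total += (a - b) ** 2 / s
    return total


def chi2_kernel(h1, h2, gamma=1.0):
    """Exponential chi-squared kernel; gamma is an assumed free parameter."""
    return math.exp(-gamma * chi2_distance(h1, h2))
```

Identical histograms give a kernel value of 1.0, and the value decreases toward 0 as the histograms diverge.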
With the development of content-based image retrieval (CBIR), mobile visual search (MVS) has become a promising application. In typical MVS, given a query image taken by a mobile device, similar images are retrieved from a database maintained by the server. Unlike general CBIR, MVS must take transmission latency into account. In existing work, progressive transmission is proposed...
The bag of words (BoW) model, which was originally used in document processing, has recently been introduced to computer vision and used successfully in object recognition. However, in face recognition, the orderless collection of local patches in the BoW model cannot provide strongly distinctive information, since the objects (face images) all belong to the same category. A new framework for extracting...
Human motion analysis is an attractive topic in biometric research. Common biometrics are usually time-consuming, limited, and collaborative. These drawbacks pose major challenges to the recognition process. Recent research indicates that people have considerable ability to recognize others by their natural walking. Therefore, gait recognition has attracted great interest in biometric systems. Gait analysis...
Visual Sonification is the process of converting visual properties of objects into sound signals. This paper describes the Michigan Visual Sonification System (MVSS) that utilizes this process to assist the visually impaired in distinguishing different objects in their surroundings. MVSS uses depth information to first segment and localize salient objects and then represents an object's appearance...
This paper presents a new method for object tracking based on global spatial correspondence with the geometric distribution of visual words. A “Spatial Pyramid Histogram” (SPH) is produced by partitioning the image into increasingly fine sub-blocks and computing histograms of the features found inside each sub-block. SIFT descriptors are extracted to represent the object and to construct a visual dictionary. A classifier...
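The partition-and-count step described in this abstract can be sketched as below. This is an illustrative reading, not the authors' implementation: it assumes each feature is already quantized to a visual-word index, that coordinates are normalized to [0, 1), and that level l uses a 2^l by 2^l grid. The function name, the `levels` parameter, and the absence of per-level weighting are all assumptions.

```python
def spatial_pyramid_histogram(features, vocab_size, levels=2):
    """Concatenate per-cell visual-word counts over a pyramid of grids.

    features: iterable of (x, y, word), with x and y normalized to [0, 1)
              and word an integer in range(vocab_size).
    Level 0 is the whole image; level l splits it into 2**l by 2**l cells.
    """
    features = list(features)
    pyramid = []
    for level in range(levels):
        cells = 2 ** level
        hist = [0] * (cells * cells * vocab_size)
        for x, y, word in features:
            # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hist[(row * cells + col) * vocab_size + word] += 1
        pyramid.extend(hist)
    return pyramid
```

With a vocabulary of 2 words and 2 levels, the result has 2 + 4 * 2 = 10 entries: the first 2 are the global counts and the remaining 8 are the counts for the four quadrants in row-major order.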
Current state-of-the-art image retrieval methods represent images as an unordered collection of local patches, each of which is classified as a "visual word" from a fixed vocabulary. This paper presents a simple but innovative way to uncover the spatial relationship between visual words so that we can discover words that represent the same latent topic and thereby improve the retrieval results...
How to fuse static and dynamic information is a key issue in event analysis. In this paper, a top-down, motion-guided fusion method is proposed for recognizing events in unconstrained news video. In this method, the static information is represented as a Bag-of-SIFT-features, and motion information is employed to generate an event-specific attention map that directs the sampling of the interest points....
We propose a scene classification method that combines two popular methods from the literature: Spatial Pyramid Matching (SPM) and probabilistic Latent Semantic Analysis (pLSA) modeling. The proposed scheme, called Cascaded pLSA, performs pLSA hierarchically after the soft-weighted BoW representation based on dense local features is extracted. We associate spatial layout information by dividing...
An efficient min-Hash-based algorithm for discovering dependencies in sparse high-dimensional data is presented. The dependencies are represented by sets of features that co-occur with high probability and are called co-ocsets. Sparse high-dimensional descriptors, such as bag of words, have proven very effective in the domain of image retrieval. To maintain high efficiency even for very large...
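For readers unfamiliar with min-Hash, the sketch below shows the basic building block only: estimating the overlap (Jaccard similarity) of two sets of visual-word identifiers from short signatures. It is not the co-ocset discovery algorithm of the abstract. The hash family (a * w + b) mod p, the prime, the seed, and all function names are assumptions chosen for illustration.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime 2**31 - 1; word ids are assumed smaller


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(w) = (a * w + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(word_ids, params):
    """One minimum per hash function; word_ids must be a non-empty set."""
    return [min((a * w + b) % PRIME for w in word_ids) for a, b in params]


def estimate_jaccard(sig1, sig2):
    """Fraction of signature positions that agree."""
    matches = sum(1 for s, t in zip(sig1, sig2) if s == t)
    return matches / len(sig1)
```

Two images with the same set of visual words always get identical signatures, so the estimate is 1.0. More hash functions give a less noisy estimate at the cost of longer signatures.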
The bag of visual words (BOW) representation is now widely applied in object categorization. However, the BOW representation ignores the dependency relationships among visual words, which could provide informative knowledge for understanding an image. In this paper, we first design a simple method to discover this dependency by computing the spatial correlation between visual words...
We describe a method for filtering an object category from a large number of noisy images. This problem is particularly difficult because of the large variation within object categories and the lack of labeled object images. Our method addresses it by combining the co-training algorithm CoBoost with two kinds of features, 1st- and 2nd-order features, which define the bag-of-words representation and spatial relationship...