The CNN-RNN design pattern is increasingly applied in a variety of image annotation tasks, including multi-label classification and captioning. Existing models use the weakly semantic CNN hidden layer or its transform as the image embedding that provides the interface between the CNN and RNN. This leaves the RNN overstretched with two jobs: predicting the visual concepts and modelling their...
Given unreliable visual patterns and insufficient query information, content-based image retrieval is often suboptimal and requires image re-ranking using auxiliary information. In this paper, we propose a discriminative multi-view interactive image re-ranking (DMINTIR) method, which integrates user relevance feedback capturing users’ intentions and multiple features that sufficiently describe the images...
The location information of interest points is an important cue for action recognition. In order to model the spatio-temporal distribution, we propose a novel position feature which is constructed from normalized pairwise relative positions of points. Promising performance has been achieved by the Vector of Locally Aggregated Descriptors (VLAD), which gathers the differences between descriptors and visual...
For the past few years, the performance of object recognition and retrieval has been substantially boosted, which is largely attributed to the advent of many effective image descriptors. The most representative examples are the Fisher Vector (FV) and the Vector of Locally Aggregated Descriptors (VLAD). In this paper we focus on the latter. The original VLAD descriptor directly accumulates the sums...
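For readers unfamiliar with VLAD, the following is a minimal sketch of the aggregation step as it is commonly described: each local descriptor is assigned to its nearest visual word, and the residuals (descriptor minus centroid) are summed per word. It is not taken from the paper above, whose abstract is truncated. The function name `vlad`, the plain-list data layout, and the final global L2 normalization are assumptions made for illustration.

```python
import math


def vlad(descriptors, centroids):
    """Aggregate local descriptors into one VLAD vector (illustrative sketch).

    descriptors: list of d-dimensional lists (local features of one image)
    centroids:   list of k d-dimensional lists (visual words, e.g. from k-means)
    Returns a flat list of length k * d.
    """
    k, d = len(centroids), len(centroids[0])
    residual_sums = [[0.0] * d for _ in range(k)]

    for x in descriptors:
        # Assign the descriptor to its nearest visual word
        # (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centroids[i])),
        )
        # Accumulate the residual x - c for that visual word.
        for j in range(d):
            residual_sums[nearest][j] += x[j] - centroids[nearest][j]

    # Concatenate the per-word residual sums into a single vector.
    flat = [v for row in residual_sums for v in row]

    # Assumption: global L2 normalization; other schemes exist.
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat
```

With two visual words at `[0, 0]` and `[10, 10]`, the descriptors `[1, 0]` and `[0, 1]` both fall to the first word and `[11, 10]` to the second, so the unnormalized residual sums are `[1, 1]` and `[1, 0]` before concatenation and normalization.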
This paper presents an improved bag-of-words (BoW) framework for detecting near-duplicates of images on the Web and makes three main contributions. Firstly, based on the SIFT feature descriptors, Locality-constrained Linear Coding (LLC) with the spatial pyramid is introduced to encode features. Secondly, a weighted Chi-square distance metric is proposed to compare two histograms, with an inverted...