The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
We propose an attentive local feature descriptor suitable for large-scale image retrieval, referred to as DELE (DEep Local Feature). The new feature is based on convolutional neural networks, which are trained only with image-level annotations on a landmark image dataset. To identify semantically useful local features for image retrieval, we also propose an attention mechanism for key point selection,...
We propose Quadruplet Convolutional Neural Networks (Quad-CNN) for multi-object tracking, which learn to associate object detections across frames using quadruplet losses. The proposed networks consider target appearances together with their temporal adjacencies for data association. Unlike conventional ranking losses, the quadruplet loss enforces an additional constraint that makes temporally adjacent...
We propose a novel algorithm to cluster and annotate a set of input images jointly, where the images are clustered into several discriminative groups and each group is identified with representative labels automatically. For these purposes, each input image is first represented by a distribution of candidate labels based on its similarity to images in a labeled reference image database. A set of these...
A macrofeature layout selection is proposed for object detection. Macrofeatures [2] are mid-level features that jointly encode a set of low-level features in a neighborhood. Our method employs line, triangle, and pyramid layouts, which are composed of several local blocks in a multi-scale feature pyramid. The method is integrated into boosting for detection, where the best layout is selected for a...
We present an automatic and efficient method to extract spatio-temporal human volumes from video, which combines top-down model-based and bottom-up appearance-based approaches. From the top-down perspective, our algorithm applies shape priors probabilistically to candidate image regions obtained by pedestrian detection, and provides accurate estimates of the human body areas which serve as important...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.