The intensive annotation cost and the rich but unlabeled data contained in videos motivate us to propose an unsupervised video-based person re-identification (re-ID) method. We start from two assumptions: 1) different video tracklets typically contain different persons, given that the tracklets are taken at distinct places or with long intervals; 2) within each tracklet, the frames are mostly of the...
Automatic image aesthetics rating has received a growing interest with the recent breakthrough in deep learning. Although many studies exist for learning a generic or universal aesthetics model, investigation of aesthetics models incorporating individual user’s preference is quite limited. We address this personalized aesthetics problem by showing that individual’s aesthetic preferences exhibit strong...
While strong progress has been made in image captioning recently, machine and human captions are still quite distinct. This is primarily due to the deficiencies in the generated word distribution, vocabulary size, and strong bias in the generators towards frequent captions. Furthermore, humans – rightfully so – generate multiple, diverse captions, due to the inherent ambiguity in the captioning task...
Despite the substantial progress in recent years, the image captioning techniques are still far from being perfect. Sentences produced by existing methods, e.g. those based on RNNs, are often overly rigid and lacking in variability. This issue is related to a learning principle widely used in practice, that is, to maximize the likelihood of training samples. This principle encourages high resemblance...
Many existing person re-identification (PRID) methods typically attempt to train a faithful global metric offline to cover the enormous visual appearance variations, so as to directly use it online on various probes for identity matching. However, their need for a huge set of positive training pairs is very demanding in practice. In contrast to these methods, this paper advocates a different paradigm:...
Image captioning is a challenging problem owing to the complexity in understanding the image content and diverse ways of describing it in natural language. Recent advances in deep neural networks have substantially improved the performance of this task. Most state-of-the-art approaches follow an encoder-decoder framework, which generates captions using a sequential recurrent prediction model. However,...
A novel dataset for benchmarking image-based localization is presented. With increasing research interests in visual place recognition and localization, several datasets have been published in the past few years. One of the evident limitations of existing datasets is that precise ground truth camera poses of query images are not available in a meaningful 3D metric system. This is in part due to the...
Person re-identification is an important technique for automatically searching for a person's presence in a surveillance video. Two fundamental problems are critical for person re-identification: feature representation and metric learning. At present, there are many methods in the study of person re-identification, which have achieved remarkable results. Due to the difference of the data distribution in...
Fine-grained visual recognition aims to capture discriminative characteristics amongst visually similar categories. The state-of-the-art research work has significantly improved the fine-grained recognition performance by deep metric learning using triplet network. However, the impact of intra-category variance on the performance of recognition and robust feature representation has not been well studied...
Horizon or skyline detection plays a vital role in mountainous visual geo-localization; however, most of the recently proposed visual geo-localization approaches rely on user-in-the-loop skyline detection methods. Detecting such a segmenting boundary fully autonomously would definitely be a step forward for these localization approaches. This paper provides a quantitative comparison of four such...
In this paper we use a Deep Neural Network (DNN) trained on data collected from the visual media-sharing social platform Instagram account of a popular Indian lifestyle magazine to predict the popularity of future posts. This predicted popularity of the post can be used to decide advertising rates and measure performance metrics important for publishing strategy decisions. The DNN primarily uses growth...
Pulmonary emphysema overlaps considerably with chronic obstructive pulmonary disease (COPD), and is traditionally subcategorized into three subtypes: centrilobular emphysema (CLE), panlobular emphysema (PLE) and paraseptal emphysema (PSE). Automated classification methods based on supervised learning are generally based upon the current definition of emphysema subtypes, while unsupervised learning...
In this paper, we explore the redundancy in convolutional neural networks, which scales with the complexity of vision tasks. Considering that many front-end visual systems are interested in only a limited range of visual targets, removing task-specified network redundancy can promote a wide range of potential applications. We propose a task-specified knowledge distillation algorithm to derive...
H-KWS 2016, organized in the context of the ICFHR 2016 conference, aims at setting up an evaluation framework for benchmarking handwritten keyword spotting (KWS), examining both the Query by Example (QbE) and the Query by String (QbS) approaches. Both KWS approaches were hosted in two different tracks, which in turn were split into two distinct challenges, namely, a segmentation-based and a segmentation-free...
This paper presents a new representation for handwritten math formulae: a Line-of-Sight (LOS) graph over handwritten strokes, computed using stroke convex hulls. Experimental results using the CROHME 2012 and 2014 datasets show that LOS graphs capture the visual structure of handwritten formulae better than commonly used graphs such as Time-series, Minimum Spanning Trees, and k-Nearest Neighbor graphs...
Anthropology studies show that genetic features are inherited by children from their parents resulting in visual resemblance between them. This paper presents a novel SIFT flow based genetic Fisher vector feature (SF-GFVF) which enhances the facial genetic features for kinship verification. The proposed SF-GFVF feature is derived by applying a novel similarity enhancement method based on SIFT flow...
Perceptual learning sculpts ongoing brain activity [1]. This finding has been observed by statistically comparing the functional connectivity (FC) patterns computed from resting-state functional MRI (rs-fMRI) data recorded before and after intensive training to a visual attention task. Hence, functional connectivity serves a dynamic role in brain function, supporting the consolidation of previous...
This paper summarizes the MSR Image Recognition Challenge (IRC) running with ICME 2016 Grand Challenges. Since 2013, Microsoft Research has hosted a series of IRCs to motivate the academic and industrial community to solve real-world large-scale image retrieval and recognition problems. This IRC in ICME 2016 continually leveraged the Clickture dataset [1], a large-scale real-world image click data...
What makes a person pick certain tags over others when tagging an image? Does the order that a person presents tags for a given image follow an implicit bias that is personal? Can these biases be used to improve existing automated image tagging systems? We show that tag ordering, which has been largely overlooked by the image tagging community, is an important cue in understanding user tagging behavior...
Diversity is a key characteristic of a classifier ensemble. A classifier ensemble must be composed of base classifiers with different performance in different areas of the problem space. Several works studied different diversity measures by performing extensive numerical experiments. However, to our knowledge, no method has been proposed to visualize the diversity of a classifier ensemble. In this...