Dense captioning is a newly emerging computer vision topic for understanding images with dense language descriptions. The goal is to densely detect visual concepts (e.g., objects, object parts, and interactions between them) from images, labeling each with a short descriptive phrase. We identify two key challenges of dense captioning that need to be properly addressed when tackling the problem. First,...
Given a convolutional neural network (CNN) that is pre-trained for object classification, this paper proposes to use active question-answering to semanticize neural patterns in conv-layers of the CNN and mine part concepts. For each part concept, we mine neural patterns in the pre-trained CNN, which are related to the target part, and use these patterns to construct an And-Or graph (AOG) to represent...
Current uses of tagged images typically exploit only the most explicit information: the link between the nouns named and the objects present somewhere in the image. We propose to leverage “unspoken” cues that rest within an ordered list of image tags so as to improve object localization. We define three novel implicit features from an image's tags—the relative prominence of each object as signified...
In this paper, we present a direct application of Support Vector Machine with Augmented Features (AFSVM) for video concept detection. For each visual concept, we learn an adapted classifier by leveraging the pre-learnt SVM classifiers of other concepts. AFSVM re-trains the SVM classifier using an augmented feature, which concatenates the original feature vector with the decision...
This work examines the possibility of exploiting semantic information from the analysis of the visual modality to segment video into scenes. This information, in contrast to the low-level visual features typically used in previous approaches, is obtained by application of trained visual concept detectors such as those developed and evaluated as part of the TRECVID High-Level...
Recently, in the fields of internet and social networking, the classification and filtering of naked images has been receiving a significant amount of attention. In this paper, we propose a novel naked image classification method that makes effective use of the semantic features of a naked image. In addition, a novel measurement, termed accumulated distance ratio (ADR), is proposed in order to systematically...
Automatic TV commercial detection has become an indispensable part of content-based video analysis techniques due to the explosive growth in TV commercial volume. In this paper, a multi-modal (i.e. visual, audio and textual modalities) commercial digesting scheme is proposed to alleviate two challenges in commercial detection, which are the generation of mid-level semantic descriptor and the application...