While recent advances in deep learning have pushed the state of the art in object detection and semantic segmentation, they often come at the cost of a considerable annotation effort. Weakly supervised learning has therefore attracted increasing interest. This paper presents a novel approach to the challenging task of weakly supervised segmentation and object localization. The problem is tackled from...
Image smoothing is a fundamental technology which aims to preserve image structure and remove insignificant texture. Balancing the trade-off between preserving structure and suppressing texture, however, is not a trivial task. This is because existing methods rely on only one guidance to infer structure or texture and assume the other is dependent. However, in many cases, textures are composed of...
One of the promising new directions for Content Based Video Retrieval is object-based retrieval, which allows the user to manipulate video objects as part of searching and browsing. The major obstacle to the use of objects in video retrieval is the appropriate representation of objects in a video database. The purpose of this work is to present an object-based framework consisting of entire processing...
Semantic parsing of large-scale 3D point clouds is an important research topic in the computer vision and remote sensing fields. Most existing approaches utilize hand-crafted features for each modality independently and combine them in a heuristic manner. They often fail to adequately consider the consistency and complementary information among features, which makes it difficult for them to capture high-level...
Images are often taken to express emotions or purposes, such as love or celebrating Christmas. Combining an image with a relevant song can amplify the expression, an approach that has recently drawn much attention on social networks. Hence, automatic song selection is desirable. In this paper, we propose to retrieve semantically relevant songs just by...
Person re-identification is an important task in video surveillance systems. It can be formally defined as establishing the correspondence between images of a person taken from different cameras at different times. In this paper, we present a two-stream convolutional neural network in which each stream is a Siamese network. This architecture can learn spatial and temporal information separately. We also...
Blind image quality assessment (BIQA) methods aim to estimate the quality of a given test image without referring to the corresponding reference (original) image. Most BIQA methods use visual sensitivity models, which take into consideration intrinsic image characteristics (e.g. contrast, luminance, and texture) to identify degradations and estimate quality. For example, texture-based BIQA methods...
We report on the results of the first visual search and rating study (N = 60) evaluating human gaze when assessing the realism of image composites. The effects of object identity knowledge and mismatched feature type on observers' gaze and subjective realism scores are studied. Gaze metrics used include: fixation count, fixation duration, time and duration of first fixation on target object, as well...
The interactive image segmentation model allows users to iteratively add new inputs for refinement until a satisfactory result is finally obtained. Therefore, an ideal interactive segmentation model should learn to capture the user's intention with minimal interaction. However, existing models fail to fully utilize the valuable user input information in the segmentation refinement process and thus...
While natural beauty is often considered a subjective property of images, in this paper, we take an objective approach and provide methods for quantifying and predicting the scenicness of an image. Using a dataset containing hundreds of thousands of outdoor images captured throughout Great Britain with crowdsourced ratings of natural beauty, we propose an approach to predict scenicness which explicitly...
We investigate methods for combining multiple self-supervised tasks (i.e., supervised tasks where data can be collected without manual labeling) in order to train a single visual representation. First, we provide an apples-to-apples comparison of four different self-supervised tasks using the very deep ResNet-101 architecture. We then combine tasks to jointly train a network. We also explore lasso regularization...
We present a framework for learning to describe fine-grained visual differences between instances using attribute phrases. Attribute phrases capture distinguishing aspects of an object (e.g., “propeller on the nose” or “door near the wing” for airplanes) in a compositional manner. Instances within a category can be described by a set of these phrases and collectively they span the space of semantic...
This paper proposes an automatic spatially-aware concept discovery approach using weakly labeled image-text data from shopping websites. We first fine-tune GoogleNet by jointly modeling clothing images and their corresponding descriptions in a visual-semantic embedding space. Then, for each attribute (word), we generate its spatially-aware representation by combining its semantic word vector representation...
Image retrieval from multimedia databases is a very challenging problem. It requires not only a proper query form but also efficient methods of data storage. The problem is important because many different systems now need image retrieval. Web search engines are one example: they have to store a huge number of images and need fast image retrieval...
This research proposes a multispectral image retrieval method using spectral features and semantic computing, a topic on which few studies have focused. The main contributions are to enhance the effectiveness and advantages of a global environmental analysis system and to realize semantic associative search and analysis. In this work, we study multispectral image retrieval using spectral features computed...
Much research has been conducted on video abstraction for quick viewing of video archives; however, approaches that treat abstraction as a pre-processing stage in video analysis are lacking. This paper investigates the efficiency of integrating video abstraction into a surveillance video indexing and retrieval framework. The basic idea is to reduce the computational complexity and cost...
Easy access to high-speed communication networks and the sharing of information via social media have made a large amount of multimedia content seamlessly available to end users. Searching for similar images or video clips within a collection of videos is a common activity. This paper proposes an approach for retrieving similar video clips based on a dense descriptor called the serial walk local descriptor. The...
Multimedia semantic concept detection has been one of the major research topics in multimedia data analysis in recent years. Disaster information management needs the assistance of multimedia data analysis to better utilize the disaster-related information that people have widely shared through the Internet. In this paper, a Feature Affinity based Multiple Correspondence Analysis and Decision...
Compositing is one of the most common operations in photo editing. To generate realistic composites, the appearances of foreground and background need to be adjusted to make them compatible. Previous approaches to harmonize composites have focused on learning statistical relationships between hand-crafted appearance features of the foreground and background, which is unreliable especially when the...
Improvements in color constancy have arisen from the use of convolutional neural networks (CNNs). However, the patch-based CNNs that exist for this problem are faced with the issue of estimation ambiguity, where a patch may contain insufficient information to establish a unique or even a limited possible range of illumination colors. Image patches with estimation ambiguity not only appear with great...