Venue photos, as a new type of multimedia content, are exploding on the Internet because users like to take photos and share with their friends which venues they visited and what impressed them there. Discovering a venue from a social photo is very useful for supplementing venue retrieval and recommendation. However, little research has focused on fine-grained venue discovery by leveraging multimodal...
Convolutional neural networks (CNNs) have drawn increasing interest in visual tracking, among which the fully-convolutional Siamese network based method (SiamFC) is quite popular due to its competitive performance in both precision and efficiency. Generally, SiamFC captures robust semantics from high-level features in the last layer but ignores detailed spatial features in earlier layers, thus tending to...
The recent rise in the use of social networks has resulted in an abundance of online information on different aspects of everyday social activities. When analysing information originating from social networks, especially Twitter, an important aspect is the geographic coordinates, i.e., the geolocalisation, of the relevant information. Geolocalized...
Visual tracking is a very challenging problem in computer vision as the performance of a tracking algorithm may be degraded due to many challenging issues in the scenes, such as illumination change, deformation, and background clutter. So far no algorithms can handle all these challenging issues. Recently, it has been shown that correlation filters can be implemented efficiently and, with suitable...
The widely used subjective estimator, the mean opinion score (MOS), is often biased by the testing environment, viewers' mode, domain expertise, and many other factors that may actively influence the actual assessment. We therefore devise a no-reference subjective quality assessment metric by exploiting the nature of human eye browsing on videos. The participants' eye-tracker recorded gaze-data indicate...
Dexterous object manipulation requires suitable control of grip force, load force and digit positions. To keep an object stable in the air, force magnitude, force direction and digit positions should be coordinated, producing compensatory torque that could balance the external torque and maintain a minimal object roll. Six males and six females enrolled in the study. In the experiment, subjects were...
Images are usually taken to express emotions or purposes, such as love or celebrating Christmas. Combining an image with a relevant song amplifies the expression, an approach that has recently drawn much attention on social networks. Hence, automatic song selection is desirable. In this paper, we propose to retrieve semantically relevant songs just by...
In this paper, we investigate a weakly-supervised object detection framework. Most existing frameworks focus on using static images to learn object detectors. However, these detectors often fail to generalize to videos because of the existing domain shift. Therefore, we investigate learning these detectors directly from boring videos of daily activities. Instead of using bounding boxes, we explore...
Automatic image aesthetics rating has received a growing interest with the recent breakthrough in deep learning. Although many studies exist for learning a generic or universal aesthetics model, investigation of aesthetics models incorporating individual user’s preference is quite limited. We address this personalized aesthetics problem by showing that individual’s aesthetic preferences exhibit strong...
We propose a novel memory network model named Read-Write Memory Network (RWMN) to perform question answering tasks for large-scale, multimodal movie story understanding. The key focus of our RWMN model is to design the read network and the write network that consist of multiple convolutional layers, which enable memory read and write operations to have high capacity and flexibility. While existing...
Correlation Filters (CFs) have recently demonstrated excellent performance in terms of rapidly tracking objects under challenging photometric and geometric variations. The strength of the approach comes from its ability to efficiently learn - on the fly - how the object is changing over time. A fundamental drawback to CFs, however, is that the background of the target is not modeled over time which...
Discriminative correlation filters (DCFs) have been shown to perform superiorly in visual tracking. They only need a small set of training samples from the initial frame to generate an appearance model. However, existing DCFs learn the filters separately from feature extraction, and update these filters using a moving average operation with an empirical weight. These DCF trackers hardly benefit from...
Visual object tracking is a fundamental and time-critical vision task. Recent years have seen many shallow tracking methods based on real-time pixel-based correlation filters, as well as deep methods that have top performance but need a high-end GPU. In this paper, we learn to improve the speed of deep trackers without losing accuracy. Our fundamental insight is to take an adaptive approach, where...
Programmers of all experience levels attempt to leverage code snippets with varying success, often as reminders or to learn new skills. To date, little work has explored the specific elements within code snippets that are challenging for novices. Comparing how novices and experts recall code snippets may expose what code elements programmers focus on and inform new approaches for improving examples...
Visual tracking is a challenging task due to a number of factors, such as occlusions, deformations, illumination variations and abrupt motion changes present in a video sequence. Generally, trackers are robust to some of these factors, but do not achieve satisfactory results when dealing with multiple factors at the same time. More robust results when multiple factors are present can be obtained by...
Zero-shot learning (ZSL) aims to transfer knowledge from observed classes to the unseen classes, based on the assumption that both the seen and unseen classes share a common semantic space, among which attributes enjoy a great popularity. However, few works study whether the human-designed semantic attributes are discriminative enough to recognize different classes. Moreover, attributes are often...
Given a textual description of an image, phrase grounding localizes objects in the image referred to by query phrases in the description. State-of-the-art methods address the problem by ranking a set of proposals based on the relevance to each query, which are limited by the performance of independent proposal generation systems and ignore useful cues from context in the description. In this paper, we...
Functional connectomes (FCs) are powerful in characterizing brain conditions. Temporal FC metrics can index changes in macroscopic neural activity patterns underlying critical aspects of cognition and behavior. However, time-varying properties of temporal brain networks in general mental disorders have been less investigated. In this paper, FCs derived from resting-state fMRI (R-fMRI) data are temporally...
When looking at an image, humans shift their attention towards interesting regions, making sequences of eye fixations. When describing an image, they also come up with simple sentences that highlight the key elements in the scene. What is the correlation between where people look and what they describe in an image? To investigate this problem, we look into eye fixations and image captions, two types...
Biometric recognition of persons is widely explored nowadays to develop robust and trustworthy security systems. On account of the unique neural signature of each person, the brain activity recorded by electroencephalogram (EEG) has recently been identified as a potential biometric trait. In this paper, we propose an online EEG-based biometric system which utilizes the activations of the brain towards...