Automatically recognising facial emotions has drawn increasing attention in computer vision. Facial-landmark-based methods are among the most widely used approaches to this task. However, these approaches do not provide good performance. Thus, researchers usually tend to combine more information, such as textural and audio information, to increase the recognition rate. In this paper we propose...
Existing distance metric learning methods define an objective function and seek a distance metric (or equivalently a projection) that minimizes it. In this paper, we propose a different approach that illustrates how to formulate distance metric learning as a regression problem. First, the objective function is minimized to learn target representations. Then, a regression method is employed to learn...
This paper is concerned with event clustering for short text streams, which aims to divide constantly arriving short texts into several dynamic event-based clusters. A widely adopted approach is based on Vector Space Models (VSMs) such as bag of words. However, these models have limitations in that not only are the semantic relationships between words largely ignored, but the term weighting may also...
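The bag-of-words limitation mentioned above can be illustrated with a minimal sketch. It is not the paper's method: the whitespace tokenizer, the raw term counts, and the helper names `bow` and `cosine` are all assumptions made for illustration.

```python
from collections import Counter
import math


def bow(text: str) -> Counter:
    """Build a bag-of-words vector (raw term counts) from whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two short texts about the same kind of event share no surface terms,
# so a plain bag-of-words model treats them as completely unrelated.
same_event = cosine(bow("car crash downtown"), bow("automobile accident city"))
identical = cosine(bow("car crash downtown"), bow("car crash downtown"))
```

Because the two texts have no words in common, their similarity is zero even though they are semantically close, which is the kind of gap the abstract points to.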
An unsolved problem in medical image analysis is validation of methods. In this paper we will focus on image registration and in particular on nonlinear image registration, which is one of the hardest analysis problems to validate. The paper covers currently used methods of validation, comparative challenges and public datasets, as well as some of our own work in this area.
A hash algorithm converts data into compact strings. In the multimedia domain, effective hashing is the key to large-scale similarity search in high-dimensional feature space. A limit of existing hashing techniques is that they typically use single features. In order to improve search performance, it is necessary to utilize multiple features. Due to the compactness requirement, concatenation of hash...
This paper presents fine-tuned CNN features for person re-identification. Recently, features extracted from the top layers of a Convolutional Neural Network (CNN) pre-trained on a large annotated dataset, e.g., ImageNet, have been proven to be strong off-the-shelf descriptors for various recognition tasks. However, the large disparity between the pre-trained task, i.e., ImageNet classification, and the target...
Labeling problems are finding increasing application in optimization. They are usually formulated as linear or quadratic optimization problems, which are inefficient to solve for large graphs. In this paper we propose an efficient primal-dual solution, MLPD, for a family of labeling problems. We apply this algorithm to the analysis of immune repertoires, and compare it against our baseline approach...
Video summarization provides a concise representation of the original video; nevertheless, its evaluation is somewhat challenging. This paper proposes a simple and efficient method for precisely evaluating the video summaries produced by existing techniques. The method has two steps. The first step is to establish a set of matched frames between the automatic summary (AT) and the ground...
The closest string problem is a core problem in computational biology with applications in other fields like coding theory. Many algorithms exist to solve this problem, but due to its inherent high computational complexity (typically NP-hard), it can only be solved efficiently by restricting the search space to a specific range of parameters. Often, the run-time of these algorithms is exponential...
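To make the problem statement concrete, here is a minimal brute-force sketch, assuming the usual formulation: given equal-length strings, find a string that minimizes the maximum Hamming distance to all of them. It is not one of the algorithms the abstract refers to, and the function names are hypothetical. The exhaustive search over all candidate strings is exponential in the string length, which is why it is only usable for tiny inputs.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def closest_string_bruteforce(strings, alphabet=None):
    """Try every candidate string over the alphabet; return (string, radius).

    The radius is the largest Hamming distance from the returned string
    to any input string. Runtime is |alphabet| ** length.
    """
    if not strings:
        raise ValueError("need at least one string")
    length = len(strings[0])
    if alphabet is None:
        # Assumption: only symbols that occur in the input are considered.
        alphabet = sorted(set("".join(strings)))
    best, best_radius = None, length + 1
    for candidate in product(alphabet, repeat=length):
        c = "".join(candidate)
        radius = max(hamming(c, s) for s in strings)
        if radius < best_radius:
            best, best_radius = c, radius
    return best, best_radius


center, radius = closest_string_bruteforce(["ACGT", "ACGA", "TCGT"])
```

With four symbols and length four this checks only 256 candidates, but the count grows as the alphabet size raised to the string length.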
Based on the minimum reconstruction error criterion and the intrinsic sparse property of natural data, sparse representation (SR) has shown promising performance on various image recognition tasks. However, in the field of person re-identification (re-id), the state of the art is still dominated by other methods such as metric learning or CNNs. This is because samples in one view may not be representative...
Error Correcting Output Coding (ECOC) is a multi-class classification technique in which multiple binary classifiers are trained according to a preset code matrix such that each one learns a separate dichotomy of the classes. While ECOC is one of the best solutions for multi-class problems, one issue which makes it suboptimal is that the training of the base classifiers is done independently of the...
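The decoding side of ECOC can be sketched as follows. This is a minimal illustration and not the paper's method: the 4-class, 7-bit code matrix, the class names, and the `decode` helper are assumptions, and the binary classifiers are replaced by a hand-written output vector.

```python
# Hypothetical code matrix: one 7-bit codeword per class.
# Each column defines one binary dichotomy of the classes, and any two
# codewords differ in 4 positions, so a single wrong bit can be corrected.
CODE_MATRIX = {
    "class_0": (0, 0, 0, 0, 0, 0, 0),
    "class_1": (0, 0, 0, 1, 1, 1, 1),
    "class_2": (0, 1, 1, 0, 0, 1, 1),
    "class_3": (1, 0, 1, 0, 1, 0, 1),
}


def hamming(a, b) -> int:
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(outputs) -> str:
    """Return the class whose codeword is nearest to the binary outputs."""
    return min(CODE_MATRIX, key=lambda label: hamming(CODE_MATRIX[label], outputs))


# Stand-in for the 7 binary classifier outputs on one sample:
# the codeword of class_2 with its first bit flipped by a classifier error.
noisy_outputs = (1, 1, 1, 0, 0, 1, 1)
predicted = decode(noisy_outputs)
```

In this sketch each column would be learned by its own binary classifier, independently of the others, which is the training setup the abstract describes as a source of suboptimality.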
This paper presents a novel deep architecture for saliency prediction. Current state of the art models for saliency prediction employ Fully Convolutional networks that perform a non-linear combination of features extracted from the last convolutional layer to predict saliency maps. We propose an architecture which, instead, combines features extracted at different levels of a Convolutional Neural...
This paper addresses the problem of determining whether an observed subject has already been seen in a stream of biometric samples. Given a new sample, unlike the common practice of comparing a related match score to a constant threshold, this work introduces a function which takes as input the match score and the position of that sample in the stream, and produces as output a duplicate/non-duplicate...
Realistic scene object recognition in computer vision still faces great challenges due to the large intra-class variation of object images caused by factors like object appearance variation and viewpoint change. To address this challenge, we propose to exploit the semantic relationships embedded in object taxonomy for improved object recognition. Specifically, we exploit the relationships in the object...
Class imbalance is an issue in many real-world applications because classification algorithms tend to misclassify instances from the class of interest when its training samples are outnumbered by those of other classes. Several variations of the AdaBoost ensemble method have been proposed in the literature to learn from imbalanced data based on re-sampling. However, their loss factor is based on standard...
We propose a new superpixel algorithm based on exploiting the boundary information of an image, as objects in images can generally be described by their boundaries. Our proposed approach initially estimates the boundaries and uses them to place superpixel seeds in the areas in which they are more dense. Afterwards, we minimize an energy function in order to expand the seeds into full superpixels....
In surveillance videos, images of the same person often vary significantly, which makes person re-identification difficult. Though the global appearances may differ greatly, some local patches remain highly similar, and human observers can distinguish the identity of each person via these local patches. Inspired by this, patch matching is introduced in person re-identification...
Although the design of low-level local spatiotemporal features has recently led to significant performance improvements in many action recognition applications, much less attention has been given to the equally important problem of how to organize such low-level features extracted from the videos into a higher-level representation suitable to represent and discriminate between many different action...
In this paper, we propose an effective approach for automatic 4D Facial Expression Recognition (FER). The flow of 3D facial scans is first modeled to capture spatial deformations based on a recently developed Riemannian approach, namely Dense Scalar Fields (DSF), in which registration and comparison of neighboring 3D face frames are performed jointly. The deformations are then fed into a temporal filtering...
In this paper, we propose a novel regularized sparse coding approach for template-based unconstrained face verification. Unlike traditional verification tasks, which require evaluation on image-to-image or video-to-video pairs, template-based face verification/recognition methods can exploit training and/or gallery data containing a mixture of both images and videos from the person of interest...