Compression artifacts arise in images whenever a lossy compression algorithm is applied. These artifacts eliminate details present in the original image, or add noise and small structures; because of these effects they make images less pleasant for the human eye, and may also lead to decreased performance of computer vision algorithms such as object detectors. To eliminate such artifacts, when decompressing...
Person re-identification is best known as the problem of associating a single person observed from one or more disjoint cameras. The existing literature has mainly addressed this problem, neglecting the fact that people usually move in groups, as in crowded scenarios. We believe that the additional information carried by neighboring individuals provides a relevant visual context that can...
We present a novel unsupervised method for face identity learning from video sequences. The method exploits the ResNet deep network for face detection and VGGface fc7 face descriptors together with a smart learning mechanism that exploits the temporal coherence of visual data in video streams. We present a novel feature matching solution based on Reverse Nearest Neighbour and a feature forgetting...
In this paper we present an efficient and effective method for hashing visual descriptors, based on hierarchical multiple assignment within a k-means framework. The method has been used to address the problem of approximate nearest neighbor (ANN) retrieval, and it has been tested on local and global visual content descriptors, either engineered or learned. The proposed method has been compared to state-of-the-art...
Deep learning based approaches have proved highly effective in many computer vision applications, including "face recognition in the wild". It has been extensively demonstrated that methods exploiting Deep Convolutional Neural Networks (DCNN) are powerful enough to largely overcome many problems that negatively affected computer vision algorithms based on hand-crafted...
The analysis of human gait is increasingly investigated because of its wide range of potential applications in domains such as rehabilitation, deficiency diagnosis, surveillance and movement optimization. In addition, the release of depth sensors offers new opportunities to perform gait analysis in a non-intrusive context. In this paper, we propose a gait analysis method from depth sequences by...
Understanding where people's attention focuses is a challenging and extremely valuable task that can be solved using computer vision technologies. In this paper we address this problem in surveillance-like scenarios, where head and body imagery is usually low resolution. We propose a method to profile the attention of people moving in a known space. We exploit coarse gaze estimation and a novel model...
In this paper, we propose a new and effective frontalization algorithm for frontal rendering of unconstrained face images, and evaluate it on face recognition. Initially, a 3DMM is fit to the image, and an interpolating function maps each pixel inside the face region of the image to the 3D model. Thus, we can render a frontal view without introducing artifacts in the final image thanks to the...
This paper presents a novel method for efficient image retrieval, based on a simple and effective hashing of CNN features and the use of an indexing structure based on Bloom filters. These filters are used as gatekeepers for the database of image features: a query is skipped when the query features are not stored in the database, which speeds up the query process without affecting...
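The gatekeeper idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's implementation: the `BloomFilter` and `GatedIndex` classes, the sign-threshold `binarize` hashing, the filter size and the number of hash functions are all assumptions chosen for illustration, and the "database" is a plain dictionary.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; sizes are illustrative assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def binarize(feature):
    """Hypothetical hashing step: sign-threshold each value into a code."""
    return bytes(1 if value > 0 else 0 for value in feature)


class GatedIndex:
    """Feature store guarded by a Bloom filter (assumed design)."""

    def __init__(self):
        self.gate = BloomFilter()
        self.store = {}  # stands in for the real feature database

    def insert(self, feature, image_id):
        code = binarize(feature)
        self.gate.add(code)
        self.store.setdefault(code, []).append(image_id)

    def query(self, feature):
        code = binarize(feature)
        if not self.gate.might_contain(code):
            return []  # the gate rules the code out: skip the lookup
        return self.store.get(code, [])
```

A Bloom filter never reports a stored key as absent, so skipping the lookup on a negative answer cannot lose results; a false positive only costs one unnecessary lookup.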
This paper discusses the role of computer vision in bridging the experiential gap between the cultural and emotional experience of visitors in museums or cultural heritage sites. We do not argue against the use of multiple sensors to provide a more complete cultural experience, but claim a primary role for computer vision in such a task. Although many research challenges are still far from being solved...
Ensembles of Exemplar-SVMs have been used for a wide variety of tasks, such as object detection, segmentation, label transfer and mid-level feature learning. To make this technique effective, though, a large collection of classifiers is needed, which often makes the evaluation phase prohibitive. To overcome this issue we exploit the joint distribution of exemplar classifier scores to build...
Human activity recognition is a fundamental problem in computer vision with many applications such as video retrieval, automatic visual surveillance and human-computer interaction. Sports are among the most viewed content on digital TV and the web. Automatically collected statistics of team sports game play represent actionable information for many end users such as coaches and broadcast speakers...
In this paper we present an efficient and accurate method to aggregate a set of Deep Convolutional Neural Network (CNN) responses, extracted from a set of image windows. CNN features are usually computed on the whole frame or with a dense multi-scale approach. There is evidence that using multiple windows yields a better image representation; nonetheless, it is still not clear how windows should be...
In this paper, we propose a new approach for constructing a 3D morphable model (3DMM) and evaluate its application to face recognition. Unlike existing solutions, the proposed 3DMM is constructed from a training set that includes a large spectrum of variability in terms of ethnicity and facial expressions. By exploiting annotated landmarks available in the training data, we are able of...
In this paper we describe a technique for joint estimation of head pose and multiple soft biometrics from faces (Age, Gender and Ethnicity). Our proposed Multi-Objective Random Forests (MORF) framework is a unified model for the joint estimation of multiple characteristics that automatically adapts the measure of information gain used for evaluating the quality of weak learners. Since facial characteristics...
In this paper, we present and evaluate a novel approach for representing the texture of 3D mesh manifolds using local binary patterns (LBP). Using a recently proposed framework [37], we compute LBP directly on the mesh surface, using either geometric or photometric appearance. Compared to its depth-image counterpart, our approach is distinguished by the following features: a) inherits the intrinsic...
In this paper we describe a new dataset, under construction, acquired inside the National Museum of Bargello in Florence. It was recorded with three IP cameras at a resolution of 1280 × 800 pixels and an average framerate of five frames per second. Sequences were recorded following two scenarios. The first scenario consists of visitors watching different artworks (individuals), while the second one...
In this paper we present an efficient method for mobile visual search that exploits compact hash codes and data structures for visual feature retrieval. The method has been tested on a large-scale standard dataset of one million SIFT features, showing a retrieval performance comparable or superior to state-of-the-art methods, and a very high efficiency in terms of memory consumption and computational...
An important task in computer vision is object localization and recognition within images and video. Achieving real-time object localization and recognition on low-power devices is especially relevant in the context of wearable technologies. Indeed, wearable devices have a reduced size and cost and limited computational power leading to a challenging scenario for classical computer vision algorithms...
Recognizing human actions or analyzing human behaviors from 3D videos is an important problem currently investigated in many research domains. The high complexity of human motions and the variability of gesture combinations make this task challenging. Local (over time) analysis of a sequence is often necessary in order to have a more accurate and thorough understanding of what the human is doing....