Facial composite technologies are used to produce visual resemblances of an offender. However, resemblances may be poor, particularly when composites are constructed using traditional ‘feature’ composite systems deployed several days after the crime. In this case a witness may have forgotten important details about an offender's appearance. Engaging in early and repeated retrieval attempts could potentially...
This paper presents the evaluation of visual features for the proposed two-eye detection method applied to thermal images. The two-eye region is used because of its distinctive pattern and to overcome the blurred and noisy characteristics of thermal images. Comparative performance analysis on three different features, which include Haar, Histogram of Oriented Gradients (HoG) and Local Binary...
In this paper, the problem of age estimation is addressed based on two modalities: speech utterances and speakers' face images. The proposed age estimation framework employs the Shifted Covariates REgression Analysis for Multi-way data (SCREAM) model, which combines Parallel Factor Analysis 2 and Principal Covariates Regression. SCREAM is able to extract a few latent variables from multi-way data...
This paper describes the techniques used in the submitted video presenting an interaction scenario, realised using the Neuro-Inspired Companion (NICO) robot. NICO engages the users in a personalised conversation where the robot always tracks the users' face, remembers them and interacts with them using natural language. NICO can also learn to perform tasks such as remembering and recalling objects...
In this paper, we present a novel image scaling method that employs a mesh model that explicitly represents discontinuities in the image. Our method effectively addresses the problem of preserving the sharpness of edges, which has always been a challenge, during image enlargement. We use a constrained Delaunay triangulation to generate the model and an approximating function that is continuous everywhere...
The goal of this work is to recognise phrases and sentences being spoken by a talking face, with or without the audio. Unlike previous works that have focussed on recognising a limited number of words or phrases, we tackle lip reading as an open-world problem – unconstrained natural language sentences, and in-the-wild videos. Our key contributions are: (1) a Watch, Listen, Attend and Spell...
The absence of clear eye visibility not only degrades the aesthetic value of an entire face image but also creates difficulties in many computer vision tasks. Even mild reflections produce undesired superpositions of visual information, whose decomposition into the background and reflection layers using a single image is a highly ill-posed problem. In this work, we enforce the tight constraints...
Social relations are the foundation of human daily life. Developing techniques to analyze such relations from visual data bears great potential to build machines that better understand us and are capable of interacting with us at a social level. Previous investigations have remained partial due to the overwhelming diversity and complexity of the topic and consequently have only focused on a handful...
Discovering the common (joint) and individual subspaces is crucial for the analysis of multiple data sets, including multi-view and multi-modal data. Several statistical machine learning methods have been developed for discovering the common features across multiple data sets. The most well-studied family of methods is that of Canonical Correlation Analysis (CCA) and its variants. Even though the...
Personal photographs shared on social network websites show a wide range of variations in illumination, pose and expression. As a result, conventional face recognition methods perform poorly in such uncontrolled settings. To resolve this problem, we explore incorporating associated social network context into a visual face recognition system. Motivated by the collaborative filtering technique, a new...
The deployment of deep neural network models on mobile or embedded devices is challenging for two main reasons: 1) the large model size for storage, and 2) the large memory bandwidth for inference. To address these issues, this paper develops a deep neural network compression framework to reduce the resource usage for efficient visual inference. By reviewing the trained deep model, we propose...
Given a pre-registered 3D mesh sequence and accompanying phoneme-labeled audio, our system creates an animatable face model and a mapping procedure to produce realistic speech animations for arbitrary speech input. Mapping of speech features to model parameters is done using random forests for regression. We propose a new speech feature based on phonemic labels and acoustic features. The novel feature...
This paper is part of a larger effort to detect manipulations of video by searching for and combining the evidence of multiple types of inconsistencies between the audio and visual channels. Here, we focus on inconsistencies between the type of scenes detected in the audio and visual modalities (e.g., audio indoor, small room versus visual outdoor, urban), and inconsistencies in speaker identity tracking...
This paper addresses the problem of automatically inferring personality traits of people talking to a camera. As in many other computer vision problems, Convolutional Neural Network (CNN) models have shown impressive results. However, despite their success in terms of performance, it is unknown what internal representation emerges in the CNN. This paper presents a deep study on understanding why...
Current state-of-the-art mesh quality measures evaluate closed and complete meshes obtained after mesh postprocessing applications, such as mesh simplification or watermarking, and compare them against the corresponding reference mesh. Emerging 3D immersive VR/AR applications use noisy 3D point clouds, typically from a single RGB-D camera (such as Microsoft's Kinect), to generate standalone (no reference)...
We describe an end-to-end system for explainable automatic job candidate screening from video CVs. In this application, audio, face and scene features are first computed from an input video CV, using rich feature sets. These multiple modalities are fed into modality-specific regressors to predict apparent personality traits and a variable that predicts whether the subject will be invited to the interview...
Thermal imaging has many applications in image processing, such as human detection, face recognition and physiological signal evaluation. The respiratory rate is an important physiological signal, and it is highly related to emotion and some diseases. Therefore, in this paper we propose a non-contact method to estimate the respiratory rate from thermal images. Thermal images can provide the information...
Recently, visual features extracted by convolutional neural networks (CNNs) have been widely used in computer vision. Most state-of-the-art CNNs adopt a convolutional layer to map high-dimensional features to the number of output classes. However, this is not well suited to feature similarity comparison. We therefore propose a new layer, the Euclidean output layer, for extracting discriminative features...
In an Information Society where the value of news content in the network is ephemeral and disposable, this article intends to present the issues of understanding and sustainability of information through the informative agglomerating competences of Online Journalistic Infographics (in particular, the cross-linguistic and multiplex narrative), from depth, laterality and informative timelessness and...
This study seeks to assess the impact of photography and social networks by examining how online photographs are perceived by users of social networks. With the growing use of digital devices with internet access, alongside the democratization of digital photography, there has been an increase in the number of photographs available online through social networks where the common...