In this paper an efficient method for image retargeting is proposed. It relies on a Monte Carlo model that makes use of image saliency. Each random sample is drawn from a suitably defined deformation probability mass function and shrinks or enlarges the image by a fixed size. The shape of the function, which determines which regions of the image are affected by the deformations, depends on the image...
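The sampling idea in this abstract can be sketched as follows. This is a minimal illustration of the shrink case only, and it assumes a specific PMF shape (removal probability proportional to one minus column saliency); the function name, the image representation as lists of rows, and the PMF form are illustrative assumptions, not the paper's actual model:

```python
import random

def retarget_width(image, saliency, n_samples, rng=None):
    """Shrink an image's width by n_samples columns via Monte Carlo sampling.

    image:    list of rows, each a list of pixel values (illustrative format)
    saliency: per-column saliency scores in [0, 1]; low-saliency columns
              are more likely to be removed
    """
    rng = rng or random.Random(0)
    img = [row[:] for row in image]
    sal = saliency[:]
    for _ in range(n_samples):
        # Assumed deformation PMF: probability of removing column j is
        # proportional to (1 - saliency_j), so salient regions are
        # largely preserved (epsilon keeps all weights positive).
        weights = [max(1.0 - s, 1e-6) for s in sal]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Draw one random sample (a column index) from the PMF and
        # apply a fixed-size deformation: remove that single column.
        j = rng.choices(range(len(sal)), weights=probs, k=1)[0]
        for row in img:
            del row[j]
        del sal[j]
    return img
```

Each call to `rng.choices` is one random sample from the deformation PMF; the enlarge case would duplicate the sampled column instead of deleting it.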
Unconstrained face recognition remains a challenging computer vision problem despite recent exceptionally high results (∼ 95% accuracy) on the current gold standard evaluation dataset: Labeled Faces in the Wild (LFW) (Huang et al., 2008; Chen et al., 2013). We offer a decomposition of the unconstrained problem into subtasks based on the idea that invariance to identity-preserving transformations is...
In this paper we present a novel method for the automatic analysis of mobile eye-tracking data in natural environments. Mobile eye-trackers generate large amounts of data, making manual analysis very time-consuming. Available solutions, such as marker-based analysis, minimize manual labour but require experimental control, making real-life experiments practically infeasible. We present a novel...
Public speaking is a non-trivial task, since its success is affected by how nonverbal behaviors are expressed. Practicing the appropriate expressions is difficult because they are mostly produced subconsciously. This paper presents our empirical study on the nonverbal behaviors of presenters. This information was used as the ground truth to develop an intelligent tutoring system. The system can capture...
In this paper we present a novel system to extract keyframes, shot clusters and structural storyboards for video content description, which can be used for a variety of summarization, visualization, classification, indexing and retrieval applications. The system automatically selects an appealing set of keyframes and creates meaningful clusters of shots. It further identifies sections that appear...
Movie summarization aims at condensing a full-length movie to a significantly shortened version that still preserves the movie's major semantic content. In this paper, we propose a learning-based movie summarization framework via role-community social network analysis and feature fusion. In our framework, scene-based movie summarization is formulated as a 0–1 knapsack problem, where the scene attention...
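The 0–1 knapsack formulation mentioned in this abstract can be sketched with standard dynamic programming; here scene durations act as item weights, scene attention scores as item values, and the target summary length as the knapsack capacity. The function name, the integer-second durations, and the score semantics are illustrative assumptions, not the paper's notation:

```python
def select_scenes(durations, attention, budget):
    """0-1 knapsack: pick scenes maximizing total attention score
    subject to a total-duration budget (capacity, in seconds).
    Returns (best_score, sorted indices of chosen scenes)."""
    n = len(durations)
    # dp[w] = best attention score achievable with total duration <= w
    dp = [0.0] * (budget + 1)
    keep = [[False] * (budget + 1) for _ in range(n)]
    for i in range(n):
        # Iterate capacities downward so each scene is used at most once.
        for w in range(budget, durations[i] - 1, -1):
            cand = dp[w - durations[i]] + attention[i]
            if cand > dp[w]:
                dp[w] = cand
                keep[i][w] = True
    # Backtrack to recover which scenes made up the optimum.
    chosen, w = [], budget
    for i in range(n - 1, -1, -1):
        if keep[i][w]:
            chosen.append(i)
            w -= durations[i]
    return dp[budget], sorted(chosen)
```

For example, with durations [3, 4, 5], attention scores [4.0, 5.0, 6.0], and a 7-second budget, the optimum keeps the first two scenes.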
This document presents a study of the potential of head movement for lie detection. The potential was analyzed using a non-invasive technique that detects head movement from video. Since the literature contains a great deal of information on lie indicators, we provide a short review of them. An application was built to detect head movement and head position by performing...
The study investigates gender disparity in the perception of video face replacement. It is inspired by the face replacement techniques prevalent in the film-making industry, as well as the gender disparity that exists in various aspects of human life. A user study was conducted that included a quality-rating task on face replacement videos. Results show that there is a significant difference...
Face More, a cloud-based face beautification platform for intelligent face manipulation, is developed in this work. It provides a flexible and efficient cloud API for developing automatic or interactive face retouching applications. A website, www.facemore.net, is built on Face More, where users can upload images and obtain various online face beautification services. To obtain automatic inhomogeneous editing...
Emoticons are often used in short messages to briefly describe actions, feelings, and so on. They can also convey sentimental intent that is difficult to express in language alone. Recently, sentiment analysis has focused on cases such as elections and economic markets. Considering emoticons is also useful in such cases, and as a first step, emoticon extraction from text is...
In this paper we propose synchronization rules between acoustic and visual laughter synthesis systems. Previous work has addressed acoustic and visual laughter synthesis separately, each following an HMM-based approach. The need for synchronization rules comes from the constraint that, in laughter, HMM-based synthesis cannot be performed with a unified system in which common transcriptions may be used...
Depression is a common mood disorder that affects people mentally and even physically. People suffering from depression often exhibit abnormalities in visual behavior and in the voice. In this paper, an audio-visual multimodal depression scale prediction system is proposed. First, features extracted from video and audio are fused at the feature level to represent audio-visual behavior....
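The feature-level (early) fusion step described in this abstract can be sketched as follows. The per-modality z-score normalization is a common choice added here as an assumption (so neither modality dominates the fused vector); the function names and the row-per-sample feature format are likewise illustrative, not the paper's pipeline:

```python
def zscore(rows):
    """Column-wise z-score normalization of a list of feature vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [max((sum((r[j] - means[j]) ** 2 for r in rows) / n) ** 0.5, 1e-8)
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def early_fusion(video_feats, audio_feats):
    """Feature-level fusion: normalize each modality separately, then
    concatenate the per-sample vectors into one audio-visual descriptor."""
    assert len(video_feats) == len(audio_feats), "modalities must be aligned"
    v, a = zscore(video_feats), zscore(audio_feats)
    return [vi + ai for vi, ai in zip(v, a)]
```

The fused descriptors would then feed a single regressor that predicts the depression scale score.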
We give an overview of engagement in human-agent interaction. We discuss the different definitions of engagement in human and social science, specify how they relate to certain other concepts, and give an overview of the high level behaviour that is often associated with engagement. This work serves to position our future research on engagement in human-agent interaction.
In this paper, we study the perception of intensity incongruence between the auditory and visual modalities of synthesized expressions of laughter. In particular, we investigate whether incongruent expressions are perceived as 1) regulated, and 2) unsuccessful in terms of animation synthesis. For this purpose, we conducted a perceptual study using a virtual agent. Congruent and incongruent multimodal...
How can affective information be decoded efficiently when computational resources and sensor systems are limited? This paper presents a framework for the analysis of affective behavior starting from a reduced amount of visual information related to human upper-body movements. The main goal is to identify a minimal representation of emotional displays based on non-verbal gesture features. The GEMEP (Geneva...
We propose an autism spectrum disorder (ASD) prediction system based on machine learning techniques. Our work features the novel development and application of machine learning methods over traditional ASD evaluation protocols. Specifically, we are interested in discovering the latent patterns that may indicate symptoms of ASD underlying the observed eye movements. A group of subjects...
In this paper, we address the problem of automatically detecting engagement in multi-party Human-Robot Interaction scenarios. The aim is to investigate to what extent we are able to infer the engagement of one entity in a group based solely on the cues of the other entities present in the interaction. In a scenario featuring three entities, two participants and a robot, we extract behavioural...
It is easy for human beings to discern whether an observed acoustic signal is direct speech, reflected speech, or noise simply by listening. Relying purely on acoustic cues is enough for humans to discriminate between these kinds of sound sources, but this is not straightforward for machines. A robot equipped with a current robot audition mechanism will, in most cases, fail to differentiate...
Infants' visual recognition abilities are typically studied using variations of preferential looking paradigms. In this broad class of tasks, the extent to which infants discriminate between, categorize, and recognize complex images is determined by which of two test images they prefer to look at. This preference is usually expressed by calculating the proportion of total looking time allocated to...
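The proportion-of-looking-time measure described in this abstract is simple arithmetic; a minimal sketch follows. The novelty-preference framing, the per-trial `(novel, familiar)` pair format, and the function name are illustrative assumptions about one common variant of the paradigm:

```python
def novelty_preference(trials):
    """trials: list of (novel_seconds, familiar_seconds) looking times,
    one pair per test trial. Returns the mean proportion of total looking
    time allocated to the novel image; values reliably above 0.5 are
    taken as evidence that the infant recognizes the familiar image."""
    props = [novel / (novel + familiar) for novel, familiar in trials]
    return sum(props) / len(props)
```

For example, trials of (6 s novel, 2 s familiar) and (3 s, 3 s) give a mean novelty preference of 0.625.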
Infants' processing of adult social cues develops late in the first year. Sensitivity before 6 months is limited to nonspecific motion-cuing by lateral eye movements. Results from naturalistic and experimental studies show that learning is sensitive to factors including target location, target salience, gaze-cue salience, and the presence of distractors or non-gaze social cues. Those results are consistent...