This paper proposes three simple, compact yet effective representations of depth sequences, referred to respectively as Dynamic Depth Images (DDI), Dynamic Depth Normal Images (DDNI) and Dynamic Depth Motion Normal Images (DDMNI). These dynamic images are constructed from a sequence of depth maps using bidirectional rank pooling to effectively capture the spatial-temporal information. Such image-based...
This paper addresses the problem of continuous gesture recognition from sequences of depth maps using Convolutional Neural Networks (ConvNets). The proposed method first segments individual gestures from a depth sequence based on quantity of movement (QOM). For each segmented gesture, an Improved Depth Motion Map (IDMM), which converts the depth sequence into one image, is constructed and fed to a...
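The QOM-based segmentation step mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes QOM is the count of pixels whose depth differs from a reference frame (here the first frame) by at least a threshold, and that a gesture is a contiguous run of frames whose QOM exceeds a second threshold. The function names and both threshold values are hypothetical.

```python
import numpy as np


def quantity_of_movement(depth_seq, ref_index=0, pixel_thresh=10):
    """Per-frame count of pixels that differ from the reference frame.

    Assumption: the reference frame shows the neutral pose, and
    pixel_thresh (hypothetical value) is in the same units as the depth maps.
    """
    seq = np.asarray(depth_seq, dtype=np.float64)
    ref = seq[ref_index]
    moved = np.abs(seq - ref) >= pixel_thresh
    # Flatten each frame and count the moved pixels.
    return moved.reshape(len(seq), -1).sum(axis=1)


def segment_gestures(qom, qom_thresh):
    """Return (start, end) frame indices of runs where QOM > qom_thresh."""
    active = np.asarray(qom) > qom_thresh
    segments = []
    start = None
    for t, is_active in enumerate(active):
        if is_active and start is None:
            start = t  # a gesture begins
        elif not is_active and start is not None:
            segments.append((start, t - 1))  # the gesture ended on the previous frame
            start = None
    if start is not None:
        # The sequence ended while a gesture was still in progress.
        segments.append((start, len(active) - 1))
    return segments
```

In practice the thresholds would need tuning on the dataset at hand, and the per-frame QOM curve is often smoothed before thresholding.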
Human gesture recognition is one of the central research fields of computer vision, and effective gesture recognition remains challenging. In this paper, we present a pyramidal 3D convolutional network framework for large-scale isolated human gesture recognition. 3D convolutional networks are utilized to learn the spatiotemporal features from gesture video files. Pyramid input is proposed...
Gesture recognition has attracted attention in computer vision owing to its many applications. However, video-based large-scale gesture recognition still faces many challenges, since factors such as the background may reduce accuracy. To achieve gesture recognition with large-scale videos, we propose a method based on RGB-D data. To learn gesture details better, the inputs are expanded into 32-frame...
In this paper, we tackle the continuous gesture recognition problem with a two-stream Recurrent Neural Network (2S-RNN) for RGB-D data input. In our framework, the spotting-recognition strategy is used; that is, the continuous gestures are first segmented into separate gestures, and then each isolated gesture is recognized using the 2S-RNN. Concretely, the gesture segmentation is based...
In this paper, we focus on describing the method we designed for automatic perceived personality prediction. We present a simple model that uses three different sets of features: nonverbal audio cues, visual cues from video, and facial landmark points. The model uses a random decision forest to do regression from the extracted features. As we discuss in Section 4, this multimodal model performs relatively...
Affective computing, particularly emotion and personality trait recognition, is of increasing interest in many research disciplines. The interplay of emotion and personality shows itself in the first impression left on other people. Moreover, the ambient information, e.g. the environment and objects surrounding the subject, also affects these impressions. In this work, we employ pre-trained Deep Convolutional...
In this paper, we propose using 3D Convolutional Neural Networks for large scale user-independent continuous gesture recognition. We have trained an end-to-end deep network for continuous gesture recognition (jointly learning both the feature representation and the classifier). The network performs three-dimensional (i.e. space-time) convolutions to extract features related to both the appearance...
The task of the ChaLearn Apparent Personality Analysis: First Impressions Challenge is to rate/quantify personality traits of users in short video sequences. Although the validity of personality judgments from short interactions is questionable, studies show the possibility of predicting attributed traits (First Impressions) using facial [15] and acoustic [13] features. The challenge introduces a...
This paper presents a novel multimodal emotion recognition system based on the analysis of audio and visual cues. MFCC-based features are extracted from the audio channel and facial landmark geometric relations are computed from visual data. Both sets of features are learnt separately using state-of-the-art classifiers. In addition, we summarise each emotion video into a reduced set...
As different staining patterns of HEp-2 cells indicate different diseases, the classification of Indirect Immunofluorescence (IIF) images of Human Epithelial-2 (HEp-2) cells is important for clinical applications. Unlike traditional pattern recognition techniques, we use a CNN to extract more high-level features for cell image classification. Compared to the existing CNN-based HEp-2 classification...
Human Epithelial type-2 (HEp-2) cells are used as substrates for the detection of Anti Nuclear Antibodies (ANA) in the Indirect Immunofluorescence (IIF) test to diagnose autoimmune diseases. Pathologists in the laboratory examine the IIF slides to detect and recognize the HEp-2 cell patterns to generate the report. So, the IIF test is subjective and requires objective analysis. This paper introduces...
A reliable automatic system for Human Epithelial-2 (HEp-2) cell image classification can facilitate the diagnosis of systemic autoimmune diseases. In this paper, an automatic pattern recognition system using a fully convolutional network (FCN) is proposed to address the HEp-2 specimen classification problem. The FCN in the proposed framework was adapted from VGG-16, which was trained with the ICPR 2016 dataset...
This paper summarizes the proposal submitted by the joint team formed by researchers from UPV and ULPGC to the Mobile Iris CHallenge Evaluation II. The approach uses a state-of-the-art iris segmentation technique and then extracts features using local descriptors. Those suitable to the problem are selected after evaluating a collection of 15 local descriptors, covering a range of...
3D-point set registration is an active area of research in computer vision. In recent years, probabilistic registration approaches have demonstrated superior performance for many challenging applications. Generally, these probabilistic approaches rely on the spatial distribution of the 3D-points, and only recently has color information been integrated into such a framework, significantly improving...
The intrinsic interactions among a video's emotion tag, its content, and a user's spontaneous response while consuming the video can be leveraged to improve video emotion tagging, but this capability has not been thoroughly exploited yet. In this paper, we propose an implicit hybrid video emotion tagging approach by integrating video content and users' multiple physiological responses, which are only...
Current research on emotion recognition from electroencephalogram (EEG) signals rarely considers common patterns embodied in multiple subjects and individual patterns for each subject simultaneously. Therefore, in this paper, we propose a novel emotion recognition approach using subjects or subject groups as privileged information, which is only available during training. First, five frequency features...
Although emotional state recognition from voice has been extensively studied, little effort has focused on online emotion recognition. Since the duration and intensity of emotional experiences change over time, it is hard to employ existing static transition models while monitoring emotional states, especially in an online setting. To overcome this difficulty, we introduce a method which incorporates...
Face alignment is an important issue in many computer vision problems. The key problem is to find the nonlinear mapping from face images or features to landmark locations. In this paper, we propose a novel cascaded approach with bidirectional Long Short-Term Memory (LSTM) neural networks to approximate this nonlinear mapping. The cascaded structure is used to reduce the complexity of this problem and...
Human gait is an important biometric feature for person identification in surveillance videos because it can be collected at a distance without subject cooperation. Most existing gait recognition methods are based on the Gait Energy Image (GEI). Although the spatial information in one gait sequence can be well represented by the GEI, the temporal information is lost. To solve this problem, we propose a new...
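To illustrate why the GEI discards temporal information, here is a minimal sketch of its usual definition: the pixel-wise average of size-normalized, aligned binary silhouettes over a gait cycle. The function name is hypothetical, and the sketch assumes the silhouettes have already been extracted and aligned; it does not represent the method proposed in the abstract above.

```python
import numpy as np


def gait_energy_image(silhouettes):
    """Average aligned binary silhouettes of shape (T, H, W) into one (H, W) image.

    Assumption: every frame is already cropped, scaled, and centered,
    with nonzero pixels marking the subject.
    """
    frames = np.asarray(silhouettes, dtype=np.float64)
    if frames.ndim != 3 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty stack of 2D silhouettes")
    # Each output pixel is the fraction of frames in which it was foreground.
    return (frames > 0).mean(axis=0)
```

Because the average is invariant to the order of the frames, shuffling the sequence yields the same image, which is exactly the loss of temporal information the abstract refers to.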