This study aims to determine how the visual character of wayang kulit (shadow puppetry) is represented in wayang-themed games in Indonesia. The research applies visual language theory to case studies of two wayang-themed games, "Mahabharat Warrior" and "Kurusetra". Visual language theory is used to determine how the visual form of wayang is represented using the Space-Time-Plane...
With over one hundred thousand game apps available, it is important to make a mobile game stand out. The most basic way to do so is to grab people's attention through the icon. It is the first thing people see in the app store, and it represents the face of the game. A mobile game app icon must communicate the right impression and the right message. This research studies the icons of top-grossing mobile game apps...
In this paper, we address the 3D eye gaze estimation problem using a low-cost, simple-setup, and non-intrusive consumer depth sensor (the Kinect sensor). We present an effective and accurate method based on a 3D eye model to estimate a subject's point of gaze while tolerating free head movement. To determine the parameters involved in the proposed eye model, we propose i) an improved convolution-based...
This paper proposes a lightweight deep model to recognize age and gender from a face image. Though simple, our network architecture completes the two tasks effectively and efficiently. Moreover, unlike existing methods, we perform the age and gender recognition tasks simultaneously via a joint regression model. Specifically, our model employs a multi-task learning scheme to learn...
Judgments about personality based on facial appearance strongly affect social decision making, and are known to influence outcomes ranging from presidential elections to jury decisions. Recent work has shown that it is possible to predict perceived memorability, trustworthiness, intelligence, and other attributes in human face images. The most successful of these approaches require face images...
Real-time detection of a speaker and the speaker's location is a challenging task, which is usually addressed by processing acoustic and visual information. However, it is well known that when a person speaks, lip and head movements can also be used to detect the speaker and their location. This paper proposes a speaker detection system using visual prosody information (e.g. head and lip movements) in...
Affective computing, particularly emotion and personality trait recognition, is of increasing interest in many research disciplines. The interplay of emotion and personality shows itself in the first impression left on other people. Moreover, ambient information, e.g. the environment and objects surrounding the subject, also affects these impressions. In this work, we employ pre-trained Deep Convolutional...
In this paper we present a novel audio-visual (AV) person identification system based on joint sparse representation. The video features were vectorized raw pixel values, while i-vectors served as the audio features. Classification is performed by solving the joint sparsity optimization problem, and fusion is carried out using the quality (confidence) assigned to each matcher. Our experimental...
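The abstract above does not spell out its joint sparsity optimization. As a minimal, single-modality illustration of the underlying idea, sparse representation classification typically solves an l1-regularized least-squares problem, for instance with iterative soft-thresholding (ISTA). The dictionary, signal, and parameters below are illustrative assumptions, not the paper's actual setup:

```python
import numpy as np

def ista(D, x, lam=0.1, n_iter=500):
    """Solve min_a 0.5*||x - D a||^2 + lam*||a||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad_step = a + D.T @ (x - D @ a) / L
        # soft-threshold: shrink each coefficient toward zero by lam/L
        a = np.sign(grad_step) * np.maximum(np.abs(grad_step) - lam / L, 0.0)
    return a

# Hypothetical usage: recover a 3-sparse code over a random unit-norm dictionary.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(128)
a_true[[5, 40, 99]] = [1.5, -2.0, 1.0]
x = D @ a_true
a_hat = ista(D, x, lam=0.01, n_iter=1000)
```

In a sparse-representation classifier, the class whose dictionary atoms carry the largest recovered coefficients wins; the joint (multimodal) variant couples the supports of the audio and video codes.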
In this paper we present a novel multimodal emotion recognition system based on the analysis of audio and visual cues. MFCC-based features are extracted from the audio channel, and facial-landmark geometric relations are computed from the visual data. Both sets of features are learnt separately using state-of-the-art classifiers. In addition, we summarise each emotion video into a reduced set...
Most developments in speech-based automatic recognition have relied on acoustic speech as the sole input signal, disregarding its visual counterpart. However, recognition based on acoustic speech alone can suffer from deficiencies that prevent its use in many real-world applications, particularly under adverse conditions. This paper aims to build a connected-words audio-visual speech recognition...
Facial expression recognition in complex environments has been one of the difficult tasks in visual recognition in recent years. This paper introduces a visual saliency mechanism and designs an automatic search for the face region in an image. By using the narrow-band C-V model to evolve a curve, the proposed scheme obtains an accurate face region. Meanwhile, an SVM is trained on a standard database...
Face and mouth localization are vital phases of visual speech recognition. These tasks refer to detecting the face and mouth regions within viseme images. The main problems in face and mouth localization are image constraints such as rotation, homogeneous color intensity within the image, and parts of the object that cannot be detected. This paper...
We tackle the problem of reducing the false positive rate of face detectors by applying a classifier after the detection step. We first define and study this post-classification problem. To this end, we consider the multiple-stage cascade structure, which is the most common face detection architecture. Here, each cascade stage aims to solve a binary classification problem, denoted the Face/non-Face...
The paper deals with a cloud-based solution and our experience using Microsoft Azure Emotion Assessment Software as a Service, embedded in an application for teachers. The project also involved testing the assessment on a selected database, and the accuracy of this process was calculated. The paper reports on problems related to camera lighting and the resolution of audience recordings. Testing the Azure platform...
We present the first demonstration of end-to-end, far-to-near situated interaction between an uninstrumented human user and an initially distant outdoor autonomous Unmanned Aerial Vehicle (UAV). The user makes an arm-waving gesture to attract the UAV's attention from a distance. Once this signal is detected, the UAV approaches the user using appearance-based tracking until it is close enough...
This workshop shares experiences from three years of educational practice with technologies used in English courses at the Technological Institute of Costa Rica, known by its Spanish abbreviation, ITCR. Three cases are presented that use technological tools such as Facebook, Google Plus, and blogs. Although courses at the ITCR are face-to-face, some technological components...
In this paper we introduce CoBoard Flowers, an audio-visual public installation for the Prague Spring International Music Festival. An interactive visualization with floral motifs is displayed and controlled by face detection technology, enabled by a computer vision algorithm. The visualization is controlled by the estimated position, distance, gender, and age of the participants. The visualization...
This paper presents a multi-channel, multi-speaker 3D audio-visual corpus for Mandarin continuous speech recognition and other fields, such as speech visualization and speech synthesis. The corpus consists of 24 speakers and about 18k utterances, about 20 hours in total. For each utterance, the audio streams were recorded by two professional microphones in the near field and far field, respectively, while...
In higher-education institutions, enhancing the educational process and achieving a high quality of education helps decision makers better manage institutional resources. Jordanian universities have applied for national and international higher-education accreditation, for which one of the evaluation criteria depends on students' performance and the course delivery process. Quality of...
Audiovisual speech synchrony detection is an important liveness check for talking-face verification systems, ensuring that the (pre-defined) content and timing of the given audible and visual speech samples match. Nowadays, there are virtually no technical limitations to combining transferable facial animation with voice conversion (or synthesis) to create an ultimate audiovisual artifact that...
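Synchrony checks of this kind are often approximated by cross-correlating a per-frame audio-energy trace against a lip-opening trace and inspecting the peak lag. The sketch below is a minimal numpy illustration under that assumption; the traces, frame rate, and threshold are synthetic stand-ins, not the paper's pipeline:

```python
import numpy as np

def av_sync_lag(audio_energy, lip_opening):
    """Estimate the lag (in frames) at which the audio-energy trace best aligns
    with the lip-opening trace, via normalized cross-correlation.
    Positive lag means the audio trails the video; negative means it leads."""
    a = (audio_energy - audio_energy.mean()) / (audio_energy.std() + 1e-9)
    v = (lip_opening - lip_opening.mean()) / (lip_opening.std() + 1e-9)
    corr = np.correlate(a, v, mode="full")
    lag = int(corr.argmax()) - (len(v) - 1)  # index len(v)-1 is zero lag
    peak = corr.max() / len(a)               # peak correlation, roughly in [0, 1]
    return lag, peak

# Synthetic check: the same trace, with the audio shifted to lead by 5 frames.
rng = np.random.default_rng(1)
s = rng.standard_normal(205)
lip = s[:200]
audio = s[5:]                                # audio leads the video by 5 frames
lag, peak = av_sync_lag(audio, lip)
```

A liveness check would then accept the sample only if the peak correlation is high and the lag falls within a small tolerance around zero; a replayed or synthesized track typically fails one of the two.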