The deployment of deep neural network models on mobile or embedded devices has been hindered by their large number of weights. In this work, we develop a deep neural network (DNN) model compression service termed MicroBrain to reduce resource usage for energy-efficient visual inference. By automatically analyzing the trained DNN models, we propose a high-performance DNN model compression...
As a new biometric, the electroencephalogram (EEG) signal has the advantages of invisibility, non-clonability, and non-coercion compared to traditional biometrics. However, real-time operation and stability remain difficulties for current EEG-based person authentication systems. In this paper, we design a real-time and stable person authentication system using EEG signals, which are elicited by...
Mobile devices can cause visual discomfort and even eye injuries owing to shaking and vibration. In this study, we aim to develop a real-time visual tracking technique for mobile display stabilization in order to provide users with a comfortable interaction experience and reduce the harm caused by screen vibration. The workflow of this mechanism includes tracking the motion of the mobile device,...
Emotions are related to many different parts of our lives: from the perception of the environment around us to different learning processes and natural communication. Therefore, it is very hard to achieve an automatic emotion recognition system which is adaptable enough to be used in real-world scenarios. This paper proposes the use of a growing and self-organizing affective memory architecture to...
Several regions in the ventral-temporal cortex of the human brain are thought to have representations of specific categories of objects. Furthermore, a distributed network of frontal and parietal brain regions is implicated in attentional control. It is assumed that during visual search, attention-control regions send top-down signals to the target category-selective areas to bias the processing in...
It is well known that speech recognition is a multimodal process which uses information not only from audio but also from vision. This paper describes our experience designing an audio-visual speech recognition system, which relates acoustic and visual information in order to improve the noise robustness of automatic speech recognition. The accuracy rate for face and mouth detection using Viola-Jones...
New pedagogical methods delivered through mobile mixed reality (via a user-supplied mobile phone incorporating 3D printing and augmented reality) are becoming possible in distance education, shifting pedagogy from 2D images, words and videos to interactive simulations and immersive mobile skill training environments. This paper presents insights from the implementation and testing of a mobile mixed...
In this paper we study the measurability and variability of manually annotated characteristic descriptors on a forensically relevant face dataset. Characteristic descriptors are facial features (landmarks, shapes, etc.) that can be used during forensic case work. With respect to measurability, we observe that a significant proportion cannot be determined in images representative of forensic case work...
Face Super Resolution (FSR) infers High Resolution (HR) facial images from given Low Resolution (LR) ones with the assistance of LR and HR training pairs. Among existing methods, local patch based methods are superior in visual and objective quality to global methods. These local patch based methods are based on the consistency assumption that the neighbors in HR/LR space form similar local...
Stochastic neighbor embedding (SNE) aims to transform observations in a high-dimensional space into a low-dimensional space that preserves neighbor identities, by minimizing the Kullback-Leibler divergence between the pairwise distributions of the two spaces, where Gaussian distributions are assumed. Data visualization can be improved by adopting t-SNE, where the Student t distribution is used in the...
The performance of local descriptors such as SIFT drops under severe illumination changes. In this paper, we propose a Discriminative and Contrast Invertible (DCI) local feature descriptor. In order to increase the discriminative ability of the descriptor under illumination changes, a Laplace gradient based histogram is proposed. Moreover, a robust contrast flipping estimate is proposed based on the...
Speechreading is a notoriously difficult task for humans to perform. In this paper we present an end-to-end model based on a convolutional neural network (CNN) for generating an intelligible acoustic speech signal from silent video frames of a speaking person. The proposed CNN generates sound features for each frame based on its neighboring frames. Waveforms are then synthesized from the learned speech...
Long videos captured by consumers are typically tied to some of the most important moments of their lives, yet ironically are often the least frequently watched. The time required to initially retrieve and watch sections can be daunting. In this work we propose novel techniques for summarizing and annotating long videos. Existing video summarization techniques focus exclusively on identifying keyframes...
This paper addresses the problems faced by visually impaired people. We designed a device to help visually impaired people handle problems in their environment. They face difficulties in accessing public transport independently, since they cannot read the route number and are unsure about the physical location of the bus, as well as in identifying people, and they also find difficulty...
Due to various factors such as expression, pose, illumination and accessory variation, a human face appears different on different occasions. Determining the efficiency of different face recognition algorithms requires benchmark face images. In this paper, we present a comprehensive study of the available 2D face databases and also introduce the creation of a visual face database, Xinjiang...
A face recognition system is used for the identification and verification of a face from a video or digital image. In the first phase, the Viola-Jones algorithm is used to detect and crop the face region automatically from an image or video frame. The second phase is to recognize the face of a person; in our proposed method, the Bag of Words technique is used to extract features from an image, which uses SURF for interest...
The link between object perception and neural activity in visual cortical areas is a problem of fundamental importance in neuroscience. We measured brain surface physiology with implanted electrocorticography (ECoG) electrodes in humans. Physiological responses to visual stimuli in object-specific ventral temporal loci are highly polymorphic in different cortical loci, for both broadband and raw potential...
Autism spectrum disorder (ASD) is one of the most common childhood developmental disorders. Early detection and intervention for ASD are critical for increasing child success. In the past decade, utilizing the abnormal eye gaze characteristics of children with autism in regard to certain visual stimuli has emerged as a screening approach due to its cost-efficiency and promising accuracy. However,...
This paper presents the integration of diverse modules for fallen-person detection by a mobile service robot. This integration has been achieved in the ROS (Robot Operating System) middleware. The proposed implementation is arranged over a modular architecture of three layers: Hardware, Processing and Decision. The implemented modules are on the processing layer. The first module uses...
The scattered multimedia data, such as text, image, audio, and video, posted regularly on social media may contain useful information for organizations. However, this information must be derived through a form of analysis known as Multimodal Sentiment Analysis (MSA), and there is a lack of proper analytic tools for such analysis. This paper presents a thorough overview of more...