Computational visual attention models aim to emulate the performance of the Human Visual System in selecting relevant features for efficient visual scene processing. As a result, visual saliency maps highlight relevant visual patterns in an image, possibly associated with objects or specific concepts. In the analysis of medical images, this allows the radiologist or clinical expert to focus the attention...
Presentations are an effective way to convey the outline of a work to an audience; here the framework is intended to generate presentation slides automatically from a PDF document. The generated slides can be used as draft slides. From these drafts, presenters can prepare their final slides easily, which saves the presenter's time. To make presentation slides automatically, the framework...
In this paper, we present a real-time method for visual categorization applied to robot grasping. We describe an object database with SURF feature points, which we quantize with the k-means clustering algorithm to form visual words. Then, we train a Support Vector Machine classifier that takes as input the distribution of the bag of features extracted earlier. Likewise, we do object recognition using the SVM...
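The bag-of-features step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a codebook of cluster centres has already been obtained (for example by k-means) and only shows how descriptors are assigned to their nearest visual word and counted. The function names `nearest_word` and `bag_of_features` and the 2-D toy data are hypothetical and are not taken from the paper.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def nearest_word(descriptor, codebook):
    """Return the index of the codebook centre closest to the descriptor."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))


def bag_of_features(descriptors, codebook):
    """Build a normalised visual-word histogram for one image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts)
    # An image without descriptors yields an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical 2-D toy data; real SURF descriptors have 64 or 128 dimensions.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
histogram = bag_of_features(descriptors, codebook)
print(histogram)
```

A histogram of this kind, one per image, is the sort of fixed-length vector that a classifier such as an SVM could be trained on.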
Road planning and traffic monitoring are conducted based on surveys of traffic volume. In recent years, many researchers have developed vision- and audio-based techniques for the detection and classification of moving vehicles. Audio-based techniques suffer from low accuracy but have low computational cost. The visual approach, by contrast, has significantly higher accuracy but demands high computational...
Today, great focus has been placed on context-aware human-machine interaction, where systems are aware not only of the surrounding environment, but also of the mental/affective state of the user. Such knowledge can allow the interaction to become more human-like. To this end, automatic discrimination between laughter and speech has emerged as an interesting, yet challenging problem. Typically,...
In this paper, we propose to learn object representations by inference from temporal correlation in videos to achieve effective visual tracking. Unlike traditional methods, which perform feature learning either at the image level or based on intuitive temporal constraints, we employ a recurrent network with Long Short-Term Memory (LSTM) units to directly learn temporally correlated representations of...
We consider the use of transfer learning, via deep Convolutional Neural Networks (CNN), for the image classification problem posed within the context of X-ray baggage security screening. A deep multi-layer CNN approach traditionally requires large amounts of training data in order to facilitate construction of a complex complete end-to-end feature extraction, representation...
The objective of this paper is the fully automated visual identification of individual Holstein Friesian cattle from dorsal RGB-D imagery taken in real-world farm environments. Autonomous and non-intrusive cattle identification could provide an essential tool for economically-viable machinised farming analytics, social monitoring, cattle traceability, food production management and more. We contribute...
In this paper, we study the problem of scene classification for RGB-D images. First, we analyze the difference between the RGB and depth images. Then, based on this difference, we implement an efficient method that makes use of both the RGB and depth images and fuses the RGB and depth features well. Focusing on the difference of modality between the RGB and depth images, we propose a method...
“Ceci n'est pas une pipe” is French for “This is not a pipe”. This is the description painted on the first painting in the figure above. To most of us, how could this painting not be a pipe? Yet it is not, at least not to the great Belgian surrealist artist René Magritte. He said that the painting is not a pipe, but rather an image of a pipe. In this paper, we present a study on large-scale classification of fine-art...
We propose a novel model to predict human visual attention when free-viewing webpages. Compared with natural images, webpages are usually full of salient regions such as logos, text, and faces, while only a few of these attract human attention at a brief glance. Moreover, webpages elicit distinct viewing patterns that are quite different from those of natural images. In this paper, we introduce multi-features...
This paper presents a novel method of fixation identification for mobile eye trackers. The most significant benefit of our method over the state-of-the-art is that it achieves high accuracy for low-sample-rate devices worn during locomotion. This in turn delivers higher quality datasets for further use in human behaviour research, robotics and the development of guidance aids for the visually impaired...
Curators, art historians, and connoisseurs are often interested in determining the authorship of paintings. Machine learning and image processing techniques can assist in this task by providing non-invasive, automatic, and objective methods. In this work, we study the automatic identification of Vincent van Gogh's paintings using a Convolutional Neural Network that extracts discriminative visual patterns...
We present a novel video representation for human action recognition by considering temporal sequences of visual words. Based on state-of-the-art dense trajectories, we introduce temporal bundles of dominant, that is most frequent, visual words. These are employed to construct a complementary action representation of ordered dominant visual word sequences, that additionally incorporates fine grained...
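The idea of ordered dominant visual words in the abstract above can be sketched as follows. Because the abstract is truncated, this is an illustration under stated assumptions and not the authors' method: it assumes a video is already given as a sequence of visual-word ids over time and that a temporal bundle is a fixed-length, non-overlapping window. The function name `dominant_word_sequence`, the `bundle_size` parameter, and the sample ids are hypothetical.

```python
from collections import Counter


def dominant_word_sequence(frame_words, bundle_size):
    """Return the most frequent visual word of each consecutive temporal bundle."""
    if bundle_size <= 0:
        raise ValueError("bundle_size must be positive")
    sequence = []
    for start in range(0, len(frame_words), bundle_size):
        # Assumption: a bundle is a fixed-length, non-overlapping window.
        bundle = frame_words[start:start + bundle_size]
        word, _count = Counter(bundle).most_common(1)[0]
        sequence.append(word)
    return sequence


# Hypothetical visual-word ids observed over nine time steps.
frame_words = [3, 3, 7, 7, 7, 2, 5, 5, 5]
ordered = dominant_word_sequence(frame_words, bundle_size=3)
print(ordered)
```

Keeping the dominant words in temporal order preserves sequence information that an unordered histogram of visual words discards.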
One of the challenges for real-world image-based surface defect classification tasks is the lack of labeled training samples from which to extract useful features to confidently classify defects. In this paper, we present the results of our investigation into whether features derived from OverFeat, a variant of the Convolutional Neural Network, can be used directly for image-based surface defect classification tasks. We show...
In complex visual recognition systems, feature fusion has become crucial to discriminate between a large number of classes. In particular, fusing high-level context information with image appearance models can be effective in object/scene recognition. To this end, we develop an auto-context modeling approach under the RKHS (Reproducing Kernel Hilbert Space) setting, wherein a series of supervised...
Smile detection in the wild is an interesting and challenging problem. This paper presents an efficient approach with a hierarchical visual feature to handle this problem. In our approach, multi-scale, multi-orientation Gabor filters are first applied to extract facial textures, namely Gabor faces, from the input face image. After this, Histograms of Oriented Gradients (HOG) are employed to encode...
This paper presents a study on the hand gesture distinguishability of Speeded Up Robust Features (SURF) and Scale Invariant Feature Transform (SIFT) feature descriptors of hand images. A bag of visual words is then used to map these descriptors to a fixed-dimension vector, and a support vector machine (SVM) classifier is trained to recognize hand gestures. Experimental results demonstrate that SURF feature descriptors...
In this paper, we present a classification method based on multi-level brain partitions. A bag-of-visual-words model is used. First, representative SIFT features are extracted from a brain template as the basic visual words. Second, individual MR images are described using the basic visual words, and support vector machine classifiers are trained for the different brain partitions respectively...
Vehicle Logo Recognition (VLR) has been an important field of study in Intelligent Transportation Systems (ITS). This paper proposes to recognize vehicle logos and predict logo attributes by combining a Convolutional Neural Network (CNN) with Multi-Task Learning (MTL). In order to accelerate convergence of the multi-task model, an adaptive weight training strategy is employed. To verify the algorithm, the Xiamen...