Recognising detailed clothing characteristics (fine-grained attributes) in unconstrained images of people in-the-wild is a challenging task for computer vision, especially when there is only limited training data from the wild whilst most data available for model learning are captured in well-controlled environments using fashion models (well lit, no background clutter, frontal view, high-resolution)...
We propose to leverage concept-level representations for complex event recognition in photographs given limited training examples. We introduce a novel framework to discover event concept attributes from the web and use them to extract semantic features from images and classify them into social event categories with few training examples. Discovered concepts include a variety of objects, scenes, actions...
In this work, we decompose a first-person action into verb and noun. We then study how the coupling of an action's constituent verb and noun affects the learners' ability to learn them separately and to combine them to perform recognition. We compare different information fusion methods on conventional action recognition and zero-shot learning, of which the latter is a strong indication of the feature's...
Deep Convolutional Neural Networks (DCNNs) are powerful models for different visual pattern classification problems. Many works in this field use image augmentation at the training phase to achieve better accuracy. This paper presents blocky artifacts as an augmentation technique to increase the accuracy of DCNNs for handwritten digit recognition, both English and Bangla digits, i.e.,...
Detecting actions or verbs in still images is a challenging problem for a variety of reasons such as the absence of temporal information and polysemy of verbs which lead to difficulty in generating large verb datasets. In this paper, we propose to first detect the prominent objects in the image and then infer the relevant actions or verbs using Natural Language Processing (NLP)-based techniques. The...
This paper presents a methodology for recognition of handwritten Marathi and English Characters-Numerals using the shape context descriptor. During pre-processing, an algorithm is developed to extract the Marathi and English Characters-Numerals from grid-formatted datasheets. The corresponding sample points around the boundary of a character are computed. This is followed by obtaining the centroid of the...
Fine-grained vehicle recognition is easily affected by small visual changes. Existing recognition methods are not robust to conditions such as illumination and weather changes, and the accuracy of vehicle recognition in complex environments remains unsatisfactory. In this paper, a high-accuracy fine-grained vehicle recognition method using Convolutional Neural Network...
In this paper, we are interested in feature extraction and classification methods for image classification and recognition applications. We report the performance of models trained with various classifier algorithms on the Caltech 101 image categories. For feature extraction, we evaluate the classical SURF technique against global color feature extraction. The purpose of our work...
We propose mutually incoherent pose bases for action recognition in static images, each of which implicitly represents the co-occurrence of poselets. First, action-specific poselets are trained. To suppress detection ambiguity, we cluster poselet activations by the overlap of the torso bounds predicted by each poselet. Then the pose feature of a person performing an action can be extracted, which is a vector composed...
This paper presents fine-tuned CNN features for person re-identification. Recently, features extracted from the top layers of a Convolutional Neural Network (CNN) pre-trained on a large annotated dataset, e.g., ImageNet, have been proven to be strong off-the-shelf descriptors for various recognition tasks. However, the large disparity between the pre-trained task, i.e., ImageNet classification, and the target...
Deep Neural Networks have become increasingly popular due to their efficient realization in GPU hardware. Problems that were once considered computationally intensive to implement using Neural networks have now become possible due to the vast amount of flexibility and capability offered by the GPU and Deep networks combination. In this work, we attempt to improve the recognition rate for images, using...
Deep learning methods are often associated with huge amounts of training data. The deeper the network gets, the greater the need for training data. A large amount of labeled data helps the network learn about the variations it needs to handle in the prediction stage. Not everyone has access to huge amounts of labeled data, leaving only a few with the luxury of designing very deep networks...
Recognition of dominant planes is an important task in areas such as robot navigation, augmented reality, and 3D reconstruction, among others. There are several approaches for recognizing planar structures; however, most of these approaches are based on processing two or more images captured from different camera views or on processing 3D data in the form of point clouds associated with the camera...
In this paper we address the problem of event detection from single images by exploiting both background information, which often contains revealing contextual clues, and details that are salient for recognizing the event. Such details are visual objects critical to understanding the underlying event depicted in the image and were recently defined in the literature as “event-saliency”. Adopting...
In this paper, we propose a novel framework for facial expression recognition, in which face images are taken as vertices in a hypergraph and the task of expression recognition is formulated as a problem of hypergraph-based inference. A hybrid strategy is developed to construct hyperedges: we generate probabilities of facial action units by deep convolutional networks and take each action unit...
In this paper a novel CNN-based approach in the Content Based Image Retrieval domain that exploits supervised learning is proposed. We employ a deep CNN model to derive feature representations from the activations of the deepest layers and we refine the weights of the utilized layers in order to produce better image descriptors using information obtained from the available data labels. To this end,...
In recent years, growing attention has been paid to recognizing text in natural scene images. Scene Character Recognition (SCR) is an important step in automating the process of reading text in natural scenes.
The development of automatic nutrition diaries, which would make it possible to objectively keep track of everything we eat, could enable a whole new world of possibilities for people concerned about their nutrition patterns. With this purpose, in this paper we propose the first method for simultaneous food localization and recognition. Our method is based on two main steps, which consist in, first, producing...
Natural scene text recognition has proved to be challenging due to unconstrained conditions in the wild. In this paper, to solve this problem, we propose a method which first detects and recognizes characters using a high-performance Convolutional Neural Network (CNN). Then for post-processing, inspired by its success in speech recognition, we employ the efficient and flexible Weight Finite State...
In this paper, we report on the use of deep convolutional neural networks (CNNs) for curating and filtering photos posted by social media users (Instagram and Twitter). The final goal is to facilitate searching and discovering user-generated content (UGC) with potential value for digital marketing tasks. The images are captured in real time and automatically annotated with multiple CNNs...