The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Most of computer vision focuses on what is in an image. We propose to train a standalone object-centric context representation to perform the opposite task: seeing what is not there. Given an image, our context model can predict where objects should exist, even when no object instances are present. Combined with object detection results, we can perform a novel vision task: finding where objects are...
A human action can be seen as transitions between ones body poses over time, where the transition depicts a temporal relation between two poses. Recognizing actions thus involves learning a classifier sensitive to these pose transitions as well as to static poses. In this paper, we introduce a novel method called transitions forests, an ensemble of decision trees that both learn to discriminate static...
We propose local binary convolution (LBC), an efficient alternative to convolutional layers in standard convolutional neural networks (CNN). The design principles of LBC are motivated by local binary patterns (LBP). The LBC layer comprises of a set of fixed sparse pre-defined binary convolutional filters that are not updated during the training process, a non-linear activation function and a set of...
We propose a novel crowd counting model that maps a given crowd scene to its density. Crowd analysis is compounded by myriad of factors like inter-occlusion between people due to extreme crowding, high similarity of appearance between people and background elements, and large variability of camera view-points. Current state-of-the art approaches tackle these factors by using multi-scale CNN architectures,...
Over the past few years, softmax and SGD have become a commonly used component and the default training strategy in CNN frameworks, respectively. However, when optimizing CNNs with SGD, the saturation behavior behind softmax always gives us an illusion of training well and then is omitted. In this paper, we first emphasize that the early saturation behavior of softmax will impede the exploration of...
Novelty detection, which aims to determine whether a given data belongs to any category of training data or not, is considered to be an important and challenging problem in areas of Pattern Recognition, Machine Learning, etc. Recently, kernel null space method (KNDA) was reported to have state-of-the-art performance in novelty detection. However, KNDA is hard to scale up because of its high computational...
Image synthesis is a widely researched topic with applications in image analogy, image inpainting and restoration. In this paper we present a synthesis of image patterns technique to effectively reflect the characteristics of real scenery and texture by using two generative ConvNets models and unsupervised joint learning. The first generative net is a bottom-up ConvNet that maps the image to the features...
Part-based image classification aims at representing categories by small sets of learned discriminative parts, upon which an image representation is built. Considered as a promising avenue a decade ago, this direction has been neglected since the advent of deep neural networks. In this context, this paper brings two contributions: first, this work proceeds one step further compared to recent part-based...
We present a principled approach to uncover the structure of visual data by solving a novel deep learning task coined visual permutation learning. The goal of this task is to find the permutation that recovers the structure of data from shuffled versions of it. In the case of natural images, this task boils down to recovering the original image from patches shuffled by an unknown permutation matrix...
Linking two data sources is a basic building block in numerous computer vision problems. Canonical Correlation Analysis (CCA) achieves this by utilizing a linear optimizer in order to maximize the correlation between the two views. Recent work makes use of non-linear models, including deep learning techniques, that optimize the CCA loss in some feature space. In this paper, we introduce a novel, bi-directional...
A major open problem on the road to artificial intelligence is the development of incrementally learning systems that learn about more and more concepts over time from a stream of data. In this work, we introduce a new training strategy, iCaRL, that allows learning in such a class-incremental way: only the training data for a small number of classes has to be present at the same time and new classes...
While convolutional neural networks have gained impressive success recently in solving structured prediction problems such as semantic segmentation, it remains a challenge to differentiate individual object instances in the scene. Instance segmentation is very important in a variety of applications, such as autonomous driving, image captioning, and visual question answering. Techniques that combine...
Nowadays, applications based on digits recognition and characters recognition have become much more reliable thanks to the rapid development of the DNN(deep neural network) architecture and constantly increasing the efficiency to the computing resources. A lot of methods have been proposed to improve the performance of DNNs, such as the ReLU (Rectified Linear Unit) which is a widely used alternative...
We introduce a novel technique for knowledge transfer, where knowledge from a pretrained deep neural network (DNN) is distilled and transferred to another DNN. As the DNN performs a mapping from the input space to the output space through many layers sequentially, we define the distilled knowledge to be transferred in terms of flow between layers, which is calculated by computing the inner product...
The increasing demand for high image quality in mobile devices brings forth the need for better computational enhancement techniques, and image denoising in particular. To this end, we propose a new fully convolutional deep neural network architecture which is simple yet powerful and achieves state-of-the-art performance for additive Gaussian noise removal. Furthermore, we claim that the personal...
In many sports, it is useful to analyse video of an athlete in competition for training purposes. In swimming, stroke rate is a common metric used by coaches; requiring a laborious labelling of each individual stroke. We show that using a Convolutional Neural Network (CNN) we can automatically detect discrete events in continuous video (in this case, swimming strokes). We create a CNN that learns...
This paper is an approach for pedestrian detection and tracking with infrared imagery. The detection phase is performed by AdaBoost algorithm based on Haar-like features. AdaBoost classifier is trained with datasets generated from infrared images. The number of negative images used for training with AdaBoost algorithm is 3000. For positive training, 1000 samples are used After detecting the pedestrian...
Human action recognition from skeletal data is a hot research topic and important in many open domain applications of computer vision, thanks to recently introduced 3D sensors. In the literature, naive methods simply transfer off-the-shelf techniques from video to the skeletal representation. However, the current state-of-the-art is contended between to different paradigms: kernel-based methods and...
In this paper we proposed a 4-stage coarse-to-fine framework to tackle the facial landmark localization problem in-the-wild. In our system, we first predict the landmark key points on a coarse level of granularity, which sets a good initialization for the whole framework. Then we group the key points into several components and refine each component with local patches cropped within them. After that...
Automated affective computing in the wild is a challenging task in the field of computer vision. This paper presents three neural network-based methods proposed for the task of facial affect estimation submitted to the First Affect-in-the-Wild challenge. These methods are based on Inception-ResNet modules redesigned specifically for the task of facial affect estimation. These methods are: Shallow...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.