In this paper we propose an online multi-task learning algorithm for video concept detection. In particular, we extend the Efficient Lifelong Learning Algorithm (ELLA) in the following ways: a) we solve the objective function of ELLA using quadratic programming instead of solving the Lasso problem, b) we add a new label-based constraint that considers concept correlations, c) we use linear SVMs as...
In this paper, we propose a novel scheme for automatic recognition of facial expressions captured from both fronto-parallel and non-fronto-parallel cameras, i.e., multi-view facial expressions (MVFE). The proposed scheme introduces a Local Saliency-inspired Binary Pattern (LSiBP) feature to recognize MVFE. First, a view-specific approximated saliency likelihood map (ASLM) is derived during training of...
This paper presents a new approach to crowd behaviour anomaly detection that uses a set of efficiently computed, easily interpretable, scene-level holistic features. This low-dimensional descriptor combines two features from the literature: crowd collectiveness [1] and crowd conflict [2], with two newly developed crowd features: mean motion speed and a new formulation of crowd density. Two different...
Real-world CCTV footage often poses increased challenges in object tracking due to Pan-Tilt-Zoom operations, low camera quality and diverse working environments. The most relevant challenges are moving background, motion blur and severe scale changes. Convolutional neural networks, which offer state-of-the-art performance in object detection, are increasingly utilized to pursue a more efficient tracking...
In this paper, we propose a method to estimate head pose with a convolutional neural network trained on synthetic head images. We formulate head pose estimation as a regression problem. A convolutional neural network is trained to learn head features and solve the regression problem. To provide annotated head poses in the training process, we generate a realistic head pose dataset by rendering...
In this paper, we apply One-Class Classification methods in facial image analysis problems. We consider the cases where the available training data information originates from one class, or one of the available classes is of high importance. We propose a novel extension of the One-Class Extreme Learning Machines algorithm aiming at minimizing both the training error and the data dispersion and consider...
Performing dimensionality reduction on features is essential in tackling a majority of large-scale computer vision and pattern recognition problems. The popularity of high-dimensional descriptors has rendered conventional techniques such as PCA inefficient or even infeasible. We introduce an unsupervised deep-net approach, termed recursive reduction net (RRN), to carry out dimensionality...
In this paper we consider the problem of semi-supervised learning with deep Convolutional Neural Networks (ConvNets). Semi-supervised learning is motivated by the observation that unlabeled data is cheap and can be used to improve the accuracy of classifiers. We propose an unsupervised regularization term that explicitly forces the classifier's prediction for multiple classes to be mutually exclusive...
In many human activity recognition systems the size of the unlabeled training data may be very large due to the expensive human effort required for data annotation. Moreover, an insufficient data collection process from heterogeneous sources may cause dissimilarities between training and testing data. To address these limitations, a novel probabilistic approach that combines learning using privileged...
With the advent of cost-effective depth sensors and the development of fast human-pose estimation algorithms, interest in action recognition from temporal skeleton sequences has been renewed. In this work we claim the task can be naturally seen as a Multiple Instance Learning (MIL) problem. Specifically, we model skeleton sequences as bags of time-stamped descriptors, and we present a new framework...
In this paper, a novel progressive strategy is proposed to teach the machine to accomplish face detection in the wild. Firstly, a deep model named Fully-connected Face Classifier (FCFC) is built. With the targeted training data, FCFC gradually learns to distinguish faces of varying pose, facial expression, occlusion proportion, and blur degree from the background. Secondly,...
Computationally transcribing historical document images to digital text often requires an initial, labor-intensive recording of ground truths by language experts to provide the OCR system with training text. This paper presents a framework for the automatic generation of training data, provided only with labeled character images and a digital font, thus removing the need for manually generated text...
Activity forecasting has recently become an active research area for its importance in critical applications like automated navigation and human-computer interaction. However, for a video observed up to a certain time, all of the existing forecasting works focus on predicting the activity label, i.e., predicting what the next unobserved activity is. To the best of our knowledge, no work has answered...
We address the difficult problem of distinguishing fine-grained object categories in low-resolution images. We propose a simple and effective deep learning approach that transfers fine-grained knowledge gained from high-resolution training data to the coarse low-resolution test scenario. Such fine-to-coarse knowledge transfer has many real-world applications, such as identifying objects in surveillance...
Data augmentation is the process of generating samples by transforming training data, with the target of improving the accuracy and robustness of classifiers. In this paper, we propose a new automatic and adaptive algorithm for choosing the transformations of the samples used in data augmentation. Specifically, for each sample, our main idea is to seek a small transformation that yields maximal classification...
Deep Convolutional Neural Networks (CNNs) have recently been shown to outperform previous state-of-the-art approaches for image classification. Their success must in part be attributed to the availability of large labeled training sets such as those provided by the ImageNet benchmarking initiative. When training data is scarce, however, CNNs have proven to fail to learn descriptive features. Recent research...
Plankton image classification plays an important role in ocean ecosystem research. Recently, a large-scale database for plankton classification with over 3 million images annotated with over 100 classes was released. However, the database suffers from an imbalanced class distribution in which over 90% of images belong to only 5 classes. Due to this class-imbalance problem, the existing classification...
In recent years, the increasing availability of annotated data has facilitated the great success of supervised learning in real-world applications such as semantic labeling. However, the vast majority of data is nowadays unlabeled or partially annotated. In this paper, we develop an Expected Marginal Latent Structural SVM (EM-LSSVM) framework for performing structured learning in the presence of...
Many species in the wild exhibit a visual pattern that can be used to uniquely identify an individual. This observation has recently led to visual animal biometrics becoming a rapidly growing application area of computer vision. Customized software tools for animal biometrics already employ vision-based techniques to recognize individuals in images taken in uncontrolled environments. However, most existing...
3D reconstruction is an essential step in measuring craniofacial morphological changes from a historical growth database with only 2D cephalograms. In this paper, we propose a novel regression-forest-based method to estimate the volumetric intensity images from a lateral cephalogram. The regression forest can produce a prediction of the volumetric craniofacial structure as a mixture of Gaussian...