The handwritten signature is perhaps the most customary way of acknowledging an individual's consent or authenticating a person's identity in numerous transactions. In addition, the authenticity of a questioned offline or static handwritten signature remains a matter of interest, especially in forensic applications. A common approach in offline signature verification...
Generative models of 3D human motion are often restricted to a small number of activities and therefore cannot generalize well to novel movements or applications. In this work we propose a deep learning framework for human motion capture data that learns a generic representation from a large corpus of motion capture data and generalizes well to new, unseen motions. Using an encoding-decoding network...
Modeling of high-order interactional context, e.g., group interaction, lies at the center of collective/group activity recognition. However, most previous activity recognition methods do not offer a flexible and scalable scheme to handle the high-order context modeling problem. To explicitly address this fundamental bottleneck, we propose a recurrent interactional context modeling scheme based...
The ability to proactively monitor business processes is one of the main differentiators for firms to remain competitive. Process execution logs generated by Process Aware Information Systems (PAIS) help to make various business process specific predictions. This enables a proactive situational awareness related to the execution of business processes. The goal of the approach proposed in the current...
In this paper, we propose a new video representation incorporating image based deep features and an efficient pooling strategy for the purpose of action recognition. The Convolutional Neural Network (CNN) based features have very recently emerged as the new state of the art for image classification. Several attempts have been made to extend such CNN models for videos by explicitly focusing on the...
In this paper, we propose a new local descriptor for action recognition in depth images. The proposed descriptor relies on surface normals in 4D space of depth, time, spatial coordinates and higher-order partial derivatives of depth values along spatial coordinates. In order to classify actions, we follow the traditional Bag-of-words (BoW) approach, and propose two encoding methods termed Multi-Scale...
Activity recognition in videos is a challenging task, especially when only a few samples are available for modelling the problem. The task becomes even harder when using generative models such as mixture models or Hidden Markov Models (HMMs), as they demand many samples to determine their parameters. Additionally, these models rely on the appropriate selection of some parameters, for instance...
Several conventional methods have been implemented in pattern recognition, but few of them are biologically plausible. This paper mimics the hierarchical visual system and uses the precise-spike-driven (PSD) synaptic plasticity rule for learning. The well-known HMAX model imitates the visual cortex and uses Gabor filters and max pooling to extract features. Compared with the traditional HMAX model, our...
Action recognition has been one of the challenging problems in the computer vision community. Most of the recent research work in this area exploits the motion features captured by dense trajectory descriptors. On the other hand, static image classification has seen the rise of deep learning architectures, with evidence that the output of intermediate layers could be successfully employed as a low...
We present a method to combine the Fisher vector representation and the Deep Convolutional Neural Network (DCNN) features to generate a representation, called the Fisher vector encoded DCNN (FV-DCNN) features, for unconstrained face verification. One of the key features of our method is that spatial and appearance information are simultaneously processed when learning the Gaussian mixture model to...
Bag-of-words (BoW) modeling has yielded successful results in document and image classification tasks. In this paper, we explore the use of BoW for cognitive state classification. We estimate a set of common patterns embedded in the fMRI time series recorded in three dimensional voxel coordinates by clustering the BOLD responses. We use these common patterns, called the code-words, to encode activities...
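As a rough illustration of the bag-of-words encoding step mentioned in the abstract above, the following minimal sketch assigns each feature vector to its nearest code-word and builds a normalized histogram. It assumes a codebook has already been learned (for example by clustering); the function names `nearest_codeword` and `bow_histogram` and the hard-assignment rule are illustrative assumptions, since the truncated abstract does not state the authors' exact procedure.

```python
from typing import List, Sequence


def nearest_codeword(x: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the code-word closest to x (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for k, c in enumerate(codebook):
        d = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        if d < best_dist:
            best_index, best_dist = k, d
    return best_index


def bow_histogram(features: Sequence[Sequence[float]],
                  codebook: Sequence[Sequence[float]]) -> List[float]:
    """Hard-assign each feature to a code-word and return normalized counts."""
    counts = [0] * len(codebook)
    for x in features:
        counts[nearest_codeword(x, codebook)] += 1
    total = sum(counts)
    # An empty feature set yields an all-zero histogram instead of dividing by zero.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical toy data: two code-words and four 2-D feature vectors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
features = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0], [11.0, 10.0]]
histogram = bow_histogram(features, codebook)
```

The resulting histogram has one entry per code-word and can be passed to any standard classifier as a fixed-length representation.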
In this paper, a new kind of Fisher Vector (FV) model, named Scale FV (ScaleFV), is proposed to improve visual feature encoding for human action recognition. Although several feature encoding methods have been proposed, temporal scale information is almost entirely ignored. Similar to the spatial scale information which has been shown to be important in extracting and encoding visual features, the...
Sparse coding has been used for target appearance modeling and applied successfully in visual tracking. However, noise may be inevitably introduced into the representation due to background clutter. To cope with this problem, we propose a saliency weighted sparse coding appearance model for visual tracking. Firstly, a spectral filtering based visual attention computational model, which combines both...
This paper presents an SAR image classification approach that takes advantage of both amplitude and texture features. The proposed approach is based on superpixels obtained with over-segmentation methods, and consists of two stages. In the first stage, the SAR image is classified with amplitude and texture features used separately. Specifically, we use statistical model based maximum-likelihood...
We present a preliminary investigation of the benefits and limitations of classifiers based on sparse representations. We specifically focus on the union of subspaces data model and examine binary classifiers built on a sparse nonlinear mapping (in a redundant dictionary) followed by a linear classifier. We study two common sparse nonlinear mappings (namely l0 and l1) and show that, in both cases,...
Detecting abnormal events in crowded scenes remains challenging due to the diversity of events defined by various applications. Among the many application situations, motion analysis for event representation is suited for crowded scenes. In this paper, we propose a novel abnormal event detection method via likelihood estimation of dynamic-texture motion representation, called Structural Multi-scale...
Image representations using code words from a visual dictionary are widely applied in object detection and categorization. Traditionally, there are two types of methods to construct a dictionary: k-means and optimization-based methods. The former cannot achieve good discriminability because it extracts too many background features. The latter needs to cooperate with coding methods and brings about...
This paper describes a novel approach for extracting multilingual transliteration pairs from an aligned parallel corpus. The proposed approach utilizes an encoding technique based on “Place and Manner of Articulation”. The Jaccard coefficient is used to measure the distance between encoded source and target transliteration pairs. The proposed methodology has been employed for extraction of English-Bangla...
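The Jaccard coefficient named in the abstract above can be sketched as follows. This is a minimal illustration that assumes both words have already been mapped to code strings and compares their sets of character n-grams; the n-gram choice, the function names, and the example codes are hypothetical, because the truncated abstract does not specify the encoding or the set construction.

```python
def jaccard_similarity(a: str, b: str, n: int = 2) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B| over character n-gram sets."""
    grams_a = {a[i:i + n] for i in range(len(a) - n + 1)}
    grams_b = {b[i:i + n] for i in range(len(b) - n + 1)}
    if not grams_a and not grams_b:
        # Two strings too short to yield any n-gram are treated as identical.
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def jaccard_distance(a: str, b: str, n: int = 2) -> float:
    """Distance form of the coefficient: 0.0 for identical n-gram sets."""
    return 1.0 - jaccard_similarity(a, b, n)


# Hypothetical encoded strings standing in for a source/target pair.
similarity = jaccard_similarity("abcd", "abce")
distance = jaccard_distance("abcd", "abce")
```

A candidate pair would then be accepted as a transliteration when its distance falls below a chosen threshold; the threshold value is an application-specific assumption.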
Semantic understanding of images remains an important research challenge for the image and video retrieval community. This paper proposes a novel natural scene retrieval method based on non-negative sparse coding. It first combines non-negative sparse coding with spatial pyramid matching for feature extraction and representation. Then, based on sparse coding, it ranks the Euclidean distances...
Saliency is an important factor in feature coding, and saliency coding (SaC) has recently been proposed on this basis for image classification. SaC is both effective and efficient with a moderate-scale codebook. However, empirical studies show that SaC loses its advantage as the codebook size increases. To address this problem, we propose a group coding strategy, wherein the latent structure...