This paper presents a novel local posture orientation-context descriptor and proposes a Fisher discriminant dictionary learning (FDDL) method based on local orientation-preserving (LOP-FDDL) for sparse coding in the action recognition task. To make full use of the information about the position of the local body part relative to the center of the torso, and the spatial-temporal shape changes of the human...
Pedestrian detection is an important topic in many applications, such as intelligent transportation systems (ITSs) or surveillance. For applications that must run around the clock, pedestrian detection based on thermal sensors has attracted significant attention. To this end, this paper proposes an LBP (local binary pattern) encoded multi-level classifier for detecting pedestrians...
The word embedding models are capable of capturing the semantic content of the textual words. The process of extracting a set of word embedding vectors from a text document is similar to the feature extraction step of the Bag-of-Features pipeline, which is usually used in computer vision tasks. That gives rise to the Bag-of-Embedded Words (BoEW) model. In this paper a novel learning technique that...
This paper investigates the robustness of two state-of-the-art action recognition algorithms: a pixel domain approach based on 3D convolutional neural networks (C3D) and a compressed domain approach requiring only partial decoding of the video, based on feature description using motion vectors and Fisher vector encoding (MV-FV). We study the robustness of the two algorithms against: (i) quality variations,...
Sparse coding is a widely used method to represent an image. However, sparse coding and its improved variants suffer from high computational complexity and long running times. To address these problems, we propose an image classification method based on hash codes and space pyramid, which encodes local feature points with hash codes instead of sparse coding. Firstly, extract the local feature points...
Often, videos are composed of multiple concepts or even genres. For instance, news videos may contain sports, action, nature, etc. Therefore, encoding the distribution of such concepts/genres in a compact and effective representation is a challenging task. In this sense, we propose the Bag of Genres representation, which is based on a visual dictionary defined by a genre classifier. Each visual word...
In this paper, we present a novel scheme for text-independent online writer identification. As a first contribution, we propose histogram-based features, inspired by the area of object detection, to describe the structural primitives of handwriting. Secondly, we have used sparse coding techniques to learn prototypes that describe the general writing characteristics of the authors. To the best of...
In this paper, we propose a new texture descriptor, completed local derivative pattern (CLDP). In contrast to completed local binary pattern (CLBP), which involves only local differences at each scale, CLDP encodes the directional variation of the local differences of two scales as a complementary component to local patterns in CLBP. The new component in CLDP, regarded as the directional derivative...
Scene recognition aims to find a semantic explanation of a scene, i.e., it helps intelligent machines to know where they are. It can be widely applied to various tasks in computer vision and robotics. Most pioneering methods extracted a set of low-level features and fed them directly into a classifier to identify the scene category. However, low-level features have been shown not to work well. Currently...
Feature encoding is a crucial step in BOW image representation. The standard BOW model assigns each image feature to the nearest visual-word without making a distinction between the features that are assigned to the same words. This hard feature assignment leads to high quantization errors and degrades the learning capacity of the classifiers in image classification. We propose a fuzzy feature encoding...
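The hard assignment step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy two-word codebook, and the Euclidean distance are assumptions chosen for illustration, and the fuzzy encoding the paper proposes is not shown because the truncated abstract does not specify it.

```python
import math
from collections import Counter

def nearest_word(feature, codebook):
    """Index of the codebook entry closest to `feature` (Euclidean distance, assumed)."""
    return min(range(len(codebook)), key=lambda k: math.dist(feature, codebook[k]))

def hard_bow_histogram(features, codebook):
    """Normalized bag-of-words histogram using hard (nearest-word) assignment."""
    if not features:
        return [0.0] * len(codebook)
    counts = Counter(nearest_word(f, codebook) for f in features)
    return [counts[k] / len(features) for k in range(len(codebook))]

# Hypothetical toy data: two visual words and three 2-D local features.
codebook = [(0.0, 0.0), (1.0, 1.0)]
features = [(0.1, 0.0), (0.49, 0.49), (0.9, 1.0)]
hist = hard_bow_histogram(features, codebook)
```

In this toy example the feature `(0.49, 0.49)` lies almost midway between the two words, yet it contributes to word 0 exactly as much as `(0.1, 0.0)` does. That loss of distance information is the quantization error the abstract refers to.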
Bag-of-words (BoW) modeling has yielded successful results in document and image classification tasks. In this paper, we explore the use of BoW for cognitive state classification. We estimate a set of common patterns embedded in the fMRI time series recorded in three dimensional voxel coordinates by clustering the BOLD responses. We use these common patterns, called the code-words, to encode activities...
Scene recognition has a wide range of applications, such as object recognition and detection, content-based image indexing and retrieval, and intelligent vehicle and robot navigation. In particular, natural scene images tend to be very complex and are difficult to analyze due to changes of illumination and transformation. In this study, we investigate a novel model to learn and recognize scenes in...
With the rise of intelligent vehicle technologies, traffic sign recognition has become an essential problem in computer vision. Focusing on traffic sign recognition in real-world scenarios, this paper aims to develop a novel local feature representation to improve traffic sign recognition performance. In particular, with the local histogram feature as a basic unit, a novel histogram intersection...
Depression and other mood disorders are common, disabling disorders with a profound impact on individuals and families. Despite its high prevalence, depression is easily missed during the early stages. Automatic depression analysis has become a very active field of research in the affective computing community in the past few years. This paper presents a framework for depression analysis based on unimodal...
The bag-of-words (BOW) has become a popular image representation model with successful implementations in visual analysis. Although the original model has been improved in several ways, the utilization of the Fuzzy Set Theory in BOW has not been investigated thoroughly. This paper presents a fuzzy feature encoding approach to address the problems associated with the hard and soft assignments of image...
The bag-of-features based models are widely used for image classification. In these models, an image is represented as a set of visual words which come from a dictionary. Therefore, a well learned dictionary is responsible for the discriminative power of representations of images. Our observations show that the representation of an image carries rich underlying information of a dictionary, so we propose...
The I frame or I slice, which uses intra prediction, is a key part of video coding. Intra prediction is important because the prediction accuracy directly affects the efficiency of the subsequent transformation, quantization and entropy coding. Given how intra modes are selected in HEVC, it is necessary to improve the prediction for the two most frequently used modes, DC and PLANAR. A significant...
This paper presents an effective feature representation method in the context of activity recognition. Efficient and effective feature representation plays a crucial role not only in activity recognition, but also in a wide range of applications such as motion analysis, tracking, 3D scene understanding etc. In the context of activity recognition, local features are increasingly popular for representing...
We propose an action classification algorithm which uses Locality-constrained Linear Coding (LLC) to capture discriminative information of human body variations in each spatio-temporal subsequence of a video sequence. Our proposed method divides the input video into equally spaced overlapping spatio-temporal subsequences, each of which is decomposed into blocks and then cells. We use the Histogram...
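The first step named in the abstract above, splitting a video into equally spaced overlapping subsequences, can be sketched as follows. The window length and stride are hypothetical parameters, since the truncated abstract does not state the values used, and the later block/cell decomposition and LLC coding are not shown.

```python
def overlapping_subsequences(num_frames, length, stride):
    """Return (start, end) frame ranges, end exclusive, for equally spaced windows.

    Windows overlap whenever stride < length. Both parameters are
    illustrative assumptions, not values taken from the paper.
    """
    if length <= 0 or stride <= 0:
        raise ValueError("length and stride must be positive")
    return [(s, s + length) for s in range(0, num_frames - length + 1, stride)]

# Hypothetical example: a 10-frame clip, 4-frame windows, stride of 2 frames.
windows = overlapping_subsequences(10, 4, 2)
```

With these values each window shares two frames with its neighbour. A video shorter than the window length yields no subsequences.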
Image and video content analysis applications typically require functionalities such as object classification, detection and tracking, and activity recognition. Objects may undergo translation, rotation, and changes in scale due to perspective projection. Further, the appearance of objects and illumination conditions may change over time. Occasionally objects might also occlude one another in the...