Search results

chapter

Video2vec: Learning semantic spatio-temporal embeddings for video representation

Sheng-Hung Hu, Yikang Li, Baoxin Li

2016 23rd International Conference on Pattern Recognition (ICPR) > 811 - 816

We propose to learn semantic spatio-temporal embeddings for videos to support high-level video analysis. The first step of the proposed embedding employs a deep architecture consisting of two channels of convolutional neural networks (capturing appearance and local motion) followed by their corresponding Gated Recurrent Unit encoders for capturing longer-term temporal structure of the CNN features...

chapter

Effective surface normals based action recognition in depth images

Xuan Son Nguyen, Thanh Phuong Nguyen, Francois Charpillet

2016 23rd International Conference on Pattern Recognition (ICPR) > 817 - 822

In this paper, we propose a new local descriptor for action recognition in depth images. The proposed descriptor relies on surface normals in 4D space of depth, time, spatial coordinates and higher-order partial derivatives of depth values along spatial coordinates. In order to classify actions, we follow the traditional Bag-of-words (BoW) approach, and propose two encoding methods termed Multi-Scale...

chapter

Face anti-spoofing with multifeature videolet aggregation

Talha Ahmad Siddiqui, Samarth Bharadwaj, Tejas I. Dhamecha, Akshay Agarwal, more

2016 23rd International Conference on Pattern Recognition (ICPR) > 1035 - 1040

Biometric systems can be attacked in several ways, the most common being spoofing the input sensor. Therefore, anti-spoofing is one of the most essential prerequisites against attacks on biometric systems. Face recognition is even more vulnerable, as the image capture is non-contact based. Several anti-spoofing methods have been proposed in the literature for both contact and non-contact based...

chapter

A deep multi-level network for saliency prediction

Marcella Cornia, Lorenzo Baraldi, Giuseppe Serra, Rita Cucchiara

2016 23rd International Conference on Pattern Recognition (ICPR) > 3488 - 3493

This paper presents a novel deep architecture for saliency prediction. Current state-of-the-art models for saliency prediction employ fully convolutional networks that perform a non-linear combination of features extracted from the last convolutional layer to predict saliency maps. We propose an architecture which, instead, combines features extracted at different levels of a Convolutional Neural...

chapter

Invariant hierarchical sparse coding for object recognition via bags of atoms

Xiaoxia Sun, Nasser M. Nasrabadi, Trac D. Tran

2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP) > 212 - 216

In this paper, we introduce a novel local feature-based hierarchical framework to produce invariant sparse codes for object recognition. In order to enforce the invariant property for each sample patch (local feature descriptor) in the image, its sparse code is recovered with a dedicated dictionary whose atoms are adaptively chosen from several bags of candidate atoms. The single-layer invariant sparse...

chapter

Boosting VLAD with double assignment using deep features for action recognition in videos

Ionut C. Duta, Tuan A. Nguyen, Kiyoharu Aizawa, Bogdan Ionescu, more

2016 23rd International Conference on Pattern Recognition (ICPR) > 2210 - 2215

The encoding method is an important factor for an action recognition pipeline. One of the key points for the encoding method is the assignment step. A very widely used super-vector encoding method is the vector of locally aggregated descriptors (VLAD), with very competitive results in many tasks. However, it considers only hard assignment and the criteria for the assignment is performed only from...

chapter

One-shot learning of temporal sequences using a distance dependent Chinese Restaurant Process

Carlos Orrite, Mario Rodriguez, Carlos Medrano

2016 23rd International Conference on Pattern Recognition (ICPR) > 2694 - 2699

Activity recognition in videos is a challenging task, especially when only a small number of samples is available for modelling the problem. The task becomes even harder when using generative models such as mixture models or Hidden Markov Models (HMMs), as they demand a lot of samples to determine their parameters. Additionally, these models rely on the appropriate selection of some parameters, for instance...

chapter

Efficient video face recognition by using Fisher Vector encoding of binary features

Yoanna Martinez-Diaz, Leonardo Chang, Noslen Hernandez, Heydi Mendez-Vazquez, more

2016 23rd International Conference on Pattern Recognition (ICPR) > 1436 - 1441

One of the main problems of recognizing faces in videos is to achieve accurate algorithms which can be used in real-time applications. Recently, Fisher Vector representation of local descriptors (e.g., SIFT) has gained widespread popularity, achieving good recognition rates. In this work, we propose to use Fisher Vector encoding of binary features for video face recognition, in order to speed up the...

chapter

Multilingual articulatory features augmentation learning

Yue Zhao, Rui Zhao, Xiaoyang Wang, Qiang Ji

2016 23rd International Conference on Pattern Recognition (ICPR) > 2895 - 2899

Articulatory features are used as a universal set of speech attributes shared across many different languages. Some multilingual and cross-language speech recognition systems using articulatory features have been shown to improve performance. The existing articulatory features are defined by phoneticians as a set of articulatory descriptions of phones, which represent some semantic information...

chapter

Automatic video description generation via LSTM with joint two-stream encoding

Chenyang Zhang, Yingli Tian

2016 23rd International Conference on Pattern Recognition (ICPR) > 2924 - 2929

In this paper, we propose a novel two-stream framework based on combinational deep neural networks. The framework is mainly composed of two components: one is a parallel two-stream encoding component which learns video encoding from multiple sources using 3D convolutional neural networks, and the other is a long short-term memory (LSTM)-based decoding language model which transfers the input encoded...

chapter

Two-dimensional PCA hashing and its extension

Minqi Mao, Zhonglong Zheng, Zhongyu Chen, Huawen Liu, more

2016 23rd International Conference on Pattern Recognition (ICPR) > 1624 - 1629

Recently, hashing algorithms have attracted considerable attention in the field of machine learning. Most existing hash methods directly take a vector, formed from the columns of the image matrix, as a unit and adopt some feature extraction functions to project the original data into generally shorter fixed-length values or characters. Then each of these projected real values is quantized or hashed into zero-one...

chapter

Local multiple directional pattern of palmprint image

Lunke Fei, Jie Wen, Zheng Zhang, Ke Yan, more

2016 23rd International Conference on Pattern Recognition (ICPR) > 3013 - 3018

Lines are the most essential and discriminative features of palmprint images, which motivates researchers to propose various line-direction-based methods for palmprint recognition. Conventional methods usually capture only the single most dominant direction in palmprint images. However, a number of points in palmprint images have two or more dominant directions because of a plenty...

chapter

Bag of Embedded Words learning for text retrieval

Nikolaos Passalis, Anastasios Tefas

2016 23rd International Conference on Pattern Recognition (ICPR) > 2416 - 2421

Word embedding models are capable of capturing the semantic content of words. The process of extracting a set of word embedding vectors from a text document is similar to the feature extraction step of the Bag-of-Features pipeline, which is usually used in computer vision tasks. That gives rise to the Bag-of-Embedded Words (BoEW) model. In this paper a novel learning technique that...

chapter

Multi-channel micro-structure difference descriptor for image retrieval

Xuekuan Wang, Cairong Zhao, Duoqian Miao, Cuijun Liu, more

2016 23rd International Conference on Pattern Recognition (ICPR) > 2930 - 2935

This paper presents a novel image feature representation method, called the multi-channel micro-structure difference descriptor (MCMSDD), for image retrieval. With the local feature extraction from a micro-structure and MAX operator, MCMSDD integrates the advantages of multi-channel local binary encoding and color difference histogram, which are the fusion of color, texture and spatial distribution information...

chapter

VLAD encoded Deep Convolutional features for unconstrained face verification

Jingxiao Zheng, Jun-Cheng Chen, Navaneeth Bodla, Vishal M. Patel, more

2016 23rd International Conference on Pattern Recognition (ICPR) > 4101 - 4106

We present a method for combining the Vector of Locally Aggregated Descriptors (VLAD) feature encoding with Deep Convolutional Neural Network (DCNN) features for unconstrained face verification. One of the key features of our method, called the VLAD-encoded DCNN (VLAD-DCNN) features, is that spatial and appearance information are simultaneously processed to learn an improved discriminative representation...

chapter

Analysis of teamwork dialogue: A data mining approach

Antonette Shibani, Elizabeth Koh, Vivian Lai, Kyong Jin Shim

2016 IEEE International Conference on Big Data (Big Data) > 4032 - 4034

With the advent of the Internet and the widespread popularity of online technology-enhanced learning platforms, many pedagogical activities today involve learners in online discussions such as synchronous chat. In this study, we describe a text mining method used for analyzing teamwork from such chat dialogue of students. The steps in the text mining method such as pre-processing and classification are...

chapter

A novel feature extraction scheme for N400 detection

Reshma Kar, Amit Konar, Aruna Chakraborty, Sanchita Ghosh

2016 IEEE Annual India Conference (INDICON) > 1 - 5

This work proposes a simple feature extraction technique designed for robust detection of event-related potentials (ERPs). The technique was tested on detecting the N400, an ERP generally associated with recall. The chief advantages of the proposed technique are that it is robust to different ocular artifacts and yet sensitive to event-related potentials. Further, each signal...

chapter

Structure Selection for Convolutive Non-negative Matrix Factorization Using Normalized Maximum Likelihood Coding

Atsushi Suzuki, Kohei Miyaguchi, Kenji Yamanishi

2016 IEEE 16th International Conference on Data Mining (ICDM) > 1221 - 1226

Convolutive non-negative matrix factorization (CNMF) is a promising method for extracting features from sequential multivariate data. Conventional algorithms for CNMF require that the structure, or the number of bases for expressing the data, be specified in advance. We are concerned with the issue of how we can select the best structure of CNMF from given data. We first introduce a framework of probabilistic...

chapter

Author Identification Using Deep Learning

Ahmed M. Mohsen, Nagwa M. El-Makky, Nagia Ghanem

2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA) > 898 - 903

Authorship identification is the task of identifying the author of a given text from a set of suspects. The main concern of this task is to define an appropriate characterization of texts that captures the writing style of authors. Although deep learning was recently used in different natural language processing tasks, it has not been used in author identification (to the best of our knowledge). In...

chapter

Improving Speed Independent Performance of Fault Diagnosis Systems through Feature Mapping and Normalization

Aparna S. Raghunath, K. T. Sreekumar, C. Santhosh Kumar, K. I. Ramachandran

2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA) > 764 - 767

High-accuracy fault diagnosis systems are extremely important for effective condition-based maintenance (CBM) of rotating machines. In this work, we develop a fault diagnosis system using time and frequency domain statistical features as input to a backend support vector machine (SVM) classifier. We evaluate the performance of the baseline system for speed-dependent and speed-independent performance...

INFONA - science communication portal
