2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

chapter

Multi-way Multi-level Kernel Modeling for Neuroimaging Classification

Lifang He, Chun-Ta Lu, Hao Ding, Shen Wang, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6846 - 6854

Owing to its prominence as a diagnostic tool for probing the neural correlates of cognition, neuroimaging tensor data has been the focus of intense investigation. Although many supervised tensor learning approaches have been proposed, they either cannot capture the nonlinear relationships of tensor data or cannot preserve the complex multi-way structural information. In this paper, we propose a Multi-way...

chapter

Not Afraid of the Dark: NIR-VIS Face Recognition via Cross-Spectral Hallucination and Low-Rank Embedding

Jose Lezama, Qiang Qiu, Guillermo Sapiro

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6807 - 6816

Surveillance cameras today often capture NIR (near infrared) images in low-light environments. However, most face datasets accessible for training and verification are only collected in the VIS (visible light) spectrum. It remains a challenging problem to match NIR to VIS face images due to the different light spectrum. Recently, breakthroughs have been made for VIS face recognition by applying deep...

chapter

Multi-attention Network for One Shot Learning

Peng Wang, Lingqiao Liu, Chunhua Shen, Zi Huang, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6212 - 6220

One-shot learning is a challenging problem where the aim is to recognize a class identified by a single training image. Given the practical importance of one-shot learning, it seems surprising that the rich information present in the class tag itself has largely been ignored. Most existing approaches restrict the use of the class tag to finding similar classes and transferring classifiers or metrics...

chapter

Temporal Action Localization by Structured Maximal Sums

Zehuan Yuan, Jonathan C. Stroud, Tong Lu, Jia Deng

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3215 - 3223

We address the problem of temporal action localization in videos. We pose action localization as a structured prediction over arbitrary-length temporal windows, where each window is scored as the sum of frame-wise classification scores. Additionally, our model classifies the start, middle, and end of each action as separate components, allowing our system to explicitly model each action's temporal...
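As a rough illustration of scoring a window as the sum of frame-wise scores, the sketch below finds the contiguous window with the largest sum using the classic maximum-subarray (Kadane) scan. It is an assumption-laden toy, not the paper's model: the function name `best_window` and the example scores are hypothetical, and the separate start, middle, and end components mentioned in the abstract are not modeled.

```python
from typing import List, Tuple


def best_window(scores: List[float]) -> Tuple[float, int, int]:
    """Return (sum, start, end) for the contiguous window with the largest sum.

    `scores` holds one classification score per frame (hypothetical input);
    `start` and `end` are inclusive frame indices.
    """
    if not scores:
        raise ValueError("scores must be non-empty")

    best_sum = cur_sum = scores[0]
    best_start = best_end = cur_start = 0

    for i in range(1, len(scores)):
        if cur_sum < 0:
            # A negative running sum can only hurt; restart the window here.
            cur_sum = scores[i]
            cur_start = i
        else:
            cur_sum += scores[i]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i

    return best_sum, best_start, best_end


# Hypothetical frame-wise scores: positive where the action seems present.
frame_scores = [-1.0, 2.0, 3.0, -4.0, 1.0]
print(best_window(frame_scores))
```

The scan runs in time linear in the number of frames, which is why a sum-of-frames score makes searching over arbitrary-length windows tractable.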

chapter

Supervising Neural Attention Models for Video Captioning by Human Gaze Data

Youngjae Yu, Jongwook Choi, Yeonhwa Kim, Kyung Yoo, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6119 - 6127

The attention mechanisms in deep neural networks are inspired by human attention, which sequentially focuses on the most relevant parts of the information over time to generate prediction output. The attention parameters in those models are implicitly trained in an end-to-end manner, yet there have been few attempts to explicitly incorporate human gaze tracking to supervise the attention models. In this...

chapter

Visual Translation Embedding Network for Visual Relation Detection

Hanwang Zhang, Zawlin Kyaw, Shih-Fu Chang, Tat-Seng Chua

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3107 - 3115

Visual relations, such as "person ride bike" and "bike next to car", offer a comprehensive scene understanding of an image, and have already shown their great utility in connecting computer vision and natural language. However, due to the challenging combinatorial complexity of modeling subject-predicate-object relation triplets, very little work has been done to localize and predict visual relations...

chapter

A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection

Xiaolong Wang, Abhinav Shrivastava, Abhinav Gupta

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3039 - 3048

How do we learn an object detector that is invariant to occlusions and deformations? Our current solution is to use a data-driven strategy – collect large-scale datasets which have object instances under different conditions. The hope is that the final classifier can use these examples to learn invariances. But is it really possible to see all the occlusions in a dataset? We argue that...

chapter

Unsupervised Video Summarization with Adversarial LSTM Networks

Behrooz Mahasseni, Michael Lam, Sinisa Todorovic

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2982 - 2991

This paper addresses the problem of unsupervised video summarization, formulated as selecting a sparse subset of video frames that optimally represent the input video. Our key idea is to learn a deep summarizer network to minimize the distance between training videos and a distribution of their summarizations, in an unsupervised way. Such a summarizer can then be applied to a new video for estimating...

chapter

Deep TEN: Texture Encoding Network

Hang Zhang, Jia Xue, Kristin Dana

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2896 - 2905

We propose a Deep Texture Encoding Network (Deep-TEN) with a novel Encoding Layer integrated on top of convolutional layers, which ports the entire dictionary learning and encoding pipeline into a single model. Current methods build from distinct components, using standard encoders with separate off-the-shelf features such as SIFT descriptors or pre-trained CNN features for material recognition. Our...

chapter

Attend in Groups: A Weakly-Supervised Deep Learning Framework for Learning from Web Data

Bohan Zhuang, Lingqiao Liu, Yao Li, Chunhua Shen, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2915 - 2924

Large-scale datasets have driven the rapid development of deep neural networks for visual recognition. However, annotating a massive dataset is expensive and time-consuming. Web images and their labels are, in comparison, much easier to obtain, but direct training on such automatically harvested images can lead to unsatisfactory performance, because the noisy labels of Web images adversely affect the...

chapter

Hierarchical Multimodal Metric Learning for Multimodal Classification

Heng Zhang, Vishal M. Patel, Rama Chellappa

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2925 - 2933

Multimodal classification arises in many computer vision tasks such as object classification and image retrieval. The idea is to utilize multiple sources (modalities) measuring the same instance to improve the overall performance compared to using a single source (modality). The varying characteristics exhibited by multiple modalities make it necessary to simultaneously learn the corresponding metrics...

chapter

A Unified Approach of Multi-scale Deep and Hand-Crafted Features for Defocus Estimation

Jinsun Park, Yu-Wing Tai, Donghyeon Cho, In So Kweon

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2760 - 2769

In this paper, we introduce robust and synergetic hand-crafted features and a simple but efficient deep feature from a convolutional neural network (CNN) architecture for defocus estimation. This paper systematically analyzes the effectiveness of different features, and shows how each feature can compensate for the weaknesses of other features when they are concatenated. For a full defocus map estimation,...

chapter

ER3: A Unified Framework for Event Retrieval, Recognition and Recounting

Zhanning Gao, Gang Hua, Dongqing Zhang, Nebojsa Jojic, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2107 - 2116

We develop a unified framework for complex event retrieval, recognition and recounting. The framework is based on a compact video representation that exploits the temporal correlations in image features. Our feature alignment procedure identifies and removes the feature redundancies across frames and outputs an intermediate tensor representation we call video imprint. The video imprint is then fed...

chapter

Perceptual Generative Adversarial Networks for Small Object Detection

Jianan Li, Xiaodan Liang, Yunchao Wei, Tingfa Xu, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1951 - 1959

Detecting small objects is notoriously challenging due to their low resolution and noisy representation. Existing object detection pipelines usually detect small objects through learning representations of all the objects at multiple scales. However, the performance gain of such ad hoc architectures is usually too limited to justify the computational cost. In this work, we address the small object detection...

chapter

Neural Aggregation Network for Video Face Recognition

Jiaolong Yang, Peiran Ren, Dongqing Zhang, Dong Chen, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5216 - 5225

This paper presents a Neural Aggregation Network (NAN) for video face recognition. The network takes a face video or face image set of a person with a variable number of face images as its input, and produces a compact, fixed-dimension feature representation for recognition. The whole network is composed of two modules. The feature embedding module is a deep Convolutional Neural Network (CNN) which...

chapter

CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos

Zheng Shou, Jonathan Chan, Alireza Zareian, Kazuyuki Miyazawa, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1417 - 1426

Temporal action localization is an important yet challenging problem. Given a long, untrimmed video consisting of multiple action instances and complex background contents, we need not only to recognize their action categories, but also to localize the start time and end time of each instance. Many state-of-the-art systems use segment-level classifiers to select and rank proposal segments of pre-determined...

chapter

Deep Level Sets for Salient Object Detection

Ping Hu, Bing Shuai, Jun Liu, Gang Wang

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 540 - 549

Deep learning has been applied to saliency detection in recent years. The superior performance has proved that deep networks can model the semantic properties of salient objects. Yet it is difficult for a deep network to discriminate pixels belonging to similar receptive fields around the object boundaries, so deep networks may output maps with blurred saliency and inaccurate boundaries. To tackle...

chapter

Learning and Refining of Privileged Information-Based RNNs for Action Recognition from Depth Sequences

Zhiyuan Shi, Tae-Kyun Kim

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4684 - 4693

Existing RNN-based approaches for action recognition from depth sequences require either skeleton joints or hand-crafted depth features as inputs. An end-to-end model that maps raw depth maps to action classes is non-trivial to design because 1) a single-channel map lacks texture, which weakens its discriminative power, and 2) the set of depth training data is relatively small. To address these...

chapter

Quality Aware Network for Set to Set Recognition

Yu Liu, Junjie Yan, Wanli Ouyang

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4694 - 4703

This paper targets the problem of set to set recognition, which learns the metric between two image sets. Images in each set belong to the same identity. Since images in a set can be complementary, they hopefully lead to higher accuracy in practical applications. However, the quality of each sample cannot be guaranteed, and samples with poor quality will hurt the metric. In this paper, the quality...

chapter

Diversified Texture Synthesis with Feed-Forward Networks

Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 266 - 274

Recent progress in deep discriminative and generative modeling has shown promising results on texture synthesis. However, existing feed-forward based methods trade off generality for efficiency and suffer from many issues, such as shortage of generality (i.e., build one network per texture), lack of diversity (i.e., always produce visually identical output) and suboptimality (i.e., generate...
