2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

chapter

Not Afraid of the Dark: NIR-VIS Face Recognition via Cross-Spectral Hallucination and Low-Rank Embedding

Jose Lezama, Qiang Qiu, Guillermo Sapiro

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6807 - 6816

Surveillance cameras today often capture NIR (near infrared) images in low-light environments. However, most face datasets accessible for training and verification are only collected in the VIS (visible light) spectrum. It remains a challenging problem to match NIR to VIS face images due to the different light spectrum. Recently, breakthroughs have been made for VIS face recognition by applying deep...

chapter

Learning Non-maximum Suppression

Jan Hosang, Rodrigo Benenson, Bernt Schiele

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6469 - 6477

Object detectors have hugely profited from moving towards an end-to-end learning paradigm: proposals, features, and the classifier becoming one neural network improved results two-fold on general object detection. One indispensable component is non-maximum suppression (NMS), a post-processing algorithm responsible for merging all detections that belong to the same object. The de facto standard NMS...
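
For orientation, the classic greedy NMS that is commonly used as a post-processing step can be sketched as follows. This is a minimal illustration of the conventional hand-crafted procedure, not the learned method this paper proposes; the function names, the (x1, y1, x2, y2) box format, and the 0.5 overlap threshold are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, ordered from highest to lowest score.

    A box is kept only if it does not overlap any already-kept,
    higher-scoring box by more than iou_threshold (an assumed default).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

With two heavily overlapping boxes and one distant box, the lower-scoring box of the overlapping pair is suppressed and the other two are kept.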

chapter

Deep Cross-Modal Hashing

Qing-Yuan Jiang, Wu-Jun Li

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3270 - 3278

Due to its low storage cost and fast query speed, cross-modal hashing (CMH) has been widely used for similarity search in multimedia retrieval applications. However, most existing CMH methods are based on hand-crafted features which might not be optimally compatible with the hash-code learning procedure. As a result, existing CMH methods with hand-crafted features may not achieve satisfactory performance...

chapter

Temporal Action Localization by Structured Maximal Sums

Zehuan Yuan, Jonathan C. Stroud, Tong Lu, Jia Deng

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3215 - 3223

We address the problem of temporal action localization in videos. We pose action localization as a structured prediction over arbitrary-length temporal windows, where each window is scored as the sum of frame-wise classification scores. Additionally, our model classifies the start, middle, and end of each action as separate components, allowing our system to explicitly model each action's temporal...
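
When a window is scored as the sum of frame-wise scores, the highest-scoring window can be found in a single pass with the standard maximum-subarray scan (Kadane's algorithm). The sketch below shows only that basic idea; it does not reproduce the paper's structured model with separate start, middle, and end components. The function name `best_window` is hypothetical, and the example assumes the frame scores are signed (for instance, classifier scores with a threshold subtracted), since with all-positive scores the whole video would always win.

```python
def best_window(frame_scores):
    """Return (start, end, total) for the contiguous window with the largest sum.

    Indices are 0-based and `end` is inclusive.
    """
    if not frame_scores:
        raise ValueError("frame_scores must not be empty")
    best_sum = cur_sum = frame_scores[0]
    best_start = best_end = cur_start = 0
    for t in range(1, len(frame_scores)):
        if cur_sum < 0:
            # A negative running sum can only hurt: restart the window here.
            cur_sum = frame_scores[t]
            cur_start = t
        else:
            cur_sum += frame_scores[t]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, t
    return best_start, best_end, best_sum
```

For example, on the scores `[-1.0, 2.0, 3.0, -4.0, 1.0]` the scan selects frames 1 through 2, whose scores sum to 5.0.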

chapter

Kernel Pooling for Convolutional Neural Networks

Yin Cui, Feng Zhou, Jiang Wang, Xiao Liu, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3049 - 3058

Convolutional Neural Networks (CNNs) with Bilinear Pooling, initially in their full form and later using compact representations, have yielded impressive performance gains on a wide range of visual tasks, including fine-grained visual categorization, visual question answering, face recognition, and description of texture and style. The key to their success lies in the spatially invariant modeling...

chapter

Unsupervised Video Summarization with Adversarial LSTM Networks

Behrooz Mahasseni, Michael Lam, Sinisa Todorovic

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2982 - 2991

This paper addresses the problem of unsupervised video summarization, formulated as selecting a sparse subset of video frames that optimally represent the input video. Our key idea is to learn a deep summarizer network to minimize distance between training videos and a distribution of their summarizations, in an unsupervised way. Such a summarizer can then be applied on a new video for estimating...

chapter

StyleBank: An Explicit Representation for Neural Image Style Transfer

Dongdong Chen, Lu Yuan, Jing Liao, Nenghai Yu, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2770 - 2779

We propose StyleBank, which is composed of multiple convolution filter banks and each filter bank explicitly represents one style, for neural image style transfer. To transfer an image to a specific style, the corresponding filter bank is operated on top of the intermediate feature embedding produced by a single auto-encoder. The StyleBank and the auto-encoder are jointly learnt, where the learning...

chapter

Image Super-Resolution via Deep Recursive Residual Network

Ying Tai, Jian Yang, Xiaoming Liu

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2790 - 2798

Recently, Convolutional Neural Network (CNN) based models have achieved great success in Single Image Super-Resolution (SISR). Owing to the strength of deep networks, these CNN models learn an effective nonlinear mapping from the low-resolution input image to the high-resolution target image, at the cost of requiring an enormous number of parameters. This paper proposes a very deep CNN model (up to 52 convolutional...

chapter

A Unified Approach of Multi-scale Deep and Hand-Crafted Features for Defocus Estimation

Jinsun Park, Yu-Wing Tai, Donghyeon Cho, In So Kweon

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2760 - 2769

In this paper, we introduce robust and synergetic hand-crafted features and a simple but efficient deep feature from a convolutional neural network (CNN) architecture for defocus estimation. This paper systematically analyzes the effectiveness of different features, and shows how each feature can compensate for the weaknesses of other features when they are concatenated. For a full defocus map estimation,...

chapter

Human Shape from Silhouettes Using Generative HKS Descriptors and Cross-Modal Neural Networks

Endri Dibra, Himanshu Jain, Cengiz Oztireli, Remo Ziegler, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5504 - 5514

In this work, we present a novel method for capturing human body shape from a single scaled silhouette. We combine deep correlated features capturing different 2D views, and embedding spaces based on 3D cues in a novel convolutional neural network (CNN) based architecture. We first train a CNN to find a richer body shape representation space from pose invariant 3D human shape descriptors. Then, we...

chapter

Full Resolution Image Compression with Recurrent Neural Networks

George Toderici, Damien Vincent, Nick Johnston, Sung Jin Hwang, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5435 - 5443

This paper presents a set of full-resolution lossy image compression methods based on neural networks. Each of the architectures we describe can provide variable compression rates during deployment without requiring retraining of the network: each network need only be trained once. All of our architectures consist of a recurrent neural network (RNN)-based encoder and decoder, a binarizer, and a neural...

chapter

ER3: A Unified Framework for Event Retrieval, Recognition and Recounting

Zhanning Gao, Gang Hua, Dongqing Zhang, Nebojsa Jojic, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2107 - 2116

We develop a unified framework for complex event retrieval, recognition and recounting. The framework is based on a compact video representation that exploits the temporal correlations in image features. Our feature alignment procedure identifies and removes the feature redundancies across frames and outputs an intermediate tensor representation we call video imprint. The video imprint is then fed...

chapter

Dual Attention Networks for Multimodal Reasoning and Matching

Hyeonseob Nam, Jung-Woo Ha, Jeonghee Kim

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2156 - 2164

We propose Dual Attention Networks (DANs) which jointly leverage visual and textual attention mechanisms to capture fine-grained interplay between vision and language. DANs attend to specific regions in images and words in text through multiple steps and gather essential information from both modalities. Based on this framework, we introduce two types of DANs for multimodal reasoning and matching,...

chapter

Few-Shot Object Recognition from Machine-Labeled Web Images

Zhongwen Xu, Linchao Zhu, Yi Yang

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5358 - 5366

With the tremendous advances made by Convolutional Neural Networks (ConvNets) on object recognition, we can now easily obtain adequately reliable machine-labeled annotations from predictions by off-the-shelf ConvNets. In this work, we present an abstraction memory based framework for few-shot learning, building upon machine-labeled image annotations. Our method takes large-scale machine-annotated...

chapter

More is Less: A More Complicated Network with Less Inference Complexity

Xuanyi Dong, Junshi Huang, Yi Yang, Shuicheng Yan

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1895 - 1903

In this paper, we present a novel and general network structure towards accelerating the inference process of convolutional neural networks, which is more complicated in network structure yet with less inference complexity. The core idea is to equip each original convolutional layer with another low-cost collaborative layer (LCCL), and the element-wise multiplication of the ReLU outputs of these two...

chapter

Weakly Supervised Dense Video Captioning

Zhiqiang Shen, Jianguo Li, Zhou Su, Minjun Li, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5159 - 5167

This paper focuses on a novel and challenging vision task, dense video captioning, which aims to automatically describe a video clip with multiple informative and diverse caption sentences. The proposed method is trained without explicit annotation of fine-grained sentence to video region-sequence correspondence, but is only based on weak video-level sentence annotations. It differs from existing...

chapter

Person Search with Natural Language Description

Shuang Li, Tong Xiao, Hongsheng Li, Bolei Zhou, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5187 - 5196

Searching for persons in large-scale image databases using natural language descriptions as queries has important applications in video surveillance. Existing methods mainly focus on searching for persons with image-based or attribute-based queries, which have major limitations in practical use. In this paper, we study the problem of person search with natural language description. Given the textual...

chapter

Oriented Response Networks

Yanzhao Zhou, Qixiang Ye, Qiang Qiu, Jianbin Jiao

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4961 - 4970

Deep Convolution Neural Networks (DCNNs) are capable of learning unprecedentedly effective image representations. However, their ability in handling significant local and global image rotations remains limited. In this paper, we propose Active Rotating Filters (ARFs) that actively rotate during convolution and produce feature maps with location and orientation explicitly encoded. An ARF acts as a...

chapter

End-to-End 3D Face Reconstruction with Deep Neural Networks

Pengfei Dou, Shishir K. Shah, Ioannis A. Kakadiaris

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1503 - 1512

Monocular 3D facial shape reconstruction from a single 2D facial image has been an active research area due to its wide applications. Inspired by the success of deep neural networks (DNN), we propose a DNN-based approach for End-to-End 3D FAce Reconstruction (UH-E2FAR) from a single 2D image. Different from recent works that reconstruct and refine the 3D face in an iterative manner using both an RGB...

chapter

All You Need is Beyond a Good Init: Exploring Better Solution for Training Extremely Deep Convolutional Neural Networks with Orthonormality and Modulation

Di Xie, Jiang Xiong, Shiliang Pu

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5075 - 5084

Deep neural networks are difficult to train, and this predicament worsens as depth increases. The essence of the problem lies in the magnitude of backpropagated errors, which leads to vanishing or exploding gradients. We show that a variant of regularizer which utilizes orthonormality among different filter banks can alleviate this problem. Moreover, we design a backward error...
