2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

book

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

IEEE

chapter

Learning to Learn from Noisy Web Videos

Serena Yeung, Vignesh Ramanathan, Olga Russakovsky, Liyue Shen, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 7455 - 7463

Understanding the simultaneously very diverse and intricately fine-grained set of possible human actions is a critical open problem in computer vision. Manually labeling training videos is feasible for some action classes but doesn't scale to the full long-tailed distribution of actions. A promising way to address this is to leverage noisy data from web queries to learn new actions, using semi-supervised...

chapter

Spatiotemporal Multiplier Networks for Video Action Recognition

Christoph Feichtenhofer, Axel Pinz, Richard P. Wildes

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 7445 - 7454

This paper presents a general ConvNet architecture for video action recognition based on multiplicative interactions of spacetime features. Our model combines the appearance and motion pathways of a two-stream architecture by motion gating and is trained end-to-end. We theoretically motivate multiplicative gating functions for residual networks and empirically study their effect on classification...

chapter

[Publisher's information]

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 7506

Provides a listing of current committee members and society officers.

chapter

Kernel Square-Loss Exemplar Machines for Image Retrieval

Rafael S. Rezende, Joaquin Zepeda, Jean Ponce, Francis Bach, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 7263 - 7271

Zepeda and Pérez [41] have recently demonstrated the promise of the exemplar SVM (ESVM) as a feature encoder for image retrieval. This paper extends this approach in several directions: We first show that replacing the hinge loss by the square loss in the ESVM cost function significantly reduces encoding time with negligible effect on accuracy. We call this model square-loss exemplar machine,...

chapter

Multi-way Multi-level Kernel Modeling for Neuroimaging Classification

Lifang He, Chun-Ta Lu, Hao Ding, Shen Wang, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6846 - 6854

Owing to prominence as a diagnostic tool for probing the neural correlates of cognition, neuroimaging tensor data has been the focus of intense investigation. Although many supervised tensor learning approaches have been proposed, they either cannot capture the nonlinear relationships of tensor data or cannot preserve the complex multi-way structural information. In this paper, we propose a Multi-way...

chapter

Not Afraid of the Dark: NIR-VIS Face Recognition via Cross-Spectral Hallucination and Low-Rank Embedding

Jose Lezama, Qiang Qiu, Guillermo Sapiro

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6807 - 6816

Surveillance cameras today often capture NIR (near infrared) images in low-light environments. However, most face datasets accessible for training and verification are only collected in the VIS (visible light) spectrum. It remains a challenging problem to match NIR to VIS face images due to the different light spectrum. Recently, breakthroughs have been made for VIS face recognition by applying deep...

chapter

Exploiting Symmetry and/or Manhattan Properties for 3D Object Structure Estimation from Single and Multiple Images

Yuan Gao, Alan L. Yuille

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6718 - 6727

Many man-made objects have intrinsic symmetries and Manhattan structure. By assuming an orthographic projection model, this paper addresses the estimation of 3D structures and camera projection using symmetry and/or Manhattan structure cues, which occur when the input is single- or multiple-image from the same category, e.g., multiple different cars. Specifically, analysis on the single image case...

chapter

Analyzing Computer Vision Data — The Good, the Bad and the Ugly

Oliver Zendel, Katrin Honauer, Markus Murschitz, Martin Humenberger, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6670 - 6680

In recent years, a great number of datasets were published to train and evaluate computer vision (CV) algorithms. These valuable contributions helped to push CV solutions to a level where they can be used for safety-relevant applications, such as autonomous driving. However, major questions concerning quality and usefulness of test data for CV evaluation are still unanswered. Researchers and engineers...

chapter

3D Shape Segmentation with Projective Convolutional Networks

Evangelos Kalogerakis, Melinos Averkiou, Subhransu Maji, Siddhartha Chaudhuri

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6630 - 6639

This paper introduces a deep architecture for segmenting 3D objects into their labeled semantic parts. Our architecture combines image-based Fully Convolutional Networks (FCNs) and surface-based Conditional Random Fields (CRFs) to yield coherent segmentations of 3D shapes. The image-based FCNs are used for efficient view-based reasoning about 3D object parts. Through a special projection layer, FCN...

chapter

Learning Non-maximum Suppression

Jan Hosang, Rodrigo Benenson, Bernt Schiele

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6469 - 6477

Object detectors have hugely profited from moving towards an end-to-end learning paradigm: proposals, features, and the classifier becoming one neural network improved results two-fold on general object detection. One indispensable component is non-maximum suppression (NMS), a post-processing algorithm responsible for merging all detections that belong to the same object. The de facto standard NMS...

chapter

Multi-attention Network for One Shot Learning

Peng Wang, Lingqiao Liu, Chunhua Shen, Zi Huang, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6212 - 6220

One-shot learning is a challenging problem where the aim is to recognize a class identified by a single training image. Given the practical importance of one-shot learning, it seems surprising that the rich information present in the class tag itself has largely been ignored. Most existing approaches restrict the use of the class tag to finding similar classes and transferring classifiers or metrics...

chapter

Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes

Tobias Pohlen, Alexander Hermans, Markus Mathias, Bastian Leibe

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3309 - 3318

Semantic image segmentation is an essential component of modern autonomous driving systems, as an accurate understanding of the surrounding scene is crucial to navigation and action planning. Current state-of-the-art approaches in semantic image segmentation rely on pre-trained networks that were initially developed for classifying images as a whole. While these networks exhibit outstanding recognition...

chapter

Deep Cross-Modal Hashing

Qing-Yuan Jiang, Wu-Jun Li

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3270 - 3278

Due to its low storage cost and fast query speed, cross-modal hashing (CMH) has been widely used for similarity search in multimedia retrieval applications. However, most existing CMH methods are based on hand-crafted features which might not be optimally compatible with the hash-code learning procedure. As a result, existing CMH methods with hand-crafted features may not achieve satisfactory performance...

chapter

Bayesian Supervised Hashing

Zihao Hu, Junxuan Chen, Hongtao Lu, Tongzhen Zhang

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3288 - 3295

Among learning-based hashing methods, supervised hashing seeks compact binary representation of the training data to preserve semantic similarities. Recent years have witnessed various problem formulations and optimization methods for supervised hashing. Most of them optimize a form of loss function with a regularization term, which can be viewed as a maximum a posteriori (MAP) estimation of the hashing...

chapter

End-to-End Concept Word Detection for Video Captioning, Retrieval, and Question Answering

Youngjae Yu, Hyungjin Ko, Jongwook Choi, Gunhee Kim

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3261 - 3269

We propose a high-level concept word detector that can be integrated with any video-to-language models. It takes a video as input and generates a list of concept words as useful semantic priors for language generation models. The proposed word detector has two important properties. First, it does not require any external knowledge sources for training. Second, the proposed word detector is trainable...

chapter

Hierarchical Boundary-Aware Neural Encoder for Video Captioning

Lorenzo Baraldi, Costantino Grana, Rita Cucchiara

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3185 - 3194

The use of Recurrent Neural Networks for video captioning has recently gained a lot of attention, since they can be used both to encode the input video and to generate the corresponding description. In this paper, we present a recurrent video encoding scheme which can discover and leverage the hierarchical structure of the video. Unlike the classical encoder-decoder approach, in which a video is encoded...

chapter

Temporal Action Localization by Structured Maximal Sums

Zehuan Yuan, Jonathan C. Stroud, Tong Lu, Jia Deng

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3215 - 3223

We address the problem of temporal action localization in videos. We pose action localization as a structured prediction over arbitrary-length temporal windows, where each window is scored as the sum of frame-wise classification scores. Additionally, our model classifies the start, middle, and end of each action as separate components, allowing our system to explicitly model each action's temporal...

chapter

Scene Graph Generation by Iterative Message Passing

Danfei Xu, Yuke Zhu, Christopher B. Choy, Li Fei-Fei

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3097 - 3106

Understanding a visual scene goes beyond recognizing individual objects in isolation. Relationships between objects also constitute rich semantic information about the scene. In this work, we explicitly model the objects and their relationships using scene graphs, a visually-grounded graphical structure of an image. We propose a novel end-to-end model that generates such structured scene representation...

chapter

Learning Cross-Modal Embeddings for Cooking Recipes and Food Images

Amaia Salvador, Nicholas Hynes, Yusuf Aytar, Javier Marin, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3068 - 3076

In this paper, we introduce Recipe1M, a new large-scale, structured corpus of over 1m cooking recipes and 800k food images. As the largest publicly available collection of recipe data, Recipe1M affords the ability to train high-capacity models on aligned, multi-modal data. Using these data, we train a neural network to find a joint embedding of recipes and images that yields impressive results on...
