2017 IEEE International Conference on Computer Vision (ICCV)

chapter

High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference

Xiaoguang Han, Zhen Li, Haibin Huang, Evangelos Kalogerakis, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 85 - 93

We propose a data-driven method for recovering missing parts of 3D shapes. Our method is based on a new deep learning architecture consisting of two sub-networks: a global structure inference network and a local geometry refinement network. The global structure inference network incorporates a long short-term memorized context fusion module (LSTM-CF) that infers the global structure of the shape based...

chapter

S^3FD: Single Shot Scale-Invariant Face Detector

Shifeng Zhang, Xiangyu Zhu, Zhen Lei, Hailin Shi, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 192 - 201

This paper presents a real-time face detector, named Single Shot Scale-invariant Face Detector (S^3FD), which performs superiorly on various scales of faces with a single deep neural network, especially for small faces. Specifically, we try to solve the common problem that anchor-based detectors deteriorate dramatically as the objects become smaller. We make contributions in the following three aspects:...

chapter

Amulet: Aggregating Multi-level Convolutional Features for Salient Object Detection

Pingping Zhang, Dong Wang, Huchuan Lu, Hongyu Wang, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 202 - 211

Fully convolutional neural networks (FCNs) have shown outstanding performance in many dense labeling problems. One key pillar of these successes is mining relevant information from features in convolutional layers. However, how to better aggregate multi-level convolutional feature maps for salient object detection is underexplored. In this work, we present Amulet, a generic aggregating multi-level...

chapter

Learning Uncertain Convolutional Features for Accurate Saliency Detection

Pingping Zhang, Dong Wang, Huchuan Lu, Hongyu Wang, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 212 - 221

Deep convolutional neural networks (CNNs) have delivered superior performance in many computer vision tasks. In this paper, we propose a novel deep fully convolutional network model for accurate salient object detection. The key contribution of this work is to learn deep uncertain convolutional features (UCF), which encourage the robustness and accuracy of saliency detection. We achieve this via introducing...

chapter

Encouraging LSTMs to Anticipate Actions Very Early

Mohammad Sadegh Aliakbarian, Fatemeh Sadat Saleh, Mathieu Salzmann, Basura Fernando, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 280 - 289

In contrast to the widely studied problem of recognizing an action given a complete sequence, action anticipation aims to identify the action from only partially available videos. As such, it is therefore key to the success of computer vision applications requiring to react as early as possible, such as autonomous navigation. In this paper, we propose a new action anticipation method that achieves...

chapter

Tracking the Untrackable: Learning to Track Multiple Cues with Long-Term Dependencies

Amir Sadeghian, Alexandre Alahi, Silvio Savarese

2017 IEEE International Conference on Computer Vision (ICCV) > 300 - 311

The majority of existing solutions to the Multi-Target Tracking (MTT) problem do not combine cues over a long period of time in a coherent fashion. In this paper, we present an online method that encodes long-term temporal dependencies across multiple cues. One key challenge of tracking methods is to accurately track occluded targets or those which share similar appearance properties with surrounding...

chapter

A Revisit of Sparse Coding Based Anomaly Detection in Stacked RNN Framework

Weixin Luo, Wen Liu, Shenghua Gao

2017 IEEE International Conference on Computer Vision (ICCV) > 341 - 349

Motivated by the capability of sparse coding based anomaly detection, we propose a Temporally-coherent Sparse Coding (TSC) where we enforce similar neighbouring frames be encoded with similar reconstruction coefficients. Then we map the TSC with a special type of stacked Recurrent Neural Network (sRNN). By taking advantage of sRNN in learning all parameters simultaneously, the nontrivial hyper-parameter...

chapter

HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis

Xihui Liu, Haiyu Zhao, Maoqing Tian, Lu Sheng, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 350 - 359

Pedestrian analysis plays a vital role in intelligent video surveillance and is a key component for security-centric computer vision systems. Despite that the convolutional neural networks are remarkable in learning discriminative features from images, the learning of comprehensive features of pedestrians for fine-grained tasks remains an open problem. In this study, we propose a new attention-based...

chapter

Orientation Invariant Feature Embedding and Spatial Temporal Regularization for Vehicle Re-identification

Zhongdao Wang, Luming Tang, Xihui Liu, Zhuliang Yao, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 379 - 387

In this paper, we tackle the vehicle Re-identification (ReID) problem which is of great importance in urban surveillance and can be used for multiple applications. In our vehicle ReID framework, an orientation invariant feature embedding module and a spatial-temporal regularization module are proposed. With orientation invariant feature embedding, local region features of different orientations can...

chapter

Flow-Guided Feature Aggregation for Video Object Detection

Xizhou Zhu, Yujie Wang, Jifeng Dai, Lu Yuan, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 408 - 417

Extending state-of-the-art object detectors from image to video is challenging. The accuracy of detection suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, rare poses, etc. Existing work attempts to exploit temporal information on box level, but such methods are not trained end-to-end. We present flow-guided feature aggregation, an accurate and end-to-end learning...

chapter

DeNet: Scalable Real-Time Object Detection with Directed Sparse Sampling

Lachlan Tychsen-Smith, Lars Petersson

2017 IEEE International Conference on Computer Vision (ICCV) > 428 - 436

We define the object detection from imagery problem as estimating a very large but extremely sparse bounding box dependent probability distribution. Subsequently we identify a sparse distribution estimation scheme, Directed Sparse Sampling, and employ it in a single end-to-end CNN based detection model. This methodology extends and formalizes previous state-of-the-art detection models with an additional...

chapter

Multi-label Image Recognition by Recurrently Discovering Attentional Regions

Zhouxia Wang, Tianshui Chen, Guanbin Li, Ruijia Xu, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 464 - 472

This paper proposes a novel deep architecture to address multi-label image recognition, a fundamental and practical task towards general visual understanding. Current solutions for this task usually rely on an extra step of extracting hypothesis regions (i.e., region proposals), resulting in redundant computation and sub-optimal performance. In this work, we achieve the interpretable and contextualized...

chapter

DualNet: Learn Complementary Features for Image Recognition

Saihui Hou, Xu Liu, Zilei Wang

2017 IEEE International Conference on Computer Vision (ICCV) > 502 - 510

In this work we propose a novel framework named DualNet aiming at learning more accurate representation for image recognition. Here two parallel neural networks are coordinated to learn complementary features and thus a wider network is constructed. Specifically, we logically divide an end-to-end deep convolutional neural network into two functional parts, i.e., feature extractor and image classifier...

chapter

Recurrent Scale Approximation for Object Detection in CNN

Yu Liu, Hongyang Li, Junjie Yan, Fangyin Wei, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 571 - 579

Since convolutional neural network (CNN) lacks an inherent mechanism to handle large scale variations, we always need to compute feature maps multiple times for multiscale object detection, which has the bottleneck of computational cost in practice. To address this, we devise a recurrent scale approximation (RSA) to compute feature map once only, and only through this map can we approximate the rest...

chapter

Embedding 3D Geometric Features for Rigid Object Part Segmentation

Yafei Song, Xiaowu Chen, Jia Li, Qinping Zhao

2017 IEEE International Conference on Computer Vision (ICCV) > 580 - 588

Object part segmentation is a challenging and fundamental problem in computer vision. Its difficulties may be caused by the varying viewpoints, poses, and topological structures, which can be attributed to an essential reason, i.e., a specific object is a 3D model rather than a 2D figure. Therefore, we conjecture that not only 2D appearance features but also 3D geometric features could be helpful...

chapter

Towards Context-Aware Interaction Recognition for Visual Relationship Detection

Bohan Zhuang, Lingqiao Liu, Chunhua Shen, Ian Reid

2017 IEEE International Conference on Computer Vision (ICCV) > 589 - 598

Recognizing how objects interact with each other is a crucial task in visual recognition. If we define the context of the interaction to be the objects involved, then most current methods can be categorized as either: (i) training a single classifier on the combination of the interaction and its context; or (ii) aiming to recognize the interaction independently of its explicit context. Both methods...

chapter

Look, Listen and Learn

Relja Arandjelovic, Andrew Zisserman

2017 IEEE International Conference on Computer Vision (ICCV) > 609 - 617

We consider the question: what can be learnt by looking at and listening to a large number of unlabelled videos? There is a valuable, but so far untapped, source of information contained in the video itself – the correspondence between the visual and the audio streams, and we introduce a novel “Audio-Visual Correspondence” learning task that makes use of this. Training visual and audio networks from...

chapter

Image-Based Localization Using LSTMs for Structured Feature Correlation

F. Walch, C. Hazirbas, L. Leal-Taixe, T. Sattler, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 627 - 637

In this work we propose a new CNN+LSTM architecture for camera pose regression for indoor and outdoor scenes. CNNs allow us to learn suitable feature representations for localization that are robust against motion blur and illumination changes. We make use of LSTM units on the CNN output, which play the role of a structured dimensionality reduction on the feature vector, leading to drastic improvements...

chapter

Unsupervised Representation Learning by Sorting Sequences

Hsin-Ying Lee, Jia-Bin Huang, Maneesh Singh, Ming-Hsuan Yang

2017 IEEE International Conference on Computer Vision (ICCV) > 667 - 676

We present an unsupervised representation learning approach using videos without semantic labels. We leverage the temporal coherence as a supervisory signal by formulating representation learning as a sequence sorting task. We take temporally shuffled frames (i.e., in non-chronological order) as inputs and train a convolutional neural network to sort the shuffled sequences. Similar to comparison-based...

chapter

Unsupervised Action Discovery and Localization in Videos

Khurram Soomro, Mubarak Shah

2017 IEEE International Conference on Computer Vision (ICCV) > 696 - 705

This paper is the first to address the problem of unsupervised action localization in videos. Given unlabeled data without bounding box annotations, we propose a novel approach that: 1) Discovers action class labels and 2) Spatio-temporally localizes actions in videos. It begins by computing local video features to apply spectral clustering on a set of unlabeled training videos. For each cluster of...
