2017 IEEE International Conference on Computer Vision (ICCV)

chapter

No Fuss Distance Metric Learning Using Proxies

Yair Movshovitz-Attias, Alexander Toshev, Thomas K. Leung, Sergey Ioffe, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 360 - 368

We address the problem of distance metric learning (DML), defined as learning a distance consistent with a notion of semantic similarity. Traditionally, for this problem supervision is expressed in the form of sets of points that follow an ordinal relationship – an anchor point x is similar to a set of positive points Y , and dissimilar to a set of negative points Z, and a loss defined over these...

chapter

DualNet: Learn Complementary Features for Image Recognition

Saihui Hou, Xu Liu, Zilei Wang

2017 IEEE International Conference on Computer Vision (ICCV) > 502 - 510

In this work we propose a novel framework named Dual-Net aiming at learning more accurate representation for image recognition. Here two parallel neural networks are coordinated to learn complementary features and thus a wider network is constructed. Specifically, we logically divide an end-to-end deep convolutional neural network into two functional parts, i.e., feature extractor and image classifier...

chapter

VegFru: A Domain-Specific Dataset for Fine-Grained Visual Categorization

Saihui Hou, Yushan Feng, Zilei Wang

2017 IEEE International Conference on Computer Vision (ICCV) > 541 - 549

In this paper, we propose a novel domain-specific dataset named VegFru for fine-grained visual categorization (FGVC). While the existing datasets for FGVC are mainly focused on animal breeds or man-made objects with limited labelled data, VegFru is a larger dataset consisting of vegetables and fruits which are closely associated with the daily life of everyone. Aiming at domestic cooking and food...

chapter

Unsupervised Action Discovery and Localization in Videos

Khurram Soomro, Mubarak Shah

2017 IEEE International Conference on Computer Vision (ICCV) > 696 - 705

This paper is the first to address the problem of unsupervised action localization in videos. Given unlabeled data without bounding box annotations, we propose a novel approach that: 1) Discovers action class labels and 2) Spatio-temporally localizes actions in videos. It begins by computing local video features to apply spectral clustering on a set of unlabeled training videos. For each cluster of...

chapter

Open Set Domain Adaptation

Pau Panareda Busto, Juergen Gall

2017 IEEE International Conference on Computer Vision (ICCV) > 754 - 763

When the training and the test data belong to different domains, the accuracy of an object classifier is significantly reduced. Therefore, several algorithms have been proposed in the last years to diminish the so-called domain shift between datasets. However, all available evaluation protocols for domain adaptation describe a closed set recognition task, where both domains, namely source and target,...

chapter

How Far are We from Solving the 2D & 3D Face Alignment Problem? (and a Dataset of 230,000 3D Facial Landmarks)

Adrian Bulat, Georgios Tzimiropoulos

2017 IEEE International Conference on Computer Vision (ICCV) > 1021 - 1030

This paper investigates how far a very deep neural network is from attaining close to saturating performance on existing 2D and 3D face alignment datasets. To this end, we make the following 5 contributions: (a) we construct, for the first time, a very strong baseline by combining a state-of-the-art architecture for landmark localization with a state-of-the-art residual block, train it on a very large...

chapter

RankIQA: Learning from Rankings for No-Reference Image Quality Assessment

Xialei Liu, Joost van de Weijer, Andrew D. Bagdanov

2017 IEEE International Conference on Computer Vision (ICCV) > 1040 - 1049

We propose a no-reference image quality assessment (NR-IQA) approach that learns from rankings (RankIQA). To address the problem of limited IQA dataset size, we train a Siamese Network to rank images in terms of image quality by using synthetically generated distortions for which relative image quality is known. These ranked image sets can be automatically generated without laborious human labeling...

chapter

Learning Feature Pyramids for Human Pose Estimation

Wei Yang, Shuang Li, Wanli Ouyang, Hongsheng Li, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 1290 - 1299

Articulated human pose estimation is a fundamental yet challenging task in computer vision. The difficulty is particularly pronounced in scale variations of human body parts when camera view changes or severe foreshortening happens. Although pyramid methods are widely used to handle scale changes at inference time, learning feature pyramids in deep convolutional neural networks (DCNNs) is still not...

chapter

Fine-Grained Recognition in the Wild: A Multi-task Domain Adaptation Approach

Timnit Gebru, Judy Hoffman, Li Fei-Fei

2017 IEEE International Conference on Computer Vision (ICCV) > 1358 - 1367

While fine-grained object recognition is an important problem in computer vision, current models are unlikely to accurately classify objects in the wild. These fully supervised models need additional annotated images to classify objects in every new scenario, a task that is infeasible. However, sources such as e-commerce websites and field guides provide annotated images for many classes. In this...

chapter

Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes

Yang Zhang, Philip David, Boqing Gong

2017 IEEE International Conference on Computer Vision (ICCV) > 2039 - 2049

During the last half decade, convolutional neural networks (CNNs) have triumphed over semantic segmentation, which is a core task of various emerging industrial applications such as autonomous driving and medical imaging. However, to train CNNs requires a huge amount of data, which is difficult to collect and laborious to annotate. Recent advances in computer graphics make it possible to train CNN...

chapter

Scale-Adaptive Convolutions for Scene Parsing

Rui Zhang, Sheng Tang, Yongdong Zhang, Jintao Li, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 2050 - 2058

Many existing scene parsing methods adopt Convolutional Neural Networks with fixed-size receptive fields, which frequently result in inconsistent predictions of large objects and invisibility of small objects. To tackle this issue, we propose a scale-adaptive convolution to acquire flexible-size receptive fields during scene parsing. Through adding a new scale regression layer, we can dynamically infer...

chapter

Factorized Bilinear Models for Image Recognition

Yanghao Li, Naiyan Wang, Jiaying Liu, Xiaodi Hou

2017 IEEE International Conference on Computer Vision (ICCV) > 2098 - 2106

Although Deep Convolutional Neural Networks (CNNs) have liberated their power in various computer vision tasks, the most important components of CNN, convolutional layers and fully connected layers, are still limited to linear transformations. In this paper, we propose a novel Factorized Bilinear (FB) layer to model the pairwise feature interactions by considering the quadratic terms in the transformations...

chapter

RMPE: Regional Multi-person Pose Estimation

Hao-Shu Fang, Shuqin Xie, Yu-Wing Tai, Cewu Lu

2017 IEEE International Conference on Computer Vision (ICCV) > 2353 - 2362

Multi-person pose estimation in the wild is challenging. Although state-of-the-art human detectors have demonstrated good performance, small errors in localization and recognition are inevitable. These errors can cause failures for a single-person pose estimator (SPPE), especially for methods that solely depend on human detection results. In this paper, we propose a novel regional multi-person pose...

chapter

Deep Metric Learning with Angular Loss

Jian Wang, Feng Zhou, Shilei Wen, Xiao Liu, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 2612 - 2620

The modern image search system requires semantic understanding of image, and a key yet under-addressed problem is to learn a good metric for measuring the similarity between images. While deep metric learning has yielded impressive performance gains by extracting high level abstractions from image data, a proper objective loss function becomes the central issue to boost the performance. In this paper,...

chapter

Sampling Matters in Deep Embedding Learning

R. Manmatha, Chao-Yuan Wu, Alexander J. Smola, Philipp Krahenbuhl

2017 IEEE International Conference on Computer Vision (ICCV) > 2859 - 2867

Deep embeddings answer one simple question: How similar are two images? Learning these embeddings is the bedrock of verification, zero-shot learning, and visual search. The most prominent approaches optimize a deep convolutional network with a suitable loss function, such as contrastive loss or triplet loss. While a rich line of work focuses solely on the loss functions, we show in this paper that...

chapter

DualGAN: Unsupervised Dual Learning for Image-to-Image Translation

Zili Yi, Hao Zhang, Ping Tan, Minglun Gong

2017 IEEE International Conference on Computer Vision (ICCV) > 2868 - 2876

Conditional Generative Adversarial Networks (GANs) for cross-domain image-to-image translation have made much progress recently [7, 8, 21, 12, 4, 18]. Depending on the task complexity, thousands to millions of labeled image pairs are needed to train a conditional GAN. However, human labeling is expensive, even impractical, and large quantities of data may not always be available. Inspired by dual...

chapter

Unmasking the Abnormal Events in Video

Radu Tudor Ionescu, Sorina Smeureanu, Bogdan Alexe, Marius Popescu

2017 IEEE International Conference on Computer Vision (ICCV) > 2914 - 2922

We propose a novel framework for abnormal event detection in video that requires no training sequences. Our framework is based on unmasking, a technique previously used for authorship verification in text documents, which we adapt to our task. We iteratively train a binary classifier to distinguish between two consecutive video sequences while removing at each step the most discriminant features....

chapter

Focal Loss for Dense Object Detection

Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 2999 - 3007

The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In...

chapter

Curriculum Dropout

Pietro Morerio, Jacopo Cavazza, Riccardo Volpi, Rene Vidal, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 3564 - 3572

Dropout is a very effective way of regularizing neural networks. Stochastically “dropping out” units with a certain probability discourages over-specific co-adaptations of feature detectors, preventing overfitting and improving network generalization. Besides, Dropout can be interpreted as an approximate model aggregation technique, where an exponential number of smaller networks are averaged in order...

chapter

Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources

Adrian Bulat, Georgios Tzimiropoulos

2017 IEEE International Conference on Computer Vision (ICCV) > 3726 - 3734

Our goal is to design architectures that retain the groundbreaking performance of CNNs for landmark localization and at the same time are lightweight, compact and suitable for applications with limited computational resources. To this end, we make the following contributions: (a) we are the first to study the effect of neural network binarization on localization tasks, namely human pose estimation...
