Search results

chapter

Cognitive exploration of regions through analyzing geo-tagged social media data

Yunzhe Wang, George Baciu, Chenhui Li

2017 IEEE 16th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC) > 59 - 64

Social media has now become a pervasive global communication channel. Many applications and platforms have become available for users to post messages, follow friends and share experiences. Due to the high frequency with which users update their states, a large amount of data is being generated around the world every second. By analyzing this data, valuable patterns can be extracted such as the distribution...

chapter

Fine-Grained Image Classification via Combining Vision and Language

Xiangteng He, Yuxin Peng

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 7332 - 7340

Fine-grained image classification, which aims to recognize hundreds of sub-categories belonging to the same basic-level category, is a challenging task due to the large intra-class variance and small inter-class variance. Most existing fine-grained image classification methods generally learn part detection models to obtain the semantic parts for better classification accuracy. Despite achieving promising...

chapter

Network Dissection: Quantifying Interpretability of Deep Visual Representations

David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3319 - 3327

We propose a general framework called Network Dissection for quantifying the interpretability of latent representations of CNNs by evaluating the alignment between individual hidden units and a set of semantic concepts. Given any CNN model, the proposed method draws on a data set of concepts to score the semantics of hidden units at each intermediate convolutional layer. The units with semantics are...

chapter

Learning Multifunctional Binary Codes for Both Category and Attribute Oriented Retrieval Tasks

Haomiao Liu, Ruiping Wang, Shiguang Shan, Xilin Chen

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6259 - 6268

In this paper we propose a unified framework to address multiple realistic image retrieval tasks concerning both category and attributes. Considering the scale of modern datasets, hashing is favorable for its low complexity. However, most existing hashing methods are designed to preserve a single kind of similarity and are thus incapable of dealing with the different tasks simultaneously. To overcome this...

chapter

From Zero-Shot Learning to Conventional Supervised Classification: Unseen Visual Data Synthesis

Yang Long, Li Liu, Ling Shao, Fumin Shen, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6165 - 6174

Robust object recognition systems usually rely on powerful feature extraction mechanisms from a large number of real images. However, in many realistic applications, collecting sufficient images for ever-growing new classes is unattainable. In this paper, we propose a new Zero-shot learning (ZSL) framework that can synthesise visual features for unseen classes without acquiring real images. Using...

chapter

Learning a Deep Embedding Model for Zero-Shot Learning

Li Zhang, Tao Xiang, Shaogang Gong

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3010 - 3019

Zero-shot learning (ZSL) models rely on learning a joint embedding space into which both the textual/semantic descriptions of object classes and the visual representations of object images can be projected for nearest neighbour search. Despite the success of deep neural networks that learn an end-to-end model between text and images in other vision problems such as image captioning, very few deep ZSL models exist...

chapter

Low-Rank Embedded Ensemble Semantic Dictionary for Zero-Shot Learning

Zhengming Ding, Ming Shao, Yun Fu

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6005 - 6013

Zero-shot learning for visual recognition has received much interest in recent years. However, the semantic gap between visual features and their underlying semantics is still the biggest obstacle in zero-shot learning. To overcome this hurdle, we propose an effective Low-rank Embedded Semantic Dictionary learning (LESD) through an ensemble strategy. Specifically, we formulate a novel framework...

chapter

Multi-context Attention for Human Pose Estimation

Xiao Chu, Wei Yang, Wanli Ouyang, Cheng Ma, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5669 - 5678

In this paper, we propose to incorporate convolutional neural networks with a multi-context attention mechanism into an end-to-end framework for human pose estimation. We adopt stacked hourglass networks to generate attention maps from features at multiple resolutions with various semantics. The Conditional Random Field (CRF) is utilized to model the correlations among neighboring regions in the attention...

chapter

Captioning Images with Diverse Objects

Subhashini Venugopalan, Lisa Anne Hendricks, Marcus Rohrbach, Raymond Mooney, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1170 - 1178

Recent captioning models are limited in their ability to scale and describe concepts unseen in paired image-text corpora. We propose the Novel Object Captioner (NOC), a deep visual semantic captioning model that can describe a large number of object categories not present in existing image-caption datasets. Our model takes advantage of external sources – labeled images from object recognition...

chapter

Correlational Gaussian Processes for Cross-Domain Visual Recognition

Chengjiang Long, Gang Hua

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4932 - 4940

We present a probabilistic model that captures higher order co-occurrence statistics for joint visual recognition in a collection of images and across multiple domains. More importantly, we predict the structured output across multiple domains by correlating outputs from the multi-class Gaussian process classifiers in each individual domain. A set of correlational tensors is adopted to model the...

chapter

A Domain Based Approach to Social Relation Recognition

Qianru Sun, Bernt Schiele, Mario Fritz

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 435 - 444

Social relations are the foundation of human daily life. Developing techniques to analyze such relations from visual data bears great potential to build machines that better understand us and are capable of interacting with us at a social level. Previous investigations have remained partial due to the overwhelming diversity and complexity of the topic and consequently have only focused on a handful...

chapter

Multi-level Attention Networks for Visual Question Answering

Dongfei Yu, Jianlong Fu, Tao Mei, Yong Rui

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4187 - 4195

Inspired by the recent success of text-based question answering, visual question answering (VQA) is proposed to automatically answer natural language questions with reference to a given image. Compared with text-based QA, VQA is more challenging because the reasoning process in the visual domain needs both effective semantic embedding and fine-grained visual understanding. Existing approaches predominantly...

chapter

Semantic Regularisation for Recurrent Image Annotation

Feng Liu, Tao Xiang, Timothy M. Hospedales, Wankou Yang, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4160 - 4168

The CNN-RNN design pattern is increasingly widely applied in a variety of image annotation tasks including multi-label classification and captioning. Existing models use the weakly semantic CNN hidden layer or its transform as the image embedding that provides the interface between the CNN and RNN. This leaves the RNN overstretched with two jobs: predicting the visual concepts and modelling their...

chapter

Combining Bottom-Up, Top-Down, and Smoothness Cues for Weakly Supervised Image Segmentation

Anirban Roy, Sinisa Todorovic

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 7282 - 7291

This paper addresses the problem of weakly supervised semantic image segmentation. Our goal is to label every pixel in a new image, given only image-level object labels associated with training images. Our problem statement differs from common semantic segmentation, where pixel-wise annotations are typically assumed available in training. We specify a novel deep architecture which fuses three distinct...

chapter

SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks for Image Captioning

Long Chen, Hanwang Zhang, Jun Xiao, Liqiang Nie, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6298 - 6306

Visual attention has been successfully applied in structural prediction tasks such as visual captioning and question answering. Existing visual attention models are generally spatial, i.e., the attention is modeled as spatial probabilities that re-weight the last conv-layer feature map of a CNN encoding an input image. However, we argue that such spatial attention does not necessarily conform to the...

chapter

Unsupervised Semantic Scene Labeling for Streaming Data

Maggie Wigness, John G. Rogers

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5910 - 5919

We introduce an unsupervised semantic scene labeling approach that continuously learns and adapts semantic models discovered within a data stream. While closely related to unsupervised video segmentation, our algorithm is not designed to be an early video processing strategy that produces coherent over-segmentations, but instead, to directly learn higher-level semantic concepts. This is achieved with...

chapter

AnchorNet: A Weakly Supervised Network to Learn Geometry-Sensitive Features for Semantic Matching

David Novotny, Diane Larlus, Andrea Vedaldi

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2867 - 2876

Despite significant progress of deep learning in recent years, state-of-the-art semantic matching methods still rely on legacy features such as SIFT or HoG. We argue that the strong invariance properties that are key to the success of recent deep architectures on the classification task make them unfit for dense correspondence tasks, unless a large amount of supervision is used. In this work, we propose...

chapter

CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Li Fei-Fei, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1988 - 1997

When building artificial intelligence systems that can reason and answer questions about visual data, we need diagnostic tests to analyze our progress and discover shortcomings. Existing benchmarks for visual question answering can help, but have strong biases that models can exploit to correctly answer questions without reasoning. They also conflate multiple sources of error, making it hard to pinpoint...

chapter

Spatial-Semantic Image Search by Visual Feature Synthesis

Long Mai, Hailin Jin, Zhe Lin, Chen Fang, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1121 - 1130

The performance of image retrieval has been improved tremendously in recent years through the use of deep feature representations. Most existing methods, however, aim to retrieve images that are visually similar or semantically relevant to the query, irrespective of spatial configuration. In this paper, we develop a spatial-semantic image search technology that enables users to search for images with...

chapter

Semantic Compositional Networks for Visual Captioning

Zhe Gan, Chuang Gan, Xiaodong He, Yunchen Pu, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1141 - 1150

A Semantic Compositional Network (SCN) is developed for image captioning, in which semantic concepts (i.e., tags) are detected from the image, and the probability of each tag is used to compose the parameters in a long short-term memory (LSTM) network. The SCN extends each weight matrix of the LSTM to an ensemble of tag-dependent weight matrices. The degree to which each member of the ensemble is...

INFONA - science communication portal
