Search results

chapter

Deep Learning of Human Visual Sensitivity in Image Quality Assessment Framework

Jongyoo Kim, Sanghoon Lee

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1969 - 1977

Since human observers are the ultimate receivers of digital images, image quality metrics should be designed from a human-oriented perspective. Conventionally, a number of full-reference image quality assessment (FR-IQA) methods adopted various computational models of the human visual system (HVS) from psychological vision science research. In this paper, we propose a novel convolutional neural networks...

chapter

Semantic Autoencoder for Zero-Shot Learning

Elyor Kodirov, Tao Xiang, Shaogang Gong

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4447 - 4456

Existing zero-shot learning (ZSL) models typically learn a projection function from a feature space to a semantic embedding space (e.g. attribute space). However, such a projection function is only concerned with predicting the training seen class semantic representation (e.g. attribute prediction) or classification. When applied to test data, which in the context of ZSL contains different (unseen)...

chapter

Seeing into Darkness: Scotopic Visual Recognition

Bo Chen, Pietro Perona

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 7292 - 7301

Images are formed by counting how many photons traveling from a given set of directions hit an image sensor during a given time interval. When photons are few and far between, the concept of image breaks down and it is best to consider directly the flow of photons. Computer vision in this regime, which we call scotopic, is radically different from the classical image-based paradigm in that visual...

chapter

SST: Single-Stream Temporal Action Proposals

Shyamal Buch, Victor Escorcia, Chuanqi Shen, Bernard Ghanem, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6373 - 6382

Our paper presents a new approach for temporal detection of human actions in long, untrimmed video sequences. We introduce Single-Stream Temporal Action Proposals (SST), a new effective and efficient deep architecture for the generation of temporal action proposals. Our network can run continuously in a single stream over very long input video sequences, without the need to divide input into short...

chapter

Knowledge Acquisition for Visual Question Answering via Iterative Querying

Yuke Zhu, Joseph J. Lim, Li Fei-Fei

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6146 - 6155

Humans possess an extraordinary ability to learn new skills and new knowledge for problem solving. Such learning ability is also required by an automatic model to deal with arbitrary, open-ended questions in the visual world. We propose a neural-based approach to acquiring task-driven information for visual question answering (VQA). Our model proposes queries to actively acquire relevant information...

chapter

Hyper-Laplacian Regularized Unidirectional Low-Rank Tensor Recovery for Multispectral Image Denoising

Yi Chang, Luxin Yan, Sheng Zhong

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5901 - 5909

Recent low-rank based matrix/tensor recovery methods have been widely explored in multispectral image (MSI) denoising. These methods, however, ignore the difference of the intrinsic structure correlation along spatial sparsity, spectral correlation and non-local self-similarity mode. In this paper, we go further by giving a detailed analysis of the rank properties in both the matrix and tensor cases,...

chapter

Semantically Consistent Regularization for Zero-Shot Recognition

Pedro Morgado, Nuno Vasconcelos

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2037 - 2046

The role of semantics in zero-shot learning is considered. The effectiveness of previous approaches is analyzed according to the form of supervision provided. While some learn semantics independently, others only supervise the semantic subspace explained by training classes. Thus, the former is able to constrain the whole space but lacks the ability to model semantic correlations. The latter addresses...

chapter

A Joint Speaker-Listener-Reinforcer Model for Referring Expressions

Licheng Yu, Hao Tan, Mohit Bansal, Tamara L. Berg

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3521 - 3529

Referring expressions are natural language constructions used to identify particular objects within a scene. In this paper, we propose a unified framework for the tasks of referring expression comprehension and generation. Our model is composed of three modules: speaker, listener, and reinforcer. The speaker generates referring expressions, the listener comprehends referring expressions, and the reinforcer...

chapter

Attentional Push: A Deep Convolutional Network for Augmenting Image Salience with Shared Attention Modeling in Social Scenes

Siavash Gorji, James J. Clark

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3472 - 3481

We present a novel visual attention tracking technique based on Shared Attention modeling. By considering the viewer as a participant in the activity occurring in the scene, our model learns the loci of attention of the scene actors and uses them to augment image salience. We go beyond image salience: instead of only computing the power of image regions to pull attention, we also consider the strength...

chapter

A Dataset and Exploration of Models for Understanding Video Data through Fill-in-the-Blank Question-Answering

Tegan Maharaj, Nicolas Ballas, Anna Rohrbach, Aaron Courville, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 7359 - 7368

While deep convolutional neural networks frequently approach or exceed human-level performance in benchmark tasks involving static images, extending this success to moving images is not straightforward. Video understanding is of interest for many applications, including content recommendation, prediction, summarization, event/object detection, and understanding human visual perception. However, many...

chapter

Learning a Deep Embedding Model for Zero-Shot Learning

Li Zhang, Tao Xiang, Shaogang Gong

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3010 - 3019

Zero-shot learning (ZSL) models rely on learning a joint embedding space into which both textual/semantic descriptions of object classes and visual representations of object images can be projected for nearest neighbour search. Despite the success of deep neural networks that learn an end-to-end model between text and images in other vision problems such as image captioning, very few deep ZSL models exist...

chapter

Hidden Layers in Perceptual Learning

Gad Cohen, Daphna Weinshall

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5349 - 5357

Studies in visual perceptual learning investigate the way human performance improves with practice, in the context of relatively simple (and therefore more manageable) visual tasks. Building on the powerful tools currently available for the training of Convolutional Neural Networks (CNN), networks whose original architecture was inspired by the visual system, we revisited some of the open computational...

chapter

A Domain Based Approach to Social Relation Recognition

Qianru Sun, Bernt Schiele, Mario Fritz

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 435 - 444

Social relations are the foundation of human daily life. Developing techniques to analyze such relations from visual data bears great potential to build machines that better understand us and are capable of interacting with us at a social level. Previous investigations have remained partial due to the overwhelming diversity and complexity of the topic and consequently have only focused on a handful...

chapter

Deep Quantization: Encoding Convolutional Activations with Deep Generative Model

Zhaofan Qiu, Ting Yao, Tao Mei

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4085 - 4094

Deep convolutional neural networks (CNNs) have proven highly effective for visual recognition, where learning a universal representation from the activations of a convolutional layer is a fundamental problem. In this paper, we present Fisher Vector encoding with Variational Auto-Encoder (FV-VAE), a novel deep architecture that quantizes the local activations of a convolutional layer in a deep generative...

chapter

MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network

Zizhao Zhang, Yuanpu Xie, Fuyong Xing, Mason McGough, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3549 - 3557

The inability to interpret the model prediction in semantically and visually meaningful ways is a well-known shortcoming of most existing computer-aided diagnosis methods. In this paper, we propose MDNet to establish a direct multimodal mapping between medical images and diagnostic reports that can read images, generate diagnostic reports, retrieve images by symptom descriptions, and visualize attention,...

chapter

Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning

Jiasen Lu, Caiming Xiong, Devi Parikh, Richard Socher

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3242 - 3250

Attention-based neural encoder-decoder frameworks have been widely adopted for image captioning. Most methods force visual attention to be active for every generated word. However, the decoder likely requires little to no visual information from the image to predict non-visual words such as "the" and "of". Other words that may seem visual can often be predicted reliably just from the language model e...

chapter

What's in a Question: Using Visual Questions as a Form of Supervision

Siddha Ganju, Olga Russakovsky, Abhinav Gupta

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6422 - 6431

Collecting fully annotated image datasets is challenging and expensive. Many types of weak supervision have been explored: weak manual annotations, web search results, temporal continuity, ambient sound and others. We focus on one particular unexplored mode: visual questions that are asked about images. The key observation that inspires our work is that the question itself provides useful information...

chapter

Top-Down Visual Saliency Guided by Captions

Vasili Ramanishka, Abir Das, Jianming Zhang, Kate Saenko

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3135 - 3144

Neural image/video captioning models can generate accurate descriptions, but their internal process of mapping regions to words is a black box and therefore difficult to explain. Top-down neural saliency methods can find important regions given a high-level semantic task such as object classification, but cannot use a natural language sentence as the top-down input for the task. In this paper, we...

chapter

DeepPermNet: Visual Permutation Learning

Rodrigo Santa Cruz, Basura Fernando, Anoop Cherian, Stephen Gould

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6044 - 6052

We present a principled approach to uncover the structure of visual data by solving a novel deep learning task coined visual permutation learning. The goal of this task is to find the permutation that recovers the structure of data from shuffled versions of it. In the case of natural images, this task boils down to recovering the original image from patches shuffled by an unknown permutation matrix...

chapter

Improving Interpretability of Deep Neural Networks with Semantic Information

Yinpeng Dong, Hang Su, Jun Zhu, Bo Zhang

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 975 - 983

Interpretability of deep neural networks (DNNs) is essential since it enables users to understand the overall strengths and weaknesses of the models, conveys an understanding of how the models will behave in the future, and shows how to diagnose and correct potential problems. However, it is challenging to reason about what a DNN actually does due to its opaque or black-box nature. To address this issue,...

INFONA - science communication portal
