Search results

Items from 141 to 160 out of 2,083 results

chapter

Few-Shot Object Recognition from Machine-Labeled Web Images

Zhongwen Xu, Linchao Zhu, Yi Yang

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5358 - 5366

With the tremendous advances made by Convolutional Neural Networks (ConvNets) on object recognition, we can now easily obtain adequately reliable machine-labeled annotations from predictions by off-the-shelf ConvNets. In this work, we present an abstraction memory based framework for few-shot learning, building upon machine-labeled image annotations. Our method takes large-scale machine-annotated...

chapter

Dense Captioning with Joint Inference and Visual Context

Linjie Yang, Kevin Tang, Jianchao Yang, Li-Jia Li

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1978 - 1987

Dense captioning is a newly emerging computer vision topic for understanding images with dense language descriptions. The goal is to densely detect visual concepts (e.g., objects, object parts, and interactions between them) from images, labeling each with a short descriptive phrase. We identify two key challenges of dense captioning that need to be properly addressed when tackling the problem. First,...

chapter

Generative Hierarchical Learning of Sparse FRAME Models

Jianwen Xie, Yifei Xu, Erik Nijkamp, Ying Nian Wu, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1933 - 1941

This paper proposes a method for generative learning of hierarchical random field models. The resulting model, which we call the hierarchical sparse FRAME (Filters, Random field, And Maximum Entropy) model, is a generalization of the original sparse FRAME model by decomposing it into multiple parts that are allowed to shift their locations, scales and rotations, so that the resulting model becomes...

chapter

Deep Unsupervised Similarity Learning Using Partially Ordered Sets

Miguel A. Bautista, Artsiom Sanakoyeu, Bjorn Ommer

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1923 - 1932

Unsupervised learning of visual similarities is of paramount importance to computer vision, particularly due to lacking training data for fine-grained similarities. Deep learning of similarities is often based on relationships between pairs or triplets of samples. Many of these relations are unreliable and mutually contradicting, implying inconsistencies when trained without supervision information...

chapter

Weakly Supervised Affordance Detection

Johann Sawatzky, Abhilash Srikantha, Juergen Gall

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5197 - 5206

Localizing functional regions of objects or affordances is an important aspect of scene understanding and relevant for many robotics applications. In this work, we introduce a pixel-wise annotated affordance dataset of 3090 images containing 9916 object instances. Since parts of an object can have multiple affordances, we address this by a convolutional neural network for multilabel affordance segmentation...

chapter

Weakly Supervised Dense Video Captioning

Zhiqiang Shen, Jianguo Li, Zhou Su, Minjun Li, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5159 - 5167

This paper focuses on a novel and challenging vision task, dense video captioning, which aims to automatically describe a video clip with multiple informative and diverse caption sentences. The proposed method is trained without explicit annotation of fine-grained sentence to video region-sequence correspondence, but is only based on weak video-level sentence annotations. It differs from existing...

chapter

Joint Geometrical and Statistical Alignment for Visual Domain Adaptation

Jing Zhang, Wanqing Li, Philip Ogunbona

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5150 - 5158

This paper presents a novel unsupervised domain adaptation method for cross-domain visual recognition. We propose a unified framework that reduces the shift between domains both statistically and geometrically, referred to as Joint Geometrical and Statistical Alignment (JGSA). Specifically, we learn two coupled projections that project the source domain and target domain data into low-dimensional...

chapter

Zero-Shot Classification with Discriminative Semantic Representation Learning

Meng Ye, Yuhong Guo

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5103 - 5111

Zero-shot learning, a special case of unsupervised domain adaptation where the source and target domains have disjoint label spaces, has become increasingly popular in the computer vision community. In this paper, we propose a novel zero-shot learning method based on discriminative sparse non-negative matrix factorization. The proposed approach aims to identify a set of common high-level semantic...

chapter

Diversified Texture Synthesis with Feed-Forward Networks

Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 266 - 274

Recent progress in deep discriminative and generative modeling has shown promising results on texture synthesis. However, existing feed-forward based methods trade off generality for efficiency and suffer from many issues, such as shortage of generality (i.e., build one network per texture), lack of diversity (i.e., always produce visually identical output) and suboptimality (i.e., generate...

chapter

Semantic Autoencoder for Zero-Shot Learning

Elyor Kodirov, Tao Xiang, Shaogang Gong

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4447 - 4456

Existing zero-shot learning (ZSL) models typically learn a projection function from a feature space to a semantic embedding space (e.g. attribute space). However, such a projection function is only concerned with predicting the training seen class semantic representation (e.g. attribute prediction) or classification. When applied to test data, which in the context of ZSL contains different (unseen)...

chapter

Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks

Xiao Yang, Ersin Yumer, Paul Asente, Mike Kraley, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4342 - 4351

We present an end-to-end, multimodal, fully convolutional network for extracting semantic structures from document images. We consider document semantic structure extraction as a pixel-wise segmentation task, and propose a unified model that classifies pixels based not only on their visual appearance, as in the traditional page segmentation task, but also on the content of underlying text. Moreover,...

chapter

SST: Single-Stream Temporal Action Proposals

Shyamal Buch, Victor Escorcia, Chuanqi Shen, Bernard Ghanem, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6373 - 6382

Our paper presents a new approach for temporal detection of human actions in long, untrimmed video sequences. We introduce Single-Stream Temporal Action Proposals (SST), a new effective and efficient deep architecture for the generation of temporal action proposals. Our network can run continuously in a single stream over very long input video sequences, without the need to divide input into short...

chapter

Graph-Structured Representations for Visual Question Answering

Damien Teney, Lingqiao Liu, Anton van den Hengel

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3233 - 3241

This paper proposes to improve visual question answering (VQA) with structured representations of both scene contents and questions. A key challenge in VQA is the need for joint reasoning over the visual and text domains. The predominant CNN/LSTM-based approach to VQA is limited by monolithic vector representations that largely ignore structure in the scene and in the question. CNN feature vectors cannot...

chapter

Semantically Consistent Regularization for Zero-Shot Recognition

Pedro Morgado, Nuno Vasconcelos

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2037 - 2046

The role of semantics in zero-shot learning is considered. The effectiveness of previous approaches is analyzed according to the form of supervision provided. While some learn semantics independently, others only supervise the semantic subspace explained by training classes. Thus, the former is able to constrain the whole space but lacks the ability to model semantic correlations. The latter addresses...

chapter

Matrix Tri-Factorization with Manifold Regularizations for Zero-Shot Learning

Xing Xu, Fumin Shen, Yang Yang, Dongxiang Zhang, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2007 - 2016

Zero-shot learning (ZSL) aims to recognize objects of unseen classes with available training data from another set of seen classes. Existing solutions are focused on exploring knowledge transfer via an intermediate semantic embedding (e.g., attributes) shared between seen and unseen classes. In this paper, we propose a novel projection framework based on matrix tri-factorization with manifold regularizations...

chapter

Learning Spatial Regularization with Image-Level Supervisions for Multi-label Image Classification

Feng Zhu, Hongsheng Li, Wanli Ouyang, Nenghai Yu, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2027 - 2036

Multi-label image classification is a fundamental but challenging task in computer vision. Great progress has been achieved by exploiting semantic relations between labels in recent years. However, conventional approaches are unable to model the underlying spatial relations between labels in multi-label images, because spatial annotations of the labels are generally not provided. In this paper, we...

chapter

From Red Wine to Red Tomato: Composition with Context

Ishan Misra, Abhinav Gupta, Martial Hebert

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1160 - 1169

Compositionality and contextuality are key building blocks of intelligence. They allow us to compose known concepts to generate new and complex ones. However, traditional learning methods do not model both these properties and require copious amounts of labeled data to learn new concepts. A large fraction of existing techniques, e.g., using late fusion, compose concepts but fail to model contextuality...

chapter

Deep Reinforcement Learning-Based Image Captioning with Embedding Reward

Zhou Ren, Xiaoyu Wang, Ning Zhang, Xutao Lv, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1151 - 1159

Image captioning is a challenging problem owing to the complexity in understanding the image content and diverse ways of describing it in natural language. Recent advances in deep neural networks have substantially improved the performance of this task. Most state-of-the-art approaches follow an encoder-decoder framework, which generates captions using a sequential recurrent prediction model. However,...

chapter

Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension

Aniruddha Kembhavi, Minjoon Seo, Dustin Schwenk, Jonghyun Choi, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5376 - 5384

We introduce the task of Multi-Modal Machine Comprehension (M3C), which aims at answering multimodal questions given a context of text, diagrams and images. We present the Textbook Question Answering (TQA) dataset that includes 1,076 lessons and 26,260 multi-modal questions, taken from middle school science curricula. Our analysis shows that a significant portion of questions require complex parsing...

chapter

Webly Supervised Semantic Segmentation

Bin Jin, Maria V. Ortiz Segovia, Sabine Susstrunk

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1705 - 1714

We propose a weakly supervised semantic segmentation algorithm that uses image tags for supervision. We apply the tags in queries to collect three sets of web images, which encode the clean foregrounds, the common backgrounds, and realistic scenes of the classes. We introduce a novel three-stage training pipeline to progressively learn semantic segmentation models. We first train and refine a class-specific...

Keywords: VISUALIZATION
Publication type: book
