Search results

Items from 121 to 140 out of 1,266 results

chapter

End-to-End Concept Word Detection for Video Captioning, Retrieval, and Question Answering

Youngjae Yu, Hyungjin Ko, Jongwook Choi, Gunhee Kim

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3261 - 3269

We propose a high-level concept word detector that can be integrated with any video-to-language models. It takes a video as input and generates a list of concept words as useful semantic priors for language generation models. The proposed word detector has two important properties. First, it does not require any external knowledge sources for training. Second, the proposed word detector is trainable...

chapter

Learning Cross-Modal Embeddings for Cooking Recipes and Food Images

Amaia Salvador, Nicholas Hynes, Yusuf Aytar, Javier Marin, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3068 - 3076

In this paper, we introduce Recipe1M, a new large-scale, structured corpus of over 1m cooking recipes and 800k food images. As the largest publicly available collection of recipe data, Recipe1M affords the ability to train high-capacity models on aligned, multi-modal data. Using these data, we train a neural network to find a joint embedding of recipes and images that yields impressive results on...

chapter

Learning Random-Walk Label Propagation for Weakly-Supervised Semantic Segmentation

Paul Vernaza, Manmohan Chandraker

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2953 - 2961

Large-scale training for semantic segmentation is challenging due to the expense of obtaining training data for this task relative to other vision tasks. We propose a novel training approach to address this difficulty. Given cheaply-obtained sparse image labelings, we propagate the sparse labels to produce guessed dense labelings. A standard CNN-based segmentation network is trained to mimic these...

chapter

Deep Image Harmonization

Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2799 - 2807

Compositing is one of the most common operations in photo editing. To generate realistic composites, the appearances of foreground and background need to be adjusted to make them compatible. Previous approaches to harmonize composites have focused on learning statistical relationships between hand-crafted appearance features of the foreground and background, which is unreliable especially when the...

chapter

Self-Supervised Learning of Visual Features through Embedding Images into Text Topic Spaces

Lluis Gomez, Yash Patel, Marcal Rusinol, Dimosthenis Karatzas, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2017 - 2026

End-to-end training from scratch of current deep architectures for new computer vision problems would require ImageNet-scale datasets, and this is not always possible. In this paper we present a method that is able to take advantage of freely available multi-modal content to train computer vision algorithms without human supervision. We put forward the idea of performing self-supervised learning of...

chapter

Zero Shot Learning via Multi-scale Manifold Regularization

Shay Deutsch, Soheil Kolouri, Kyungnam Kim, Yuri Owechko, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5292 - 5299

We address zero-shot learning using a new manifold alignment framework based on a localized multi-scale transform on graphs. Our inference approach includes a smoothness criterion for a function mapping nodes on a graph (visual representation) onto a linear space (semantic representation), which we optimize using multi-scale graph wavelets. The robustness of the ensuing scheme allows us to operate...

chapter

Dense Captioning with Joint Inference and Visual Context

Linjie Yang, Kevin Tang, Jianchao Yang, Li-Jia Li

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1978 - 1987

Dense captioning is a newly emerging computer vision topic for understanding images with dense language descriptions. The goal is to densely detect visual concepts (e.g., objects, object parts, and interactions between them) from images, labeling each with a short descriptive phrase. We identify two key challenges of dense captioning that need to be properly addressed when tackling the problem. First,...

chapter

Weakly Supervised Affordance Detection

Johann Sawatzky, Abhilash Srikantha, Juergen Gall

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5197 - 5206

Localizing functional regions of objects or affordances is an important aspect of scene understanding and relevant for many robotics applications. In this work, we introduce a pixel-wise annotated affordance dataset of 3090 images containing 9916 object instances. Since parts of an object can have multiple affordances, we address this by a convolutional neural network for multilabel affordance segmentation...

chapter

Semantic Segmentation via Structured Patch Prediction, Context CRF and Guidance CRF

Falong Shen, Rui Gan, Shuicheng Yan, Gang Zeng

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5178 - 5186

This paper describes a fast and accurate semantic image segmentation approach that encodes not only segmentation-specified features but also high-order context compatibilities and boundary guidance constraints. We introduce a structured patch prediction technique to make a trade-off between classification discriminability and boundary sensibility for features. Both label and feature contexts are embedded...

chapter

RefineNet: Multi-path Refinement Networks for High-Resolution Semantic Segmentation

Guosheng Lin, Anton Milan, Chunhua Shen, Ian Reid

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5168 - 5177

Recently, very deep convolutional neural networks (CNNs) have shown outstanding performance in object recognition and have also been the first choice for dense classification problems such as semantic segmentation. However, repeated subsampling operations like pooling or convolution striding in deep CNNs lead to a significant decrease in the initial image resolution. Here, we present RefineNet, a...

chapter

Zero-Shot Classification with Discriminative Semantic Representation Learning

Meng Ye, Yuhong Guo

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5103 - 5111

Zero-shot learning, a special case of unsupervised domain adaptation where the source and target domains have disjoint label spaces, has become increasingly popular in the computer vision community. In this paper, we propose a novel zero-shot learning method based on discriminative sparse non-negative matrix factorization. The proposed approach aims to identify a set of common high-level semantic...

chapter

Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks

Yinda Zhang, Shuran Song, Ersin Yumer, Manolis Savva, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5057 - 5065

Indoor scene understanding is central to applications such as robot navigation and human companion assistance. Over the last years, data-driven deep neural networks have outperformed many traditional approaches thanks to their representation learning capabilities. One of the bottlenecks in training for better representations is the amount of available per-pixel ground truth data that is required for...

chapter

Deep Multitask Architecture for Integrated 2D and 3D Human Sensing

Alin-Ionut Popa, Mihai Zanfir, Cristian Sminchisescu

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4714 - 4723

We propose a deep multitask architecture for fully automatic 2d and 3d human sensing (DMHS), including recognition and reconstruction, in monocular images. The system computes the figure-ground segmentation, semantically identifies the human body parts at pixel level, and estimates the 2d and 3d pose of the person. The model supports the joint training of all components by means of multi-task losses...

chapter

FC^4: Fully Convolutional Color Constancy with Confidence-Weighted Pooling

Yuanming Hu, Baoyuan Wang, Stephen Lin

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 330 - 339

Improvements in color constancy have arisen from the use of convolutional neural networks (CNNs). However, the patch-based CNNs that exist for this problem are faced with the issue of estimation ambiguity, where a patch may contain insufficient information to establish a unique or even a limited possible range of illumination colors. Image patches with estimation ambiguity not only appear with great...

chapter

SRN: Side-Output Residual Network for Object Symmetry Detection in the Wild

Wei Ke, Jie Chen, Jianbin Jiao, Guoying Zhao, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 302 - 310

In this paper, we establish a baseline for object symmetry detection in complex backgrounds by presenting a new benchmark and an end-to-end deep learning approach, opening up a promising direction for symmetry detection in the wild. The new benchmark, named Sym-PASCAL, spans challenges including object diversity, multi-objects, part-invisibility, and various complex backgrounds that are far beyond...

chapter

CityPersons: A Diverse Dataset for Pedestrian Detection

Shanshan Zhang, Rodrigo Benenson, Bernt Schiele

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4457 - 4465

Convnets have enabled significant progress in pedestrian detection recently, but there are still open questions regarding suitable architectures and training data. We revisit CNN design and point out key adaptations, enabling plain FasterRCNN to obtain state-of-the-art results on the Caltech dataset. To achieve further improvement from more and better data, we introduce CityPersons, a new set of person...

chapter

Semantic Autoencoder for Zero-Shot Learning

Elyor Kodirov, Tao Xiang, Shaogang Gong

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4447 - 4456

Existing zero-shot learning (ZSL) models typically learn a projection function from a feature space to a semantic embedding space (e.g. attribute space). However, such a projection function is only concerned with predicting the training seen class semantic representation (e.g. attribute prediction) or classification. When applied to test data, which in the context of ZSL contains different (unseen)...

chapter

Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks

Xiao Yang, Ersin Yumer, Paul Asente, Mike Kraley, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4342 - 4351

We present an end-to-end, multimodal, fully convolutional network for extracting semantic structures from document images. We consider document semantic structure extraction as a pixel-wise segmentation task, and propose a unified model that classifies pixels based not only on their visual appearance, as in the traditional page segmentation task, but also on the content of underlying text. Moreover,...

chapter

Semantic Image Inpainting with Deep Generative Models

Raymond A. Yeh, Chen Chen, Teck Yian Lim, Alexander G. Schwing, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6882 - 6890

Semantic image inpainting is a challenging task where large missing regions have to be filled based on the available visual data. Existing methods which extract information from only a single image generally produce unsatisfactory results due to the lack of high level context. In this paper, we propose a novel method for semantic image inpainting, which generates the missing content by conditioning...

chapter

Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach

Yunchao Wei, Jiashi Feng, Xiaodan Liang, Ming-Ming Cheng, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6488 - 6496

We investigate a principled way to progressively mine discriminative object regions using classification networks to address the weakly-supervised semantic segmentation problem. Classification networks are only responsive to small and sparse discriminative regions from the object of interest, which deviates from the requirement of the segmentation task that needs to localize dense, interior and integral...

Data set: ieee
Keywords: TRAINING, SEMANTICS

