Search results

chapter

Stairs recognition using stereo vision-based algorithm in NAO robot

Daniel Aguilera-Castro, Manuel Neira-Carcamo, Cristhian Aguilera-Carrasco, Luis Vera-Quiroga

2017 CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON) > 1 - 6

In this paper we perform staircase detection with a stereo-vision-based algorithm on the NAO robot, using one of its cameras. The robot was programmed in Python using ROS. The detection algorithm is divided into two parts: line detection and depth perception. In the line detection process we use the Hough transform and vanishing point criteria for line segmentation. Respecting depth...

chapter

Dynamic gaze analysis: An application enviroment for face-to-face communication

Ulku Arslan Aydin, Sinan Kalkan, Cengiz Acarturk

2017 International Artificial Intelligence and Data Processing Symposium (IDAP) > 1 - 6

Gaze analysis in dynamic environments has remained an unresolved problem due to the complexities that pertain to the detection and tracking of objects in the visual environment. This study provides a solution to the problem for face-to-face communication, in which the visual objects in the environment are faces. The application that has been developed for this purpose is able to detect and track faces...

chapter

Visual saliency detection based on region contrast and guided filter

Liqiang Liu, Jianzhong Cao, Yuefeng Niu, Huinan Guo

2017 2nd IEEE International Conference on Computational Intelligence and Applications (ICCIA) > 327 - 330

The main challenge of previous saliency detection methods is the low quality of the obtained saliency map, which easily misses edge and texture information, so it cannot reflect the integrated salient information of the image. To address this problem, we propose a novel saliency measure method which combines region contrast and a fast guided filter. This method utilizes the region contrast method to obtain initial...

chapter

Fabric defect detection based on visual saliency map and SVM

Hao Zhang, Jiajuan Hu, Zhiyong He

2017 2nd IEEE International Conference on Computational Intelligence and Applications (ICCIA) > 322 - 326

To improve the accuracy of surface defect detection, a defect inspection approach based on a visual saliency map and a Support Vector Machine (SVM) is proposed. Monochrome fabric defect images are taken as examples in this paper. By analyzing the visual saliency maps of these images, the global associated value and the background associated value are extracted as the two features. After being normalized,...

chapter

Unsupervised segmentation of action segments in egocentric videos using gaze

Hipiny, H. Ujir, J.L. Minoi, S.F. Samson Juan, more

2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA) > 351 - 356

Unsupervised segmentation of action segments in egocentric videos is a desirable feature in tasks such as activity recognition and content-based video retrieval. Reducing the search space to a finite set of action segments facilitates faster and less noisy matching. However, there exists a substantial gap in machines' understanding of natural temporal cuts during a continuous human activity. This...

chapter

An approach for environment mapping and control of wall follower cellbot through monocular vision and fuzzy system

Karoline de M. Farias, R. Wilson Leal, Ranulfo P. Bezerra Neto, Ricardo A. L. Rabelo, more

2017 XLIII Latin American Computer Conference (CLEI) > 1 - 8

This paper presents an approach that uses range measurement through homography calculation to build a 2D visual occupancy grid and control the robot through monocular vision. This approach is designed for a Cellbot architecture. The robot is equipped with a wall-following behavior to explore the environment, which enables the robot to trail object contours, residing in the fuzzy control the responsibility...

chapter

Semantic Amodal Segmentation

Yan Zhu, Yuandong Tian, Dimitris Metaxas, Piotr Dollar

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3001 - 3009

Common visual recognition tasks such as classification, object detection, and semantic segmentation are rapidly reaching maturity, and given the recent rate of progress, it is not unreasonable to conjecture that techniques for many of these problems will approach human levels of performance in the next few years. In this paper we look to the future: what is the next frontier in visual recognition?...

chapter

Weakly Supervised Affordance Detection

Johann Sawatzky, Abhilash Srikantha, Juergen Gall

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5197 - 5206

Localizing functional regions of objects or affordances is an important aspect of scene understanding and relevant for many robotics applications. In this work, we introduce a pixel-wise annotated affordance dataset of 3090 images containing 9916 object instances. Since parts of an object can have multiple affordances, we address this by a convolutional neural network for multilabel affordance segmentation...

chapter

Enhancing Video Summarization via Vision-Language Embedding

Bryan A. Plummer, Matthew Brown, Svetlana Lazebnik

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1052 - 1060

This paper addresses video summarization, or the problem of distilling a raw video into a shorter form while still capturing the original story. We show that visual representations supervised by freeform language make a good fit for this application by extending a recent submodular summarization approach [9] with representativeness and interestingness objectives computed on features from a joint vision-language...

chapter

Scene Parsing through ADE20K Dataset

Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5122 - 5130

Scene parsing, or recognizing and segmenting objects and stuff in an image, is one of the key problems in computer vision. Despite the community's efforts in data collection, there are still few image datasets covering a wide range of scenes and object categories with dense and detailed annotations for scene parsing. In this paper, we introduce and analyze the ADE20K dataset, spanning diverse annotations...

chapter

Deep Level Sets for Salient Object Detection

Ping Hu, Bing Shuai, Jun Liu, Gang Wang

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 540 - 549

Deep learning has been applied to saliency detection in recent years. The superior performance has proved that deep networks can model the semantic properties of salient objects. Yet it is difficult for a deep network to discriminate pixels belonging to similar receptive fields around the object boundaries, thus deep networks may output maps with blurred saliency and inaccurate boundaries. To tackle...

chapter

Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks

Xiao Yang, Ersin Yumer, Paul Asente, Mike Kraley, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4342 - 4351

We present an end-to-end, multimodal, fully convolutional network for extracting semantic structures from document images. We consider document semantic structure extraction as a pixel-wise segmentation task, and propose a unified model that classifies pixels based not only on their visual appearance, as in the traditional page segmentation task, but also on the content of underlying text. Moreover,...

chapter

The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives

Mohit Iyyer, Varun Manjunatha, Anupam Guha, Yogarshi Vyas, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6478 - 6487

Visual narrative is often a combination of explicit information and judicious omissions, relying on the viewer to supply missing details. In comics, most movements in time and space are hidden in the gutters between panels. To follow the story, readers logically connect panels together by inferring unseen actions through a process called closure. While computers can now describe the content of natural...

chapter

Weakly-Supervised Visual Grounding of Phrases with Linguistic Structures

Fanyi Xiao, Leonid Sigal, Yong Jae Lee

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5253 - 5262

We propose a weakly-supervised approach that takes image-sentence pairs as input and learns to visually ground (i.e., localize) arbitrary linguistic phrases, in the form of spatial attention masks. Specifically, the model is trained with images and their associated image-level captions, without any explicit region-to-phrase correspondence annotations. To this end, we introduce an end-to-end model...

chapter

Webly Supervised Semantic Segmentation

Bin Jin, Maria V. Ortiz Segovia, Sabine Susstrunk

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1705 - 1714

We propose a weakly supervised semantic segmentation algorithm that uses image tags for supervision. We apply the tags in queries to collect three sets of web images, which encode the clean foregrounds, the common backgrounds, and realistic scenes of the classes. We introduce a novel three-stage training pipeline to progressively learn semantic segmentation models. We first train and refine a class-specific...

chapter

Locality-Sensitive Deconvolution Networks with Gated Fusion for RGB-D Indoor Semantic Segmentation

Yanhua Cheng, Rui Cai, Zhiwei Li, Xin Zhao, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1475 - 1483

This paper focuses on indoor semantic segmentation using RGB-D data. Although the commonly used deconvolution networks (DeconvNet) have achieved impressive results on this task, we find there is still room for improvements in two aspects. One is about the boundary segmentation. DeconvNet aggregates large context to predict the label of each pixel, inherently limiting the segmentation precision of...

chapter

Predicting Ground-Level Scene Layout from Aerial Imagery

Menghua Zhai, Zachary Bessinger, Scott Workman, Nathan Jacobs

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4132 - 4140

We introduce a novel strategy for learning to extract semantically meaningful features from aerial imagery. Instead of manually labeling the aerial imagery, we propose to predict (noisy) semantic features automatically extracted from co-located ground imagery. Our network architecture takes an aerial image as input, extracts features using a convolutional neural network, and then applies an adaptive...

chapter

YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video

Esteban Real, Jonathon Shlens, Stefano Mazzocchi, Xin Pan, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 7464 - 7473

We introduce a new large-scale data set of video URLs with densely-sampled object bounding box annotations called YouTube-BoundingBoxes (YT-BB). The data set consists of approximately 380,000 video segments about 19s long, automatically selected to feature objects in natural settings without editing or post-processing, with a recording quality often akin to that of a hand-held cell phone camera. The...

chapter

Network Dissection: Quantifying Interpretability of Deep Visual Representations

David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3319 - 3327

We propose a general framework called Network Dissection for quantifying the interpretability of latent representations of CNNs by evaluating the alignment between individual hidden units and a set of semantic concepts. Given any CNN model, the proposed method draws on a data set of concepts to score the semantics of hidden units at each intermediate convolutional layer. The units with semantics are...

chapter

Hidden Layers in Perceptual Learning

Gad Cohen, Daphna Weinshall

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5349 - 5357

Studies in visual perceptual learning investigate the way human performance improves with practice, in the context of relatively simple (and therefore more manageable) visual tasks. Building on the powerful tools currently available for the training of Convolutional Neural Networks (CNNs), networks whose original architecture was inspired by the visual system, we revisited some of the open computational...

INFONA - science communication portal

