Search results

chapter

Referring Expression Generation and Comprehension via Attributes

Jingyu Liu, Liang Wang, Ming-Hsuan Yang

2017 IEEE International Conference on Computer Vision (ICCV) > 4866 - 4874

2017 IEEE International Conference on Computer Vision (ICCV)

A referring expression is a kind of language expression that is used for referring to particular objects. To make the expression unambiguous, people often use attributes to describe the particular object. In this paper, we explore the role of attributes by incorporating them into both referring expression generation and comprehension. We first train an attribute learning model from visual objects...

chapter

A saliency detection model combined local and global features

Pin Wang, Guohui Tian, Huanzhao Chen

2017 Chinese Automation Congress (CAC) > 2863 - 2870

2017 Chinese Automation Congress (CAC)

Most present methods of saliency detection overemphasize local contrast while ignoring the global features of the image. The detailed characteristics of the image can be reflected based on the local comparison of the image. However, the overall saliency of the image cannot be reflected. In this paper, a saliency detection model combining local and global features is proposed. Firstly, a local feature...

chapter

Saliency prediction with scene structural guidance

Haoran Liang, Ming Jiang, Ronghua Liang, Qi Zhao

2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC) > 3483 - 3488

2017 IEEE International Conference on Systems, Man and Cybernetics (SMC)

Previous works have suggested the role of scene information in directing gaze. The structure of a scene provides global contextual information that complements local object information in saliency prediction. In this study, we explore how scene envelopes such as openness, depth, and perspective affect visual attention in natural outdoor images. To facilitate this study, an eye tracking dataset is...

chapter

Deep Spatial-Semantic Attention for Fine-Grained Sketch-Based Image Retrieval

Jifei Song, Qian Yu, Yi-Zhe Song, Tao Xiang, more

2017 IEEE International Conference on Computer Vision (ICCV) > 5552 - 5561

2017 IEEE International Conference on Computer Vision (ICCV)

Human sketches are unique in being able to capture both the spatial topology of a visual object and its subtle appearance details. Fine-grained sketch-based image retrieval (FG-SBIR) importantly leverages such fine-grained characteristics of sketches to conduct instance-level retrieval of photos. Nevertheless, human sketches are often highly abstract and iconic, resulting in severe misalignments...

chapter

Object retrieval in past video using bag-of-words model

Manh-Tien Nguyen-Hoang, Tu-Khiem Le, Van-Tu Ninh, Quoc-Huu Che, more

2017 International Conference on Control, Automation and Information Sciences (ICCAIS) > 145 - 150

2017 International Conference on Control, Automation and Information Sciences (ICCAIS)

Along with technological advancement, computer vision plays an important role in enhancing smart computing systems to help people overcome obstacles in their daily lives. One common and troublesome problem is human memory, especially remembering things such as personal items. It is annoying for people to waste their time finding lost items manually by recall or notes. This motivates...

chapter

Collaborative visual navigation for UAVs in blurry environment

Xiaodong Li, Tong Chen

2017 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC) > 1 - 6

2017 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC)

This paper presents a novel strategy addressing visual SLAM with an enhanced data association method. Hypergraph theory and transformation were incorporated within cooperative visual SLAM. The research presents a synthetic approach to a cooperative data association and fusion strategy for multiple UAVs equipped with stereo vision cameras that encounter indistinct imaging, where conventional...

chapter

Multi-scale Deep Learning Architectures for Person Re-identification

Xuelin Qian, Yanwei Fu, Yu-Gang Jiang, Tao Xiang, more

2017 IEEE International Conference on Computer Vision (ICCV) > 5409 - 5418

2017 IEEE International Conference on Computer Vision (ICCV)

Person Re-identification (re-id) aims to match people across non-overlapping camera views in a public space. It is a challenging problem because many people captured in surveillance videos wear similar clothes. Consequently, the differences in their appearance are often subtle and only detectable at the right location and scales. Existing re-id models, particularly the recently proposed deep learning...

chapter

Wavelet-based texture model for crowd dynamic analysis

Jing Wang, Zhijie Xu, Yanlong Cao, Yuanping Xu

2017 23rd International Conference on Automation and Computing (ICAC) > 1 - 5

2017 23rd International Conference on Automation and Computing (ICAC)

Crowd event detection techniques aim at solving real-world surveillance problems, such as detecting crowd anomalies and tracking a specific person in a highly dynamic crowd scene. In this paper, we propose an innovative texture-based analysis method to model crowd dynamics and use it to distinguish crowd behaviours. To describe complicated crowd scenes, homogeneous random features have been deployed...

chapter

Loop closure detection for visual SLAM systems using convolutional neural network

Xiwu Zhang, Yan Su, Xinhua Zhu

2017 23rd International Conference on Automation and Computing (ICAC) > 1 - 6

2017 23rd International Conference on Automation and Computing (ICAC)

This paper is concerned with the loop closure detection problem, which is one of the most critical parts of visual Simultaneous Localization and Mapping (SLAM) systems. Most state-of-the-art methods use hand-crafted features and bag-of-visual-words (BoVW) to tackle this problem. Recent developments in deep learning indicate that CNN features significantly outperform hand-crafted features for image...

chapter

A semiautomatic saliency model and its application to video compression

Vitaliy Lyudvichenko, Mikhail Erofeev, Yury Gitman, Dmitriy Vatolin

2017 13th IEEE International Conference on Intelligent Computer Communication and Processing (ICCP) > 403 - 410

2017 13th IEEE International Conference on Intelligent Computer Communication and Processing (ICCP)

This work aims to apply visual-attention modeling to attention-based video compression. During our comparison we found that eye-tracking data collected even from a single observer outperforms existing automatic models by a significant margin. Therefore, we offer a semiautomatic approach: using computer-vision algorithms and good initial estimation of eye-tracking data from just one observer to produce...

chapter

Visual Translation Embedding Network for Visual Relation Detection

Hanwang Zhang, Zawlin Kyaw, Shih-Fu Chang, Tat-Seng Chua

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3107 - 3115

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Visual relations, such as person ride bike and bike next to car, offer a comprehensive scene understanding of an image, and have already shown their great utility in connecting computer vision and natural language. However, due to the challenging combinatorial complexity of modeling subject-predicate-object relation triplets, very little work has been done to localize and predict visual relations...

chapter

Attentional Push: A Deep Convolutional Network for Augmenting Image Salience with Shared Attention Modeling in Social Scenes

Siavash Gorji, James J. Clark

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3472 - 3481

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

We present a novel visual attention tracking technique based on Shared Attention modeling. By considering the viewer as a participant in the activity occurring in the scene, our model learns the loci of attention of the scene actors and uses them to augment image salience. We go beyond image salience: instead of only computing the power of image regions to pull attention, we also consider the strength...

chapter

Improving Interpretability of Deep Neural Networks with Semantic Information

Yinpeng Dong, Hang Su, Jun Zhu, Bo Zhang

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 975 - 983

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Interpretability of deep neural networks (DNNs) is essential since it enables users to understand the overall strengths and weaknesses of the models, conveys an understanding of how the models will behave in the future, and how to diagnose and correct potential problems. However, it is challenging to reason about what a DNN actually does due to its opaque or black-box nature. To address this issue,...

chapter

Deep learning for multimodal-based video interestingness prediction

Yuesong Shen, Claire-Hélène Demarty, Ngoc Q. K. Duong

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1003 - 1008

2017 IEEE International Conference on Multimedia and Expo (ICME)

Predicting the interestingness of media content remains an important but challenging research subject. The difficulty comes first from the fact that, besides being a high-level semantic concept, interestingness is highly subjective and its global definition has not yet been agreed upon. This paper presents the use of up-to-date deep learning techniques for solving the task. We perform experiments with both...

chapter

Fine-grained image recognition via weakly supervised click data guided bilinear CNN model

Guangjian Zheng, Min Tan, Jun Yu, Qing Wu, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 661 - 666

2017 IEEE International Conference on Multimedia and Expo (ICME)

The bilinear convolutional neural network (BCNN) model, the state of the art in fine-grained image recognition, fails to distinguish categories with subtle visual differences. We design a novel BCNN model guided by user click data (C-BCNN) to improve performance by capturing both the visual and semantic content in images. Specifically, to deal with the heavy noise in large-scale click data,...

chapter

Recurrent Memory Addressing for Describing Videos

Arnav Kumar Jain, Abhinav Agarwalla, Kumar Krishna Agrawal, Pabitra Mitra

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) > 2200 - 2207

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

In this paper, we introduce Key-Value Memory Networks to a multimodal setting and a novel key-addressing mechanism to deal with sequence-to-sequence models. The proposed model naturally decomposes the problem of video captioning into vision and language segments, dealing with them as key-value pairs. More specifically, we learn a semantic embedding (v) corresponding to each frame (k) in the video,...

chapter

SANet: Structure-Aware Network for Visual Tracking

Heng Fan, Haibin Ling

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) > 2217 - 2224

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

The convolutional neural network (CNN) has drawn increasing interest in visual tracking owing to its power in feature extraction. Most existing CNN-based trackers treat tracking as a classification problem. However, these trackers are sensitive to similar distractors because their CNN models mainly focus on inter-class classification. To address this problem, we use self-structure information of...

chapter

Learning attentional recurrent neural network for visual tracking

Qiurui Wang, Chun Yuan, Zhihui Lin

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1237 - 1242

2017 IEEE International Conference on Multimedia and Expo (ICME)

We propose a novel online Attentional Recurrent Neural Network (ARNN) model for visual tracking, which exploits the feature maps of a Convolutional Neural Network (CNN) inside a bounding box to identify whether this target is the one that appeared in previous frames. An attention mechanism is adopted for both different parts of targets and different scales of object features. The former attention model is able...

chapter

Image Visual Saliency Feature Extraction Based on Multi-Scale Tensor Space

Wang Shimin, Jiang Wenyan, Ye Jihua, Wang Mingwen, more

2017 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC) > 1 > 122 - 127

2017 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC)

Because traditional saliency detection methods yield imprecise and vague region boundaries, so that the detected object is not connected, the paper proposes image visual saliency feature extraction based on multi-scale tensor space. The method introduces the tensor space, using multiple low-level image features to construct the tensor space, after reducing dimension the image space structure and...

chapter

Low-resolution pedestrian detection via a novel resolution-score discriminative surface

Xiao Wang, Jun Chen, Chao Liang, Chen Chen, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1123 - 1128

2017 IEEE International Conference on Multimedia and Expo (ICME)

Pedestrian detection, as an important task in video surveillance and forensics applications, has been widely studied. However, its performance is unsatisfactory, especially in low-resolution conditions. In realistic scenarios, the size of pedestrians in the images is often small, and detection can be challenging. To solve this problem, this paper proposes a novel resolution-score discriminative...

INFONA - science communication portal
