Search results

chapter

Zero-Shot Action Recognition with Error-Correcting Output Codes

Jie Qin, Li Liu, Ling Shao, Fumin Shen, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1042 - 1051

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Recently, zero-shot action recognition (ZSAR) has emerged with the explosive growth of action categories. In this paper, we explore ZSAR from a novel perspective by adopting the Error-Correcting Output Codes (dubbed ZSECOC). Our ZSECOC equips the conventional ECOC with the additional capability of ZSAR, by addressing the domain shift problem. In particular, we learn discriminative ZSECOC for seen...

chapter

Improving Interpretability of Deep Neural Networks with Semantic Information

Yinpeng Dong, Hang Su, Jun Zhu, Bo Zhang

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 975 - 983

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Interpretability of deep neural networks (DNNs) is essential since it enables users to understand the overall strengths and weaknesses of the models, conveys an understanding of how the models will behave in the future, and how to diagnose and correct potential problems. However, it is challenging to reason about what a DNN actually does due to its opaque or black-box nature. To address this issue,...

chapter

StyleNet: Generating Attractive Visual Captions with Styles

Chuang Gan, Zhe Gan, Xiaodong He, Jianfeng Gao, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 955 - 964

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

We propose a novel framework named StyleNet to address the task of generating attractive captions for images and videos with different styles. To this end, we devise a novel model component, named factored LSTM, which automatically distills the style factors in the monolingual text corpus. Then at runtime, we can explicitly control the style in the caption generation process so as to produce attractive...

chapter

Video Captioning with Transferred Semantic Attributes

Yingwei Pan, Ting Yao, Houqiang Li, Tao Mei

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 984 - 992

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Automatically generating natural language descriptions of videos plays a fundamental challenge for computer vision community. Most recent progress in this problem has been achieved through employing 2-D and/or 3-D Convolutional Neural Networks (CNNs) to encode video content and Recurrent Neural Networks (RNNs) to decode a sentence. In this paper, we present Long Short-Term Memory with Transferred...

chapter

Colorization as a Proxy Task for Visual Understanding

Gustav Larsson, Michael Maire, Gregory Shakhnarovich

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 840 - 849

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

We investigate and improve self-supervision as a drop-in replacement for ImageNet pretraining, focusing on automatic colorization as the proxy task. Self-supervised training has been shown to be more promising for utilizing unlabeled data than other, traditional unsupervised learning methods. We build on this success and evaluate the ability of our self-supervised network in several contexts. On VOC...

chapter

Object-Aware Dense Semantic Correspondence

Fan Yang, Xin Li, Hong Cheng, Jianping Li, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4151 - 4159

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

This work aims to build pixel-to-pixel correspondences between images from the same visual class but with different geometries and visual similarities. This task is particularly challenging because (i) their visual content is similar only on the high-level structure, and (ii) background clutters keep bringing in noises. To address these problems, this paper proposes an object-aware method to estimate...

chapter

Multi-modal Topic Modelling and Summarization with Dense Block Detection: A Review

Prajakta Sonone, A. V. Deorankar

2017 International Conference on Recent Trends in Electrical, Electronics and Computing Technologies (ICRTEECT) > 177 - 182

2017 International Conference on Recent Trends in Electrical, Electronics and Computing Technologies (ICRTEECT)

There has been incredible growth of events over the internet in recent years. Google has become the giant source of knowledge for any event which has happened or happening over the internet. Some networking sites such as face book, micro blogging sites such as twitter are evolved with time and became the highly used sites over the internet. Various E-commerce websites such as Amazon, Ebay, Flipkart...

chapter

Visual data mining applied on earth observation datasets

Andreea Griparis, Florin-Andrei Georgescu, Mihai Datcu

2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS) > 566 - 569

IGARSS 2017 - 2017 IEEE International Geoscience and Remote Sensing Symposium

In the quest of developing more accurate methodologies for Earth Observation (EO) image retrieval, visualization and information content exploration, a deep understanding of the data being analyzed is needed. In this paper we propose a simple but efficient visual data mining methodology that can be used for these tasks. Our solution consists in a patch-based feature extraction to derive image features...

chapter

Visual-Interactive k-NDN Method (VIK): A Novel Approach to Visualize and Interact with Content-Based Image Retrieval Systems Regarding Similarity and Diversity

Rafael Loosli Dias, Steve Ataky Tsham Mpinda, Renato Bueno, Marcela Xavier Ribeiro

2017 21st International Conference Information Visualisation (IV) > 72 - 77

2017 21st International Conference Information Visualisation (IV)

Digital imaging plays an important role in many human activities, such as agriculture and forest management, earth sciences, urban planning, weather forecasting, medical imaging and so on. Processing, exploring and visualizing the inconceivable volumes of such images has turned out to be progressively troublesome. The Content-Based Image Retrieval (CBIR) remains an important issue that finds potential...

chapter

An Ontology Driven Framework for Personalized Itinerary Visualization

Essayeh Aroua, Hajer Baazaoui-Zghal, Mourad Abed

2017 21st International Conference Information Visualisation (IV) > 152 - 157

2017 21st International Conference Information Visualisation (IV)

Retrieving information based on the users’ preferences and profiles represent a challenging issue to overcome. Moreover, in the public transport field, this task becomes increasingly complex due to the heterogeneous data fetched from various sources. Though, ontologies have emerged in retrieving information field to reduce this complexity. This paper describes a visual framework aiming...

chapter

Visually Supporting Image Annotation Based on Visual Features and Ontologies

Jalila Filali, Hajer Baazaoui Zghal, Jean Martinet

2017 21st International Conference Information Visualisation (IV) > 182 - 187

2017 21st International Conference Information Visualisation (IV)

Automatic Image Annotation (AIA) is a challenging problem in the field of image retrieval, and several methods have been proposed. However, visually supporting this important tasks and reducing the semantic gap between low-level image features and high-level semantic concepts still remains a key issue. In this paper, we propose a visually supporting image annotation framework based on visual features...

chapter

Exploiting convolutional representations for multiscale human settlement detection: Preliminary results

Dalton Lunga, Dilip Patlolla, Hsiuhan Lexie Yang, Jeanette Weaver, more

2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS) > 3779 - 3782

IGARSS 2017 - 2017 IEEE International Geoscience and Remote Sensing Symposium

We test this premise and explore representation spaces from a single deep convolutional network and their visualization to argue for a novel unified feature extraction framework. The objective is to utilize and re-purpose trained feature extractors without the need for network retraining on three remote sensing tasks i.e. superpixel mapping, pixel-level segmentation and semantic based image visualization...

chapter

Context based visual content verification

Martin Lukac, Aigerim Bazarbciyeva, Michitaka Kameyama

2017 International Conference on Information and Digital Technologies (IDT) > 234 - 239

2017 International Conference on Information and Digital Technologies (IDT)

In this paper the intermediary visual content verification method based on multi-level co-occurrences is studied. The co-occurrence statistics are in general used to determine relational properties between objects based on information collected from data. As such these measures are heavily subject to relative number of occurrences and give only limited amount of accuracy when predicting objects in...

chapter

Estimating relative depth in single images via rankboost

Ralph Ewerth, Matthias Springstein, Eric Muller, Alexander Balz, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 919 - 924

2017 IEEE International Conference on Multimedia and Expo (ICME)

In this paper, we present a novel approach to estimate the relative depth of regions in monocular images. There are several contributions. First, the task of monocular depth estimation is considered as a learning-to-rank problem which offers several advantages compared to regression approaches. Second, monocular depth clues of human perception are modeled in a systematic manner. Third, we show that...

chapter

Representing word image using visual word embeddings and RNN for keyword spotting on historical document images

Hongxi Wei, Hui Zhang, Guanglai Gao

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1368 - 1373

2017 IEEE International Conference on Multimedia and Expo (ICME)

Visual words of Bag-of-Visual-Words (BoVW) framework are independent each other, which results in not only discarding spatial orders between visual words but also lacking semantic information. This study is inspired by word embeddings that a similar embedding procedure is applied to a large number of visual words. By this way, the corresponding embedding vectors of the visual words can be formulated...

chapter

Fashion analysis with a subordinate attribute classification network

Huijing Zhan, Boxin Shi, Alex C. Kot

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1506 - 1511

2017 IEEE International Conference on Multimedia and Expo (ICME)

In this paper we deal with two image-based object search tasks in the fashion domain, clothing attribute prediction and cross-domain shoe retrieval. Clothing attribute prediction is about describing the appearances of clothes via semantic attributes and cross-domain shoe retrieval aims at retrieving the same shoe items from online stores given a daily life shoe photo. We jointly solve these two problems...

chapter

Attribute hashing for zero-shot image retrieval

Yahui Xu, Yang Yang, Fumin Shen, Xing Xu, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 133 - 138

2017 IEEE International Conference on Multimedia and Expo (ICME)

Hashing has been recognized as one of the most promising ways in indexing and retrieving high-dimensional data due to the excellent merits in efficiency and effectiveness. Nevertheless, most existing approaches inevitably suffer from the problem of “semantic gap”, especially when facing the rapid evolution of newly-emerging “unseen” categories on the Web. In this work, we propose an innovative approach,...

chapter

Temporally Steered Gaussian Attention for Video Understanding

Shagan Sah, Thang Nguyen, Miguel Dominguez, Felipe Petroski Such, more

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) > 2208 - 2216

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Recent advances in video understanding are enabling incredible developments in video search, summarization, automatic captioning and human computer interaction. Attention mechanisms are a powerful way to steer focus onto different sections of the video. Existing mechanisms are driven by prior training probabilities and require input instances of identical temporal duration. We introduce an intuitive...

chapter

Recurrent Memory Addressing for Describing Videos

Arnav Kumar Jain, Abhinav Agarwalla, Kumar Krishna Agrawal, Pabitra Mitra

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) > 2200 - 2207

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

In this paper, we introduce Key-Value Memory Networks to a multimodal setting and a novel key-addressing mechanism to deal with sequence-to-sequence models. The proposed model naturally decomposes the problem of video captioning into vision and language segments, dealing with them as key-value pairs. More specifically, we learn a semantic embedding (v) corresponding to each frame (k) in the video,...

chapter

SaR-WEB: A Semantic Web Tool to Support Search as Learning Practices and Cross-Language Results on the Web

Davide Taibi, Giovanni Fulantelli, Ivana Marenzi, Wolfgang Nejdl, more

2017 IEEE 17th International Conference on Advanced Learning Technologies (ICALT) > 522 - 524

2017 IEEE 17th International Conference on Advanced Learning Technologies (ICALT)

In this paper, we present SaR-Web, a multimodal web search tool that provides automatic support to searching as learning processes. Inspired by the work of Richard Rogers and the Digital Methods Initiative, SaR-Web compares the results of queries across search engine language domains, and visualizes search results with a semantic added value, thus facilitating cross-linguistic and cross-cultural comparisons...

INFONA - science communication portal

Search results

Zero-Shot Action Recognition with Error-Correcting Output Codes

Improving Interpretability of Deep Neural Networks with Semantic Information

StyleNet: Generating Attractive Visual Captions with Styles

Video Captioning with Transferred Semantic Attributes

Colorization as a Proxy Task for Visual Understanding

Object-Aware Dense Semantic Correspondence

Multi-modal Topic Modelling and Summarization with Dense Block Detection: A Review

Visual data mining applied on earth observation datasets

Visual-Interactive k-NDN Method (VIK): A Novel Approach to Visualize and Interact with Content-Based Image Retrieval Systems Regarding Similarity and Diversity

An Ontology Driven Framework for Personalized Itinerary Visualization

Visually Supporting Image Annotation Based on Visual Features and Ontologies

Exploiting convolutional representations for multiscale human settlement detection: Preliminary results

Context based visual content verification

Estimating relative depth in single images via rankboost

Representing word image using visual word embeddings and RNN for keyword spotting on historical document images

Fashion analysis with a subordinate attribute classification network

Attribute hashing for zero-shot image retrieval

Temporally Steered Gaussian Attention for Video Understanding

Recurrent Memory Addressing for Describing Videos

SaR-WEB: A Semantic Web Tool to Support Search as Learning Practices and Cross-Language Results on the Web

Filter options

Publication date

Content availability

Keywords

INFONA - science communication portal

Search results

Add recipient

Sending message cancelled

Are you sure you want to cancel sending this message?

Send message

Filter options

Publication date

Date range setting

Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.

Content availability

Keywords

Reporting an error / abuse

Sending the report failed

Accessibility options