Search results

Items from 61 to 80 out of 844 results

chapter

Spatio-Temporal Vector of Locally Max Pooled Features for Action Recognition in Videos

Ionut Cosmin Duta, Bogdan Ionescu, Kiyoharu Aizawa, Nicu Sebe

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3205 - 3214

We introduce Spatio-Temporal Vector of Locally Max Pooled Features (ST-VLMPF), a super vector-based encoding method specifically designed for local deep features encoding. The proposed method addresses an important problem of video understanding: how to build a video representation that incorporates the CNN features over the entire video. Feature assignment is carried out at two levels, by using the...

chapter

What's in a Question: Using Visual Questions as a Form of Supervision

Siddha Ganju, Olga Russakovsky, Abhinav Gupta

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6422 - 6431

Collecting fully annotated image datasets is challenging and expensive. Many types of weak supervision have been explored: weak manual annotations, web search results, temporal continuity, ambient sound and others. We focus on one particular unexplored mode: visual questions that are asked about images. The key observation that inspires our work is that the question itself provides useful information...

chapter

Gaze Embeddings for Zero-Shot Image Classification

Nour Karessli, Zeynep Akata, Bernt Schiele, Andreas Bulling

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6412 - 6421

Zero-shot image classification using auxiliary information, such as attributes describing discriminative object properties, requires time-consuming annotation by domain experts. We instead propose a method that relies on human gaze as auxiliary information, exploiting that even non-expert users have a natural ability to judge class membership. We present a data collection paradigm that involves a...

chapter

DeepPermNet: Visual Permutation Learning

Rodrigo Santa Cruz, Basura Fernando, Anoop Cherian, Stephen Gould

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6044 - 6052

We present a principled approach to uncover the structure of visual data by solving a novel deep learning task coined visual permutation learning. The goal of this task is to find the permutation that recovers the structure of data from shuffled versions of it. In the case of natural images, this task boils down to recovering the original image from patches shuffled by an unknown permutation matrix...

chapter

Robust Joint and Individual Variance Explained

Christos Sagonas, Yannis Panagakis, Alina Leidinger, Stefanos Zafeiriou

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5739 - 5748

Discovering the common (joint) and individual subspaces is crucial for analysis of multiple data sets, including multi-view and multi-modal data. Several statistical machine learning methods have been developed for discovering the common features across multiple data sets. The most well studied family of the methods is that of Canonical Correlation Analysis (CCA) and its variants. Even though the...

chapter

Query-Focused Video Summarization: Dataset, Evaluation, and a Memory Network Based Approach

Aidean Sharghi, Jacob S. Laurel, Boqing Gong

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2127 - 2136

Recent years have witnessed a resurgence of interest in video summarization. However, one of the main obstacles to the research on video summarization is the user subjectivity — users have various preferences over the summaries. The subjectiveness causes at least two problems. First, no single video summarizer fits all users unless it interacts with and adapts to the individual users. Second,...

chapter

Automatic Understanding of Image and Video Advertisements

Zaeem Hussain, Mingda Zhang, Xiaozhong Zhang, Keren Ye, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 1100 - 1110

There is more to images than their objective physical content: for example, advertisements are created to persuade a viewer to take a certain action. We propose the novel problem of automatic advertisement understanding. To enable research on this problem, we create two datasets: an image dataset of 64,832 image ads, and a video dataset of 3,477 ads. Our data contains rich annotations encompassing...

chapter

Video Captioning with Transferred Semantic Attributes

Yingwei Pan, Ting Yao, Houqiang Li, Tao Mei

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 984 - 992

Automatically generating natural language descriptions of videos is a fundamental challenge for the computer vision community. Most recent progress on this problem has been achieved by employing 2-D and/or 3-D Convolutional Neural Networks (CNNs) to encode video content and Recurrent Neural Networks (RNNs) to decode a sentence. In this paper, we present Long Short-Term Memory with Transferred...

chapter

Viraliency: Pooling Local Virality

Xavier Alameda-Pineda, Andrea Pilzer, Dan Xu, Nicu Sebe, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 484 - 492

In our overly-connected world, the automatic recognition of virality (the quality of an image or video to be rapidly and widely spread in social networks) is of crucial importance, and has recently awakened the interest of the computer vision community. Concurrently, recent progress in deep learning architectures showed that global pooling strategies allow the extraction of activation...

chapter

Deep Affordance-Grounded Sensorimotor Object Recognition

Spyridon Thermos, Georgios Th. Papadopoulos, Petros Daras, Gerasimos Potamianos

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 49 - 57

It is well-established by cognitive neuroscience that human perception of objects constitutes a complex process, where object appearance information is combined with evidence about the so-called object affordances, namely the types of actions that humans typically perform when interacting with them. This fact has recently motivated the sensorimotor approach to the challenging task of automatic object...

chapter

Deep Learning Human Mind for Automated Visual Classification

C. Spampinato, S. Palazzo, I. Kavasidis, D. Giordano, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4503 - 4511

What if we could effectively read the mind and transfer human visual capabilities to computer vision methods? In this paper, we aim at addressing this question by developing the first visual object classifier driven by human brain signals. In particular, we employ EEG data evoked by visual object stimuli combined with Recurrent Neural Networks (RNN) to learn a discriminative brain activity manifold...

chapter

Object-Aware Dense Semantic Correspondence

Fan Yang, Xin Li, Hong Cheng, Jianping Li, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4151 - 4159

This work aims to build pixel-to-pixel correspondences between images from the same visual class but with different geometries and visual similarities. This task is particularly challenging because (i) their visual content is similar only in its high-level structure, and (ii) background clutter keeps introducing noise. To address these problems, this paper proposes an object-aware method to estimate...

chapter

Context based visual content verification

Martin Lukac, Aigerim Bazarbciyeva, Michitaka Kameyama

2017 International Conference on Information and Digital Technologies (IDT) > 234 - 239

In this paper, an intermediary visual content verification method based on multi-level co-occurrences is studied. Co-occurrence statistics are in general used to determine relational properties between objects based on information collected from data. As such, these measures depend heavily on the relative number of occurrences and give only limited accuracy when predicting objects in...

chapter

SANet: Structure-Aware Network for Visual Tracking

Heng Fan, Haibin Ling

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) > 2217 - 2224

The convolutional neural network (CNN) has drawn increasing interest in visual tracking owing to its power in feature extraction. Most existing CNN-based trackers treat tracking as a classification problem. However, these trackers are sensitive to similar distractors because their CNN models mainly focus on inter-class classification. To address this problem, we use self-structure information of...

chapter

DenseTracker: A multi-task dense network for visual tracking

Fei Zhao, Ming Tang, Yi Wu, Jinqiao Wang

2017 IEEE International Conference on Multimedia and Expo (ICME) > 607 - 612

How to track an arbitrary object in video is one of the main challenges in computer vision, and it has been studied for decades. Based on hand-crafted features, traditional trackers show poor discriminability for complex changes of object appearance. Recently, some trackers based on convolutional neural network (CNN) have shown some promising results by exploiting the rich convolutional features....

chapter

Extracting emotions from speech using a bag-of-visual-words approach

Evaggelos Spyrou, Theodoros Giannakopoulos, Dimitrios Sgouropoulos, Michalis Papakostas

2017 12th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP) > 80 - 83

Recognition of humans' emotions may be crucial in certain applications involving e.g., human-computer interaction, monitoring of elderly, understanding the affective state of learners during a course etc. To this goal and depending on the application and the environment, one may use physiological parameters (e.g., heart rate, brain activity etc.) which are typically obtrusive, or analyze other modalities...

chapter

Visual pollution localization through crowdsourcing and visual similarity clustering

Zuzana Kucharikova, Jakub Simko

2017 12th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP) > 26 - 31

Nowadays, many cities and communes suffer from advertisements appearing in aesthetically inappropriate or illegal places. This contamination of public space is called visual pollution. The first step in the fight against visual pollution is to localize physical advertising media (e.g., billboards) as accurately as possible. One way is to use volunteer effort through outdoor crowdsourcing...

chapter

Temporal Domain Neural Encoder for Video Representation Learning

Hao Hu, Zhaowen Wang, Joon-Young Lee, Zhe Lin, more

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) > 2192 - 2199

We address the challenge of learning good video representations by explicitly modeling the relationship between visual concepts in time space. We propose a novel Temporal Preserving Recurrent Neural Network (TPRNN) that extracts and encodes visual dynamics with frame-level features as input. The proposed network architecture captures temporal dynamics by keeping track of the ordinal relationship of...

chapter

The human detection in images using the depth map

Dmitriy Tatarenkov, Dmitry Podolsky

2017 Systems of Signal Synchronization, Generating and Processing in Telecommunications (SINKHROINFO) > 1 - 4

In today's world, the need for autonomous mobile robots and vehicles is increasing. Safe autonomous movement demands reliable and fast detection algorithms. Histogram of Oriented Gradients (HOG) descriptors significantly outperform existing feature sets for human detection. However, this method produces many type I errors. The number of these errors can be decreased...

chapter

Object-Specific Style Transfer Based on Feature Map Selection Using CNNs

Ayumu Shinya, Nguyen Duc Tung, Tomohiro Harada, Ruck Thawonmas

2017 Nicograph International (NicoInt) > 88

We propose a method for transferring an arbitrary style to only a specific object in an image. Style transfer is the process of combining the content of an image and the style of another image into a new image. Our results show that the proposed method can realize style transfer to a specific object.

Keywords:
VISUALIZATION
COMPUTER VISION



INFONA - science communication portal
