Recognizing fine-grained categories (e.g., bird species) highly relies on discriminative part localization and part-based fine-grained feature learning. Existing approaches predominantly solve these challenges independently, while neglecting the fact that part localization (e.g., head of a bird) and fine-grained feature learning (e.g., head shape) are mutually correlated. In this paper, we propose...
This paper addresses the problem of joint detection and recounting of abnormal events in videos. Recounting of abnormal events, i.e., explaining why they are judged to be abnormal, is an unexplored but critical task in video surveillance, because it helps human observers quickly judge if they are false alarms or not. To describe the events in the human-understandable form for event recounting, learning...
Convolutional Neural Networks (CNNs) have been regarded as a powerful class of models for image recognition problems. Nevertheless, it is not trivial to utilize a CNN for learning spatio-temporal video representations. A few studies have shown that performing 3D convolutions is a rewarding approach to capture both spatial and temporal dimensions in videos. However, the development of a very deep...
Automatically describing an image with a natural language has been an emerging challenge in both fields of computer vision and natural language processing. In this paper, we present Long Short-Term Memory with Attributes (LSTM-A) - a novel architecture that integrates attributes into the successful Convolutional Neural Networks (CNNs) plus Recurrent Neural Networks (RNNs) image captioning framework,...
Recognizing fine-grained categories (e.g., bird species) is difficult due to the challenges of discriminative region localization and fine-grained feature learning. Existing approaches predominantly solve these challenges independently, while neglecting the fact that region detection and fine-grained feature learning are mutually correlated and thus can reinforce each other. In this paper, we propose...
Inspired by the recent success of text-based question answering, visual question answering (VQA) is proposed to automatically answer natural language questions with reference to a given image. Compared with text-based QA, VQA is more challenging because reasoning in the visual domain needs both effective semantic embedding and fine-grained visual understanding. Existing approaches predominantly...
Deep convolutional neural networks (CNNs) have proven highly effective for visual recognition, where learning a universal representation from the activations of a convolutional layer is a fundamental problem. In this paper, we present Fisher Vector encoding with Variational Auto-Encoder (FV-VAE), a novel deep architecture that quantizes the local activations of a convolutional layer in a deep generative...
Image captioning often requires a large set of training image-sentence pairs. In practice, however, acquiring sufficient training pairs is always expensive, making the recent captioning models limited in their ability to describe objects outside of training corpora (i.e., novel objects). In this paper, we present Long Short-Term Memory with Copying Mechanism (LSTM-C) — a new architecture...
Automatically generating natural language descriptions of videos is a fundamental challenge for the computer vision community. Most recent progress in this problem has been achieved through employing 2-D and/or 3-D Convolutional Neural Networks (CNNs) to encode video content and Recurrent Neural Networks (RNNs) to decode a sentence. In this paper, we present Long Short-Term Memory with Transferred...
This paper presents the design, analyses, and fabrication of a tank-like wall-climbing robot using gecko-inspired dry adhesives. The robot uses customized timing adhesive belts, which are flexible and patterned using MEMS techniques. The Kendall strip tape model is modified, considering features of the timing belt, to analyze the peeling process of the viscoelastic tread. The relationship between...
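For context, the classical (unmodified) Kendall strip peel relation that the paper adapts for the viscoelastic timing-belt tread balances an elastic term, a peel-angle work term, and the adhesion energy; a commonly cited form is

$$\left(\frac{F}{b}\right)^{2}\frac{1}{2dE} \;+\; \frac{F}{b}\left(1-\cos\theta\right) \;-\; R \;=\; 0,$$

where $F$ is the peel force, $b$ the strip width, $d$ the film thickness, $E$ the elastic modulus, $\theta$ the peel angle, and $R$ the adhesion energy per unit area. The paper's modified version is not reproduced in the truncated abstract.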
This paper proposes the design and experiment of a bioinspired wall-climbing robot with spiny arrays. Inspired by the tarsal system of Serica orientalis Motschulsky, a spiny structure is designed, and a robot foot with two grippers based on this structure is developed. An inchworm-like gait is employed and its trajectory is planned. The robot's foot as well as the whole prototype is fabricated...
This paper presents a novel representative-based framework for parsing and summarizing events in long surveillance videos. The proposed framework first extracts object blob sequences and utilizes them to represent events in a surveillance video. Then, a sequence filtering strategy is introduced which detects and eliminates noisy blob sequences based on their spatial and temporal characteristics. After...
Digital storytelling applications are playing an increasingly important role in people's daily life. In contemporary storytelling applications such as PowerPoint presentation and macro/micro blogs, good presentation images are always highly desired by content creators to boost their presentation in an intuitive and attractive way. Existing studies, however, have not yet addressed the challenging problem...
Traditional bridge crack detection methods are costly and risky. We propose a bridge crack detection and classification method based on a climbing robot, using image analysis with a miniature camera mounted on the robot to collect images. First, the motion blur of the acquired image is removed by the Wiener filtering method. Second, the wavelet transform is used to enhance the crack features in the...
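The deblurring and enhancement steps can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a known blur kernel (PSF) for the Wiener step, and uses a one-level Haar transform in place of whichever wavelet basis the authors used.

```python
import numpy as np

def wiener_deblur(blurred, psf, k=0.01):
    """Frequency-domain Wiener deconvolution.

    k is the (assumed constant) noise-to-signal power ratio; the filter
    is H* / (|H|^2 + k), which reduces to inverse filtering as k -> 0.
    """
    H = np.fft.fft2(psf, s=blurred.shape)   # zero-pad PSF to image size
    G = np.fft.fft2(blurred)
    F_hat = np.conj(H) / (np.abs(H) ** 2 + k) * G
    return np.real(np.fft.ifft2(F_hat))

def haar_enhance(img, gain=2.0):
    """One-level Haar wavelet transform; amplify the detail bands (where
    thin crack edges live) by `gain`, then reconstruct.

    Assumes img has even height and width.
    """
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    LL = (a + b + c + d) / 4.0              # approximation band
    LH = (a - b + c - d) / 4.0              # detail bands
    HL = (a + b - c - d) / 4.0
    HH = (a - b - c + d) / 4.0
    LH, HL, HH = gain * LH, gain * HL, gain * HH
    out = np.empty_like(img, dtype=float)   # inverse Haar transform
    out[0::2, 0::2] = LL + LH + HL + HH
    out[0::2, 1::2] = LL - LH + HL - HH
    out[1::2, 0::2] = LL + LH - HL - HH
    out[1::2, 1::2] = LL - LH - HL + HH
    return out
```

With `gain=1.0` the Haar step is a perfect-reconstruction identity; `gain>1` sharpens fine structure at the cost of amplifying noise, which is why the Wiener step comes first.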
Automatically describing video content with natural language is a fundamental challenge of computer vision. Recurrent Neural Networks (RNNs), which model sequence dynamics, have attracted increasing attention for visual interpretation. However, most existing approaches generate a word locally with the given previous words and the visual content, while the relationship between sentence semantics and...
The emergence of wearable devices such as portable cameras and smart glasses makes it possible to record life-logging first-person videos. Browsing such long unstructured videos is time-consuming and tedious. This paper studies the discovery of moments of the user's major or special interest (i.e., highlights) in a video, for generating the summarization of first-person videos. Specifically, we propose...
Video concept learning often requires a large set of training samples. In practice, however, acquiring noise-free training labels with sufficient positive examples is very expensive. A plausible solution for training data collection is by sampling from the vast quantities of images and videos on the Web. Such a solution is motivated by the assumption that the retrieved images or videos are highly correlated...
While there has been increasing interest in the task of describing video with natural language, current computer vision algorithms are still severely limited in terms of the variability and complexity of the videos and their associated language that they can recognize. This is in part due to the simplicity of current benchmarks, which mostly focus on specific fine-grained domains with limited videos...
Video is a kind of structured data with multi-layer (ML) information; e.g., a video is composed of three layers: shot, key-frame, and region. Moreover, a multi-instance (MI) relation is embedded along the consecutive layers. Both the ML structure and the MI relation are essential for video concept detection. Previous work [5] dealt with the ML structure and MI relation by constructing a MLMI kernel...
This paper proposes a control method for a quadruped robot based on a Central Pattern Generator (CPG) and fuzzy neural networks (FNN). Quadruped robot control commonly follows one of two approaches: CPG-based control, which is inspired by bionics, and dynamic control, which is based on a model of the quadruped robot. The control result of the CPG is determined by the gait data of the...
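As a rough illustration of the CPG idea (not the paper's formulation, which additionally involves the fuzzy neural network), a CPG is often realized with nonlinear limit-cycle oscillators; a single Hopf oscillator below produces the rhythmic signal that would drive one joint, with amplitude converging to sqrt(mu) regardless of the starting state.

```python
import math

def hopf_cpg(mu=1.0, omega=2.0 * math.pi, dt=0.001, steps=10000):
    """Euler-integrate a Hopf oscillator.

    x(t) settles onto a limit cycle of radius sqrt(mu) rotating at
    angular frequency omega, giving a smooth rhythmic joint command.
    """
    x, y = 0.1, 0.0  # small perturbation off the unstable origin
    signal = []
    for _ in range(steps):
        r2 = x * x + y * y
        dx = (mu - r2) * x - omega * y   # radial convergence + rotation
        dy = (mu - r2) * y + omega * x
        x, y = x + dx * dt, y + dy * dt
        signal.append(x)
    return signal
```

In a full CPG network, several such oscillators are phase-coupled so the four legs hold a fixed gait pattern; a learned layer such as the paper's FNN would then shape or adapt these raw signals.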