Weakly supervised detection with decoupled attention-based deep representation

Wenhui Jiang; Zhicheng Zhao; Fei Su

doi:10.1007/s11042-017-5087-x

Source

Multimedia Tools and Applications, 2018, vol. 77, no. 3, pp. 3261-3277

Abstract

Training object detectors with only image-level annotations is an important problem with a variety of applications. However, due to the deformable nature of objects, a target object delineated by a bounding box always includes irrelevant context and occlusions, which cause large intra-class object variations and ambiguity in object-background distinction. For this reason, identifying the object of interest against a substantial amount of cluttered background is very challenging. In this paper, we propose a decoupled attention-based deep model to optimize region-based object representation. Unlike existing approaches that model object representation in a single-tower network, our proposed network decouples object representation into two separate modules: image representation and attention localization. The image representation module captures content-based semantic representation, while the attention localization module regresses an attention map that simultaneously highlights the locations of the discriminative object parts and down-weights the irrelevant background present in the image. The combined representation alleviates the impact of noisy context and occlusions inside an object bounding box. As a result, object-background ambiguity can be largely reduced and background regions can be suppressed effectively. In addition, the proposed object representation model can be seamlessly integrated into a state-of-the-art weakly supervised detection framework, and the entire model can be trained end-to-end. We extensively evaluate the detection performance on the PASCAL VOC 2007, VOC 2010 and VOC 2012 datasets. Experimental results demonstrate that our approach effectively improves weakly supervised object detection.
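
The abstract does not say how the two modules' outputs are combined, so the following is only an illustrative sketch of one plausible reading: a region feature computed as an attention-weighted average of the feature-map cells inside a bounding box. The function name `attention_weighted_region_feature`, the array shapes, and the normalization are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def attention_weighted_region_feature(features, attention, box):
    """Pool a region feature using an attention map as spatial weights.

    Hypothetical helper for illustration only.
    features:  array of shape (C, H, W), the image representation.
    attention: array of shape (H, W), non-negative attention scores.
    box:       (x0, y0, x1, y1) in feature-map cells, upper bounds exclusive.
    """
    x0, y0, x1, y1 = box
    region_feat = features[:, y0:y1, x0:x1]   # (C, h, w) crop of the features
    region_att = attention[y0:y1, x0:x1]      # (h, w) crop of the attention map
    # Normalize the attention inside the box so the weights sum to about 1.
    weights = region_att / (region_att.sum() + 1e-8)
    # Broadcast the (h, w) weights over channels and sum over space.
    return (region_feat * weights).sum(axis=(1, 2))


# Small example: 2 channels on a 4x4 grid, attention focused on one cell.
features = np.arange(32, dtype=float).reshape(2, 4, 4)
attention = np.zeros((4, 4))
attention[1, 2] = 1.0
pooled = attention_weighted_region_feature(features, attention, (1, 0, 4, 3))
```

With uniform attention this reduces to plain average pooling over the box. With attention concentrated on a few cells, the cells with low attention (for example background or occluders inside the box) contribute little to the pooled feature.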

Identifiers

Journal ISSN: 1380-7501
Journal e-ISSN: 1573-7721
DOI: 10.1007/s11042-017-5087-x

Authors

Wenhui Jiang

Beijing University of Posts and Telecommunications, School of Information and Communication Engineering, Beijing, China

Zhicheng Zhao

Beijing University of Posts and Telecommunications, School of Information and Communication Engineering, Beijing, China
Beijing University of Posts and Telecommunications, Beijing Key Laboratory of Network System and Network Culture, Beijing, China

Fei Su

Beijing University of Posts and Telecommunications, School of Information and Communication Engineering, Beijing, China
Beijing University of Posts and Telecommunications, Beijing Key Laboratory of Network System and Network Culture, Beijing, China

Keywords

Weak supervision; Object detection; Deep learning; Attention model

Additional information

Publication languages: English

Data set: Springer

Publisher

Springer US
