Multi-view contextual adaptation network for weakly supervised object detection in remote sensing images

Binfeng Ye, Junjie Zhang*, Yutao Rao, Rui Gao, Dan Zeng

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Weakly supervised object detection (WSOD) applies weakly supervised learning to object detection, relying only on image-level labels and thereby significantly reducing annotation costs. However, WSOD methods have notable limitations. They tend to identify only the most easily recognizable local regions of a target, which makes it difficult to delineate target boundaries accurately. Moreover, when multiple instances of the same class lie in adjacent locations, it is hard to distinguish the individual objects within that category. The complex backgrounds and dense distribution of targets in remote sensing images (RSI) further exacerbate the difficulty of weakly supervised detection. To address these issues, we propose a model termed the Multi-View Contextual Adaptation Network (VCANet). Building on the classic Online Instance Classifier Refinement (OICR) framework, we incorporate contextual adaptation perception within a multi-view learning framework and integrate a pseudo-label filtering process. The contextual adaptation perception uses information from the surrounding environment to enhance localization, guiding the model to prioritize target objects by referring to their spatially neighbouring pixels. Multi-view learning derives additional constraints from diverse perspectives, thereby revealing objects that might be overlooked under the weak supervision of a single view. The pseudo-label filtering process eliminates inaccurate pseudo-labels by identifying reliable foregrounds, mitigating overlapping proposals during label propagation. On the challenging NWPU VHR-10.v2 and DIOR datasets, we achieve promising results with mAP of 62.3% and 28.2%, respectively, surpassing existing benchmarks.
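
For readers unfamiliar with the OICR baseline named in the abstract, the following is a minimal sketch of OICR-style label propagation for one class in one image: the top-scoring proposal is taken as the seed, and proposals that overlap it strongly inherit its label. It is an illustration only and does not reproduce VCANet's contextual adaptation, multi-view, or pseudo-label filtering components, whose details are not given here. The function names, the box format (x1, y1, x2, y2), and the 0.5 IoU threshold are assumptions made for this example.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def propagate_pseudo_labels(
    boxes: List[Box], scores: List[float], iou_thresh: float = 0.5
) -> List[int]:
    """Return indices of proposals that receive a positive pseudo-label.

    The highest-scoring proposal is the seed; every proposal whose IoU
    with the seed is at least `iou_thresh` (an assumed value) is labelled
    positive. The seed itself always qualifies (IoU of 1 with itself).
    """
    seed = max(range(len(boxes)), key=lambda i: scores[i])
    return [i for i, b in enumerate(boxes) if iou(boxes[seed], b) >= iou_thresh]


if __name__ == "__main__":
    # Hypothetical proposals: two overlapping boxes and one distant box.
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.6, 0.8]
    print(propagate_pseudo_labels(boxes, scores))
```

In this toy input the distant third box scores highly but receives no label because it does not overlap the seed. This is the kind of missed adjacent or secondary instance that the abstract says single-view weak supervision can overlook.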

Original language: English
Pages (from-to): 4344-4366
Number of pages: 23
Journal: International Journal of Remote Sensing
Volume: 45
Issue number: 13
Publication status: Published - 2024
Externally published: Yes

Keywords

  • contextual adaptation
  • multi-view learning
  • remote sensing image
  • weakly supervised object detection
