DSENet++: A Coarse-to-Fine Framework for Enhanced Sub-Region Detection in Aerial Images

Xiangjie Wang, Haoran Jiang, Liang Chen, Junjie Zhang, Jian Zhang, Shiming Ge, Dan Zeng*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Object detection in aerial images remains challenging due to significant variations in object scales and uneven object distributions. Previous methods typically employ coarse-to-fine strategies, focusing initially on prominent objects and then finely detecting smaller ones within sub-regions likely to contain multiple objects. However, two essential factors of sub-regions, positional precision and detection difficulty, deserve further consideration. Moreover, object scale fluctuations within sub-regions produce certain false positives, especially in areas where objects are densely distributed. To address these issues, we propose an object-wise density-informed framework, DSENet++, which includes three consecutive stages termed “Discernment, Selection, Elevation.” Specifically, a sophisticated object-wise density map considers both object scales and angles to discern positionally precise sub-regions. Subsequently, sub-regions rated as having high detection difficulty are selected within the proposed Region Select Module, based jointly on density intensities and coarse detections. Afterward, the fine detector head is fine-tuned using the selected sub-regions in conjunction with a newly inserted adapter module, which enables features generated by the backbone to be more effectively processed by the detector head. To mitigate the impact of false positives, we devise a Train-by-False Positives training strategy. It collects false positives and clusters them adaptively to create novel pseudo-positive categories, which are combined with the original categories. The final retraining is performed on the enlarged category space to elevate the performance of fine detectors in areas where coarse detections are mediocre. Extensive experiments show that DSENet++ achieves state-of-the-art performance on three popular aerial image datasets: VisDrone, DOTA-V1.5, and SODA-A.
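The clustering step of the Train-by-False Positives strategy can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how false positives are represented or how the number of clusters is chosen adaptively, so the sketch assumes hypothetical 2-D feature vectors, a fixed cluster count `k`, and plain k-means. The names `kmeans`, `pseudo_positive_labels`, and `fp_features` are illustrative only.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic init (the first k points).

    Returns one cluster index per input point. Assumes len(points) >= k.
    """
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        assign = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def pseudo_positive_labels(fp_features, num_original_classes, k):
    """Give each false positive a new category id placed after the original ids.

    Assumption: original categories are numbered 0..num_original_classes-1,
    so the pseudo-positive categories occupy the next k ids.
    """
    clusters = kmeans(fp_features, k)
    return [num_original_classes + c for c in clusters]


# Hypothetical feature vectors of four collected false positives.
fp_features = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9)]
labels = pseudo_positive_labels(fp_features, num_original_classes=10, k=2)
print(labels)
```

In this toy setup, a detector would then be retrained over 12 categories (10 original plus 2 pseudo-positive) instead of 10, which corresponds to the enlarged category space mentioned in the abstract.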

Original language: English
Journal: IEEE Transactions on Multimedia
Publication status: Accepted/In press - 2025

Keywords

  • adapter
  • Aerial object detection
  • discernment
  • elevation
  • selection
  • train-by-false positives training strategy
