TY - JOUR
T1 - DSENet++
T2 - A Coarse-to-Fine Framework for Enhanced Sub-Region Detection in Aerial Images
AU - Wang, Xiangjie
AU - Jiang, Haoran
AU - Chen, Liang
AU - Zhang, Junjie
AU - Zhang, Jian
AU - Ge, Shiming
AU - Zeng, Dan
N1 - Publisher Copyright:
© 1999-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Object detection in aerial images remains challenging due to significant variations in object scales and uneven object distributions. Previous methods typically employ coarse-to-fine strategies, focusing initially on prominent objects and then finely detecting smaller ones within sub-regions likely to contain multiple objects. However, two essential factors of sub-regions, positional precision and detection difficulty, deserve further consideration. Moreover, object scale fluctuations within sub-regions produce certain false positives, especially in areas where objects are densely distributed. To address these issues, we propose an object-wise density-informed framework, DSENet++, which includes three consecutive stages termed “Discernment, Selection, Elevation.” Specifically, a sophisticated object-wise density map considers both object scales and angles to discern positionally precise sub-regions. Subsequently, sub-regions rated as having high detection difficulty are selected based on density intensities and coarse detections collaboratively within the proposed Region Select Module. Afterward, the fine detector head is fine-tuned using the selected sub-regions in conjunction with a newly inserted adapter module, which enables features generated by the backbone to be more effectively processed by the detector head. To mitigate the impact of false positives, we devise a Train-by-False Positives training strategy. It collects false positives and clusters them adaptively to create novel Pseudo-positive categories, which are combined with the original ones for retraining. The final retraining is performed on the enlarged category space to elevate the performance of fine detectors in areas where coarse detections are mediocre. Extensive experiments show that DSENet++ achieves state-of-the-art performance on three popular aerial image datasets: VisDrone, DOTA-V1.5, and SODA-A.
AB - Object detection in aerial images remains challenging due to significant variations in object scales and uneven object distributions. Previous methods typically employ coarse-to-fine strategies, focusing initially on prominent objects and then finely detecting smaller ones within sub-regions likely to contain multiple objects. However, two essential factors of sub-regions, positional precision and detection difficulty, deserve further consideration. Moreover, object scale fluctuations within sub-regions produce certain false positives, especially in areas where objects are densely distributed. To address these issues, we propose an object-wise density-informed framework, DSENet++, which includes three consecutive stages termed “Discernment, Selection, Elevation.” Specifically, a sophisticated object-wise density map considers both object scales and angles to discern positionally precise sub-regions. Subsequently, sub-regions rated as having high detection difficulty are selected based on density intensities and coarse detections collaboratively within the proposed Region Select Module. Afterward, the fine detector head is fine-tuned using the selected sub-regions in conjunction with a newly inserted adapter module, which enables features generated by the backbone to be more effectively processed by the detector head. To mitigate the impact of false positives, we devise a Train-by-False Positives training strategy. It collects false positives and clusters them adaptively to create novel Pseudo-positive categories, which are combined with the original ones for retraining. The final retraining is performed on the enlarged category space to elevate the performance of fine detectors in areas where coarse detections are mediocre. Extensive experiments show that DSENet++ achieves state-of-the-art performance on three popular aerial image datasets: VisDrone, DOTA-V1.5, and SODA-A.
KW - adapter
KW - Aerial object detection
KW - discernment
KW - elevation
KW - selection
KW - train-by-false positives training strategy
UR - https://www.scopus.com/pages/publications/105018862861
U2 - 10.1109/TMM.2025.3604895
DO - 10.1109/TMM.2025.3604895
M3 - Article
AN - SCOPUS:105018862861
SN - 1520-9210
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -