TY - GEN
T1 - DSENet
T2 - 2024 IEEE International Conference on Multimedia and Expo, ICME 2024
AU - Jiang, Haoran
AU - Wang, Xiangjie
AU - Zhang, Junjie
AU - Zhang, Jian
AU - Zeng, Dan
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Object detection in aerial images remains a formidable task due to substantial object scale variations and uneven object distributions. Previous methods widely adopt a coarse-to-fine methodology in which detectors first handle large-scale objects coarsely, and sub-regions that contain densely distributed small objects are then captured and detected finely. However, two pivotal assessment factors of sub-regions, positional precision and detection difficulty, deserve further consideration. In this paper, we propose the object-wise density-informed DSENet, which comprises consecutive stages termed "Discernment, Selection, Elevation". Specifically, a sophisticated object-wise density map that considers both object scales and angles helps discern more positionally precise sub-regions. Then, sub-regions with high detection difficulty are selected based jointly on density intensities and coarse detections. Finally, a fine detector head, rather than the full detector, is efficiently fine-tuned on the selected sub-regions and elevates the detections wherever the coarse results are mediocre. Extensive experiments show that DSENet achieves state-of-the-art performance on two popular aerial image datasets, VisDrone and DOTA-V1.5.
AB - Object detection in aerial images remains a formidable task due to substantial object scale variations and uneven object distributions. Previous methods widely adopt a coarse-to-fine methodology in which detectors first handle large-scale objects coarsely, and sub-regions that contain densely distributed small objects are then captured and detected finely. However, two pivotal assessment factors of sub-regions, positional precision and detection difficulty, deserve further consideration. In this paper, we propose the object-wise density-informed DSENet, which comprises consecutive stages termed "Discernment, Selection, Elevation". Specifically, a sophisticated object-wise density map that considers both object scales and angles helps discern more positionally precise sub-regions. Then, sub-regions with high detection difficulty are selected based jointly on density intensities and coarse detections. Finally, a fine detector head, rather than the full detector, is efficiently fine-tuned on the selected sub-regions and elevates the detections wherever the coarse results are mediocre. Extensive experiments show that DSENet achieves state-of-the-art performance on two popular aerial image datasets, VisDrone and DOTA-V1.5.
KW - Aerial object detection
KW - Discernment
KW - DOTA-V1.5
KW - Elevation
KW - Selection
KW - VisDrone
UR - http://www.scopus.com/inward/record.url?scp=85206569426&partnerID=8YFLogxK
U2 - 10.1109/ICME57554.2024.10688108
DO - 10.1109/ICME57554.2024.10688108
M3 - Conference Proceeding
AN - SCOPUS:85206569426
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2024 IEEE International Conference on Multimedia and Expo, ICME 2024
PB - IEEE Computer Society
Y2 - 15 July 2024 through 19 July 2024
ER -