Dual-Domain Feature-Guided Task Alignment for Enhanced Small Object Detection

Fangrui Guo, Junwei Wu, Quan Zhang*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Small object detection is a critical challenge in Unmanned Aerial Vehicle (UAV) imagery: small objects occupy few pixels, and successive pooling operations frequently cause them to disappear within intricate backgrounds. To tackle this issue, we propose the Small Object Enhancement Pyramid (SOEP) module, which first transforms feature representations from the spatial domain into the frequency domain to better capture small objects, which are typically characterized by high-frequency components. These feature representations are then fused in the spatial domain using a frequency-based attention map, enhancing small object representations by integrating information from the two complementary domains. Furthermore, we introduce a Task Aligned Head (TAH) that integrates classification and localization tasks interactively, reducing the misalignment that occurs when these tasks are learned independently, particularly for small objects. Experimental results on the VisDrone dataset verify that our proposed method (D2FTA) outperforms the baseline method by 12.7% and 14.19% on mAP0.5 and mAP0.5:0.95, respectively.
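
The dual-domain idea in the abstract (move features to the frequency domain, derive an attention map from the high-frequency content, then fuse back in the spatial domain) can be illustrated with a minimal NumPy sketch. This is not the authors' SOEP module: the function names, the fixed radial high-pass `cutoff`, the single-channel input, and the multiplicative fusion rule are all assumptions made for illustration only.

```python
import numpy as np


def high_frequency_attention(feature, cutoff=0.25, eps=1e-8):
    """Return a [0, 1] attention map highlighting high-frequency content.

    feature: 2-D array (H, W), one channel of a feature map (assumption).
    cutoff:  hypothetical radial frequency threshold in cycles per sample.
    """
    h, w = feature.shape
    # Spatial domain -> frequency domain, zero frequency moved to the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(feature))
    # Radial frequency of every bin, in the same shifted layout.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    # Keep only bins at or above the cutoff (a simple ideal high-pass filter).
    high_pass = radius >= cutoff
    # Back to the spatial domain; the imaginary part is numerical noise.
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * high_pass)).real
    # Normalise the magnitude so it can act as an attention map.
    energy = np.abs(high)
    return energy / (energy.max() + eps)


def enhance(feature, cutoff=0.25):
    """Fuse in the spatial domain: boost locations with high-frequency energy."""
    attention = high_frequency_attention(feature, cutoff)
    return feature * (1.0 + attention)


# Usage: a flat background with a single bright pixel standing in for a small object.
feature = np.ones((32, 32))
feature[10, 20] = 5.0
attention = high_frequency_attention(feature)
enhanced = enhance(feature)
print(np.unravel_index(np.argmax(attention), attention.shape))
```

In this toy input the attention map peaks at the bright pixel, because a point-like object spreads its energy across all frequencies while the flat background lives only in the zero-frequency bin. A learned module would replace the fixed cutoff and the fusion rule with trainable components.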

Original language: English
Title of host publication: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings
Editors: Bhaskar D Rao, Isabel Trancoso, Gaurav Sharma, Neelesh B. Mehta
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350368741
DOIs
Publication status: Published - 2025
Event: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Hyderabad, India
Duration: 6 Apr 2025 → 11 Apr 2025

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Country/Territory: India
City: Hyderabad
Period: 6/04/25 → 11/04/25

Keywords

  • frequency domain
  • small object detection
  • spatial domain
