Dual-Domain Feature-Guided Task Alignment for Enhanced Small Object Detection

Fangrui Guo; Junwei Wu; Quan Zhang

doi:10.1109/ICASSP49660.2025.10888564

Corresponding author: Quan Zhang

Department of Mechatronics and Robotics

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Small object detection is a critical challenge in Unmanned Aerial Vehicles (UAVs) due to the limited pixel representation of small objects and the impact of successive pooling operations, which frequently results in the disappearance of small objects within intricate backgrounds. To tackle this issue, we propose the Small Object Enhancement Pyramid (SOEP) module, which first transforms feature representations (i.e., in the spatial domain) into the frequency domain to better capture small objects typically characterized by high-frequency components. These feature representations are then fused in the spatial domain using a frequency-based attention map, enhancing small object representations by integrating information from both complementary domains. Furthermore, we introduce a Task Aligned Head (TAH) that integrates classification and localization tasks interactively, reducing the misalignment that occurs when these tasks are learned independently, particularly in the context of small objects. Experimental results on the VisDrone dataset verify that our proposed method (D²FTA) outperforms the baseline method by 12.7% and 14.19% on mAP0.5 and mAP0.5:0.95, respectively.
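As a rough illustration of the frequency-guided fusion idea described in the abstract, the following NumPy sketch builds an attention map from the high-frequency content of a feature map and uses it to reweight the spatial features. It is not the authors' SOEP module: the function names `high_frequency_attention` and `fuse_with_attention`, the `cutoff` parameter, the radial high-pass mask, and the `1 + attention` fusion rule are all assumptions chosen for illustration.

```python
import numpy as np


def high_frequency_attention(feature, cutoff=0.25):
    """Return an (H, W) attention map from high-frequency energy.

    feature: array of shape (C, H, W).
    cutoff: hypothetical radial threshold in cycles per pixel (0 to 0.5).
    """
    _, height, width = feature.shape
    # Spatial domain -> frequency domain, one 2-D FFT per channel.
    spectrum = np.fft.fft2(feature, axes=(-2, -1))
    # Radial frequency of every FFT bin.
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    high_pass = np.sqrt(fx ** 2 + fy ** 2) >= cutoff
    # Keep only high-frequency bins, then return to the spatial domain.
    high = np.fft.ifft2(spectrum * high_pass, axes=(-2, -1)).real
    # Average high-frequency magnitude over channels, scaled to [0, 1).
    energy = np.abs(high).mean(axis=0)
    return energy / (energy.max() + 1e-8)


def fuse_with_attention(feature, cutoff=0.25):
    """Reweight spatial features with the frequency-based attention map."""
    attention = high_frequency_attention(feature, cutoff)
    # Assumed fusion rule: residual reweighting, so smooth regions pass through.
    return feature * (1.0 + attention[None, :, :])


if __name__ == "__main__":
    features = np.zeros((2, 16, 16))
    features[:, 5, 7] = 1.0  # a one-pixel "small object"
    attention = high_frequency_attention(features)
    print(np.unravel_index(attention.argmax(), attention.shape))
```

In this toy input the single bright pixel receives the largest attention weight, while a constant feature map is left essentially unchanged, which mirrors the intuition that small objects are dominated by high-frequency components.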

Original language	English
Title of host publication	2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings
Editors	Bhaskar D Rao, Isabel Trancoso, Gaurav Sharma, Neelesh B. Mehta
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9798350368741
DOIs	https://doi.org/10.1109/ICASSP49660.2025.10888564
Publication status	Published - 2025
Event	2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Hyderabad, India; Duration: 6 Apr 2025 → 11 Apr 2025

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print)	1520-6149

Conference

Conference	2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Country/Territory	India
City	Hyderabad
Period	6/04/25 → 11/04/25

Keywords

frequency domain
small object detection
spatial domain

Access to Document

10.1109/ICASSP49660.2025.10888564

Cite this

Guo, F., Wu, J., & Zhang, Q. (2025). Dual-Domain Feature-Guided Task Alignment for Enhanced Small Object Detection. In B. D. Rao, I. Trancoso, G. Sharma, & N. B. Mehta (Eds.), 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP49660.2025.10888564

Guo, Fangrui ; Wu, Junwei ; Zhang, Quan. / Dual-Domain Feature-Guided Task Alignment for Enhanced Small Object Detection. 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings. editor / Bhaskar D Rao ; Isabel Trancoso ; Gaurav Sharma ; Neelesh B. Mehta. Institute of Electrical and Electronics Engineers Inc., 2025. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

@inproceedings{c640f3c68f6f4ae182d67dee9902caa4,
  title = "Dual-Domain Feature-Guided Task Alignment for Enhanced Small Object Detection",
  abstract = "Small object detection is a critical challenge in Unmanned Aerial Vehicles (UAVs) due to the limited pixel representation of small objects and the impact of successive pooling operations, which frequently results in the disappearance of small objects within intricate backgrounds. To tackle this issue, we propose the Small Object Enhancement Pyramid (SOEP) module, which first transforms feature representations (i.e., in the spatial domain) into the frequency domain to better capture small objects typically characterized by high-frequency components. These feature representations are then fused in the spatial domain using a frequency-based attention map, enhancing small object representations by integrating information from both complementary domains. Furthermore, we introduce a Task Aligned Head (TAH) that integrates classification and localization tasks interactively, reducing the misalignment that occurs when these tasks are learned independently, particularly in the context of small objects. Experimental results on the VisDrone dataset verify that our proposed method (D$^2$FTA) outperforms the baseline method by 12.7\% and 14.19\% on mAP0.5 and mAP0.5:0.95, respectively.",
  keywords = "frequency domain, small object detection, spatial domain",
  author = "Fangrui Guo and Junwei Wu and Quan Zhang",
  note = "Publisher Copyright: {\textcopyright} 2025 IEEE.; 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 ; Conference date: 06-04-2025 Through 11-04-2025",
  year = "2025",
  doi = "10.1109/ICASSP49660.2025.10888564",
  language = "English",
  series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
  publisher = "Institute of Electrical and Electronics Engineers Inc.",
  editor = "Rao, {Bhaskar D} and Isabel Trancoso and Gaurav Sharma and Mehta, {Neelesh B.}",
  booktitle = "2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings",
}

Guo, F, Wu, J & Zhang, Q 2025, Dual-Domain Feature-Guided Task Alignment for Enhanced Small Object Detection. in BD Rao, I Trancoso, G Sharma & NB Mehta (eds), 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025, Hyderabad, India, 6/04/25. https://doi.org/10.1109/ICASSP49660.2025.10888564

Dual-Domain Feature-Guided Task Alignment for Enhanced Small Object Detection. / Guo, Fangrui; Wu, Junwei; Zhang, Quan.
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings. ed. / Bhaskar D Rao; Isabel Trancoso; Gaurav Sharma; Neelesh B. Mehta. Institute of Electrical and Electronics Engineers Inc., 2025. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

TY  - GEN
T1  - Dual-Domain Feature-Guided Task Alignment for Enhanced Small Object Detection
AU  - Guo, Fangrui
AU  - Wu, Junwei
AU  - Zhang, Quan
PY  - 2025
Y1  - 2025
AB  - Small object detection is a critical challenge in Unmanned Aerial Vehicles (UAVs) due to the limited pixel representation of small objects and the impact of successive pooling operations, which frequently results in the disappearance of small objects within intricate backgrounds. To tackle this issue, we propose the Small Object Enhancement Pyramid (SOEP) module, which first transforms feature representations (i.e., in the spatial domain) into the frequency domain to better capture small objects typically characterized by high-frequency components. These feature representations are then fused in the spatial domain using a frequency-based attention map, enhancing small object representations by integrating information from both complementary domains. Furthermore, we introduce a Task Aligned Head (TAH) that integrates classification and localization tasks interactively, reducing the misalignment that occurs when these tasks are learned independently, particularly in the context of small objects. Experimental results on the VisDrone dataset verify that our proposed method (D²FTA) outperforms the baseline method by 12.7% and 14.19% on mAP0.5 and mAP0.5:0.95, respectively.
KW  - frequency domain
KW  - small object detection
KW  - spatial domain
UR  - http://www.scopus.com/inward/record.url?scp=105003877954&partnerID=8YFLogxK
U2  - 10.1109/ICASSP49660.2025.10888564
DO  - 10.1109/ICASSP49660.2025.10888564
M3  - Conference Proceeding
AN  - SCOPUS:105003877954
T3  - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
BT  - 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings
A2  - Rao, Bhaskar D
A2  - Trancoso, Isabel
A2  - Sharma, Gaurav
A2  - Mehta, Neelesh B.
PB  - Institute of Electrical and Electronics Engineers Inc.
T2  - 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Y2  - 6 April 2025 through 11 April 2025
ER  -

Guo F, Wu J, Zhang Q. Dual-Domain Feature-Guided Task Alignment for Enhanced Small Object Detection. In Rao BD, Trancoso I, Sharma G, Mehta NB, editors, 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2025. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). doi: 10.1109/ICASSP49660.2025.10888564
