TY - GEN
T1 - MSFF-FCOS
T2 - 6th International Conference on Natural Language Processing, ICNLP 2024
AU - Xu, Sidi
AU - Wang, Dianwei
AU - Fang, Jie
AU - Li, Yuanqing
AU - Xu, Zhijie
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Object detection in unmanned aerial vehicle (UAV) imagery plays an important role in environmental monitoring, post-disaster rescue, and other fields. However, detecting objects in aerial images is challenging because small objects are densely distributed and carry little detailed information. To address these issues, we propose MSFF-FCOS, a new object detection method for UAV aerial images based on an improved FCOS with multi-scale feature interaction. First, a feature extraction module built on the Contextual Transformer module (CoTM) is proposed to preserve the effective information of small objects and enhance their feature representation. Then, a new feature fusion network is designed that incorporates parallel dilated convolution modules to capture multi-scale contextual information. Finally, the Dual Weighting Label Assignment (DWLA) method is used to improve the positional accuracy of small targets in UAV aerial images and to reduce missed detections. Experimental results show that the average precision (AP) of the proposed method reaches 24.0% on the VisDrone2019 dataset, which is 3.4% higher than that of the original FCOS method. Furthermore, the method generalizes better on our self-collected dataset.
AB - Object detection in unmanned aerial vehicle (UAV) imagery plays an important role in environmental monitoring, post-disaster rescue, and other fields. However, detecting objects in aerial images is challenging because small objects are densely distributed and carry little detailed information. To address these issues, we propose MSFF-FCOS, a new object detection method for UAV aerial images based on an improved FCOS with multi-scale feature interaction. First, a feature extraction module built on the Contextual Transformer module (CoTM) is proposed to preserve the effective information of small objects and enhance their feature representation. Then, a new feature fusion network is designed that incorporates parallel dilated convolution modules to capture multi-scale contextual information. Finally, the Dual Weighting Label Assignment (DWLA) method is used to improve the positional accuracy of small targets in UAV aerial images and to reduce missed detections. Experimental results show that the average precision (AP) of the proposed method reaches 24.0% on the VisDrone2019 dataset, which is 3.4% higher than that of the original FCOS method. Furthermore, the method generalizes better on our self-collected dataset.
KW - Contextual Transformer module
KW - improved FCOS
KW - object detection
KW - parallel dilated convolution
KW - UAV aerial imagery
UR - http://www.scopus.com/inward/record.url?scp=85207650029&partnerID=8YFLogxK
U2 - 10.1109/ICNLP60986.2024.10692882
DO - 10.1109/ICNLP60986.2024.10692882
M3 - Conference Proceeding
AN - SCOPUS:85207650029
T3 - 2024 6th International Conference on Natural Language Processing, ICNLP 2024
SP - 515
EP - 519
BT - 2024 6th International Conference on Natural Language Processing, ICNLP 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 22 March 2024 through 24 March 2024
ER -