TY - JOUR
T1 - A Pyramid Attention Network With Edge Information Injection for Remote-Sensing Object Detection
AU - Zhang, Junjie
AU - Ding, Anqi
AU - Li, Guanyi
AU - Zhang, Liangang
AU - Zeng, Dan
N1 - Publisher Copyright: © 2004-2012 IEEE.
PY - 2023
Y1 - 2023
N2 - Remote-sensing images (RSIs) are often characterized by high spatial resolution, strong object scale effects, and complex scenes, which pose great challenges to object detection. Although mainstream neural network-based methods work well in detecting common objects, they often fail to fully exploit the detailed structural information in the spatial domain, leading to poor performance on objects with diverse scales and distributions against complicated backgrounds. To address this issue, we propose a pyramid attention network with edge information injection for remote-sensing object detection (RSOD). Each object consists of an inner body and an outer profile, which correspond to the low- and high-frequency (LF and HF) components of the image, respectively, so the difference between the original image and its LF component helps recover the HF counterpart. We design the edge information extraction module (EIEM) to mine detailed edge features at multiple scales and inject them into the features at the corresponding scales in the backbone network. To improve performance in complex scenes, we introduce a pyramid feature fusion (PFF) module, which leverages both local and global attention to establish long-range channel dependencies, thereby highlighting the objects that deserve attention. To verify the effectiveness of the proposed method, we conduct extensive experiments on the detection in optical remote sensing images (DIOR) and RSOD datasets, where the mean average precision (mAP) reaches 74.93% and 96.44%, respectively, demonstrating that our model achieves state-of-the-art (SOTA) performance compared with mainstream methods.
KW - Edge information injection
KW - object detection
KW - pyramid feature fusion (PFF)
UR - http://www.scopus.com/inward/record.url?scp=85164730437&partnerID=8YFLogxK
U2 - 10.1109/LGRS.2023.3294395
DO - 10.1109/LGRS.2023.3294395
M3 - Article
AN - SCOPUS:85164730437
SN - 1545-598X
VL - 20
JO - IEEE Geoscience and Remote Sensing Letters
JF - IEEE Geoscience and Remote Sensing Letters
M1 - 6007205
ER -