TY - JOUR
T1 - A visual knowledge oriented approach for weakly supervised remote sensing object detection
AU - Zhang, Junjie
AU - Ye, Binfeng
AU - Zhang, Qiming
AU - Gong, Yongshun
AU - Lu, Jianfeng
AU - Zeng, Dan
N1 - Publisher Copyright: © 2024 Elsevier B.V.
PY - 2024/9/7
Y1 - 2024/9/7
AB - Weakly supervised learning poses significant challenges in remote sensing (RS) object detection due to the lack of precise instance annotations. This issue becomes particularly pronounced when dealing with complex backgrounds and densely arranged targets in RS images. To address these limitations, we propose a visual knowledge oriented approach that leverages visual cues as pseudo labels, thereby enhancing the supervision for object detection. The visual knowledge is explored mainly from two perspectives. First, because annotations are available only at the image level, we aggregate objects of the same type across a group of images that share related semantic concepts, which allows us to infer instance-level annotations through collective knowledge. Second, owing to the bird's-eye view of RS images, certain object categories display distinctive visual patterns that are identifiable via expert knowledge. Specifically, with the multi-instance self-training framework as our base model, we establish the correlation among images sharing the same class labels and use co-saliency to extract the regions of common interest, thereby obtaining initial foregrounds in each image. Moreover, by leveraging the expert knowledge of class-specific visual patterns, we refine the pseudo labels and strengthen the foreground feature extraction by incorporating low-level visual cues. To further stabilize the training process and address potential noise in object proposals, we incorporate a two-stage training strategy to refine initial predictions. We validate the effectiveness of the proposed approach on two benchmark datasets, NWPU VHR-10.v2 and DIOR, achieving mAP of 84.25% and 27.5%, respectively, and significantly outperforming trending methods.
KW - Co-saliency segmentation
KW - Expert knowledge
KW - Remote sensing images
KW - Visual knowledge
KW - Weakly-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85197385237&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2024.128114
DO - 10.1016/j.neucom.2024.128114
M3 - Article
AN - SCOPUS:85197385237
SN - 0925-2312
VL - 597
JO - Neurocomputing
JF - Neurocomputing
M1 - 128114
ER -