TY - JOUR
T1 - Enhanced online CAM
T2 - Single-stage weakly supervised semantic segmentation via collaborative guidance
AU - Zhang, Bingfeng
AU - Gao, Xuru
AU - Yu, Siyue
AU - Liu, Weifeng
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2024/12
Y1 - 2024/12
AB - Weakly supervised semantic segmentation with image-level annotations usually adopts multi-stage approaches, where high-quality offline CAM is generated as pseudo labels for further training, leading to a complex training process. Instead, current single-stage approaches, directly learning to segment objects with online CAM from image-level supervision, are more elegant. The quality of CAM critically determines the final segmentation performance. However, how to generate high-quality online CAM has not been deeply studied in existing single-stage methods. In this paper, we propose a new single-stage framework to mine more relevant target features for enhanced online CAM. Specifically, we design a novel Collaborative Guidance Mechanism, where a prior guidance block uses the original CAM to produce class-specific feature representations, improving the quality of online CAM. However, such a prior is sensitive to discriminative regions of objects. Thus, we further propose a prior fusion block, in which the online segmentation prediction and the original CAM are fused to strengthen the prior guidance. Extensive experiments show that our approach achieves new state-of-the-art performance on both PASCAL VOC 2012 and MS COCO 2014 datasets, outperforming recent single-stage methods by a clear margin. Code is available at https://github.com/1rua11/CGM
KW - CAM
KW - Semantic segmentation
KW - Single-stage
KW - Weakly supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85199682260&partnerID=8YFLogxK
U2 - 10.1016/j.patcog.2024.110787
DO - 10.1016/j.patcog.2024.110787
M3 - Article
AN - SCOPUS:85199682260
SN - 0031-3203
VL - 156
JO - Pattern Recognition
JF - Pattern Recognition
M1 - 110787
ER -