TY - GEN
T1 - Diversity-oriented Contrastive Learning for RGB-T Scene Parsing
AU - Liu, Hengyan
AU - Ren, Guangyu
AU - Dai, Tianhong
AU - Zhang, Di
AU - Xu, Pengjing
AU - Zhang, Wenzhang
AU - Hu, Bintao
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2024/5/30
Y1 - 2024/5/30
AB - Scene parsing gains improvement under poor lighting conditions by leveraging complementary information from thermal images. However, there are inherent gaps between different modalities and most existing methods propose network modules to reduce the gap. Contrastive learning has gained prominence in the field of computer vision by enabling models to capture rich feature representations through the comparison of positive and negative samples. To extend its applicability to scene parsing, we propose a unique method to enhance the effectiveness of semantic segmentation without negative samples. A determinantal point processes (DPP) based method is proposed to minimize the similarity between relevant image inputs, specifically focusing on learning the intrinsic features between RGB and its thermal images. We further consider the deployment of this task and introduce an efficient vision transformer as a backbone for feature extraction. Our final model achieves a reasonable balance between model size and accuracy.
KW - Contrastive Learning
KW - Determinantal point processes
KW - Scene Parsing
UR - http://www.scopus.com/inward/record.url?scp=85195477309&partnerID=8YFLogxK
U2 - 10.1109/SMC-IoT62253.2023.00035
DO - 10.1109/SMC-IoT62253.2023.00035
M3 - Conference Proceeding
AN - SCOPUS:85195477309
T3 - International Conference on Sensing, Measurement, Communication and Internet of Things Technologies (SMC-IoT)
SP - 155
EP - 160
BT - Proceedings - 2023 2nd International Conference on Sensing, Measurement, Communication and Internet of Things Technologies, SMC-IoT 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2nd International Conference on Sensing, Measurement, Communication and Internet of Things Technologies, SMC-IoT 2023
Y2 - 29 December 2023 through 31 December 2023
ER -