Diversity-oriented Contrastive Learning for RGB-T Scene Parsing

Hengyan Liu*, Guangyu Ren, Tianhong Dai, Di Zhang, Pengjing Xu, Wenzhang Zhang, Bintao Hu*

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

Abstract

Scene parsing under poor lighting conditions can be improved by leveraging complementary information from thermal images. However, there are inherent gaps between the two modalities, and most existing methods introduce dedicated network modules to reduce this gap. Contrastive learning has gained prominence in computer vision by enabling models to capture rich feature representations through the comparison of positive and negative samples. To extend its applicability to scene parsing, we propose a method that enhances semantic segmentation without negative samples. A determinantal point process (DPP) based objective is proposed to minimize the similarity between related image inputs, focusing on learning the intrinsic features shared between RGB images and their thermal counterparts. We further consider the practical deployment of this task and adopt an efficient vision transformer as the backbone for feature extraction. Our final model achieves a reasonable balance between model size and accuracy.
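The abstract does not spell out the exact loss, but the DPP idea can be illustrated with a generic log-determinant diversity objective over paired RGB and thermal embeddings: maximizing the determinant of a similarity kernel penalizes redundant (highly similar) feature vectors, so no negative samples are needed. The sketch below is a hypothetical illustration of that principle, not the authors' implementation; the backbone call, feature shapes, and loss weighting are assumptions.

# Hypothetical sketch of a DPP-style (log-determinant) diversity loss for
# paired RGB and thermal features; not the paper's actual formulation.
import torch
import torch.nn.functional as F

def dpp_diversity_loss(rgb_feats: torch.Tensor,
                       thermal_feats: torch.Tensor,
                       eps: float = 1e-4) -> torch.Tensor:
    """rgb_feats, thermal_feats: (B, D) pooled embeddings from the two branches."""
    # L2-normalize so the Gram matrix entries are cosine similarities.
    feats = torch.cat([rgb_feats, thermal_feats], dim=0)   # (2B, D)
    feats = F.normalize(feats, dim=1)

    # Similarity kernel over the joint RGB + thermal set.
    gram = feats @ feats.t()                                # (2B, 2B)

    # log det(K + eps*I) grows when rows are mutually dissimilar (diverse)
    # and shrinks when they are redundant; eps*I keeps K positive definite.
    identity = torch.eye(gram.size(0), device=gram.device)
    return -torch.logdet(gram + eps * identity)

# Usage sketch (assumed names and shapes):
# rgb_emb = vit_backbone(rgb_images)      # (B, D)
# thr_emb = vit_backbone(thermal_images)  # (B, D)
# loss = seg_loss + lambda_div * dpp_diversity_loss(rgb_emb, thr_emb)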
Original language: English
Publication status: Accepted/In press - 2023
Event
The 2nd International Conference on Sensing, Measurement, Communication and Internet of Things Technologies
Duration: 29 Dec 2023 – 31 Dec 2023


