TY - GEN
T1 - Multimodal Frequency Spectrum Fusion Schema for RGB-T Image Semantic Segmentation
AU - Liu, Hengyan
AU - Zhang, Wenzhang
AU - Dai, Tianhong
AU - Yin, Longfei
AU - Ren, Guangyu
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Semantic segmentation confronts challenges with traditional networks tailored exclusively for RGB inputs, which may suffer from quality degradation under adverse conditions like low-level illumination or inclement weather. Recent advancements have shown promising outcomes by integrating RGB images with corresponding thermal infrared (TIR) images. However, effectively fusing features from both modalities remains a significant challenge. In this paper, we introduce a novel approach termed Multimodal Frequency Spectrum Fusion Schema (MFSFS) for semantic segmentation of RGB-T images. MFSFS leverages the advantages of the frequency spectrum to effectively extract and utilize multimodal feature information. To mitigate redundant information's adverse effects during multimodal fusion in the frequency domain, we propose a diversity-oriented contrastive learning approach. Simulation results demonstrate that MFSFS achieves competitive performance while maintaining a relatively smaller model size.
KW - Contrastive Learning
KW - Determinantal point processes
KW - Frequency Spectrum
KW - Multimodal Fusion
KW - Semantic Segmentation
UR - http://www.scopus.com/inward/record.url?scp=85203269718&partnerID=8YFLogxK
U2 - 10.1109/ICCCN61486.2024.10637614
DO - 10.1109/ICCCN61486.2024.10637614
M3 - Conference Proceeding
AN - SCOPUS:85203269718
T3 - Proceedings - International Conference on Computer Communications and Networks, ICCCN
BT - ICCCN 2024 - 2024 33rd International Conference on Computer Communications and Networks
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 33rd International Conference on Computer Communications and Networks, ICCCN 2024
Y2 - 29 July 2024 through 31 July 2024
ER -