Multimodal Frequency Spectrum Fusion Schema for RGB-T Image Semantic Segmentation

Hengyan Liu*, Wenzhang Zhang, Tianhong Dai, Longfei Yin, Guangyu Ren

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

1 Citation (Scopus)

Abstract

Semantic segmentation networks designed exclusively for RGB inputs can degrade in quality under adverse conditions such as low illumination or inclement weather. Recent work has shown promising results by combining RGB images with corresponding thermal infrared (TIR) images. However, effectively fusing features from the two modalities remains a significant challenge. In this paper, we introduce a novel approach termed Multimodal Frequency Spectrum Fusion Schema (MFSFS) for semantic segmentation of RGB-T images. MFSFS leverages the frequency spectrum to extract and utilize multimodal feature information. To mitigate the adverse effects of redundant information during multimodal fusion in the frequency domain, we propose a diversity-oriented contrastive learning approach. Simulation results demonstrate that MFSFS achieves competitive performance while maintaining a relatively small model size.
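
As a rough illustration of what fusing two modalities in the frequency domain can mean, the sketch below blends the Fourier amplitude spectra of an RGB feature map and a TIR feature map while keeping the RGB phase. This is not the MFSFS method, whose details are not given in the abstract. The function name `fuse_amplitude_phase`, the blending weight `alpha`, and the amplitude/phase fusion rule are all assumptions chosen for illustration.

```python
import numpy as np


def fuse_amplitude_phase(rgb_feat, tir_feat, alpha=0.5):
    """Blend two same-shaped feature maps in the Fourier domain.

    Hypothetical fusion rule (an assumption, not the paper's method):
    amplitude = alpha * |F(rgb)| + (1 - alpha) * |F(tir)|, phase taken from RGB.
    """
    if rgb_feat.shape != tir_feat.shape:
        raise ValueError("feature maps must have the same shape")

    # 2D FFT over the last two (spatial) axes of each feature map.
    rgb_spec = np.fft.fft2(rgb_feat)
    tir_spec = np.fft.fft2(tir_feat)

    # Blend the amplitude spectra; keep the phase of the RGB branch.
    amplitude = alpha * np.abs(rgb_spec) + (1.0 - alpha) * np.abs(tir_spec)
    phase = np.angle(rgb_spec)

    # Rebuild a complex spectrum and return to the spatial domain.
    fused_spec = amplitude * np.exp(1j * phase)
    return np.fft.ifft2(fused_spec).real


# Usage with random stand-ins for real feature maps.
rng = np.random.default_rng(0)
rgb = rng.standard_normal((8, 8))
tir = rng.standard_normal((8, 8))
fused = fuse_amplitude_phase(rgb, tir, alpha=0.5)
```

With `alpha=1.0` the function returns the RGB feature map unchanged (up to floating-point error), which is a convenient sanity check. A learned fusion would typically replace the fixed weight with trainable parameters.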

Original language: English
Title of host publication: ICCCN 2024 - 2024 33rd International Conference on Computer Communications and Networks
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350384611
Publication status: Published - 2024
Event: 33rd International Conference on Computer Communications and Networks, ICCCN 2024 - Big Island, United States
Duration: 29 Jul 2024 → 31 Jul 2024

Publication series

Name: Proceedings - International Conference on Computer Communications and Networks, ICCCN
ISSN (Print): 1095-2055

Conference

Conference: 33rd International Conference on Computer Communications and Networks, ICCCN 2024
Country/Territory: United States
City: Big Island
Period: 29/07/24 → 31/07/24

Keywords

  • Contrastive Learning
  • Determinantal point processes
  • Frequency Spectrum
  • Multimodal Fusion
  • Semantic Segmentation
