Representation distribution matching and dynamic routing interaction for multimodal sentiment analysis

Zuhe Li, Zhenwei Huang, Xiaojiang He, Jun Yu, Haoran Chen, Chenguang Yang, Yushan Pan*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

To address the challenges of distribution discrepancies between modalities, underutilization of representations during fusion, and homogenization of fused representations in cross-modal interactions, we introduce a multimodal sentiment analysis (MSA) framework called representation distribution matching interaction, which extracts and interprets emotional cues from video data. The framework includes a representation distribution matching module that uses an adversarial cyclic translation network to align the representation distributions of nontextual modalities with those of the textual modality, preserving semantic information while reducing distribution gaps. We also develop a dynamic routing interaction module that combines four distinct components into a routing interaction space, making efficient use of modality representations for more effective emotional learning. To combat homogenization, we propose a cross-modal interaction optimization mechanism that maximizes the differences among fused representations and enhances their mutual information with the target modalities, yielding more discriminative fused representations. Extensive experiments on the MOSI and MOSEI datasets confirm the effectiveness of our MSA framework.
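The cross-modal interaction optimization mechanism described above has two opposing pressures: pushing fused representations apart to avoid homogenization, and pulling each fused representation toward its target modality as a proxy for increasing mutual information. The following is a minimal illustrative sketch of such a two-term objective; the function name, the cosine-similarity proxy for mutual information, and the `alpha`/`beta` weights are assumptions for illustration, not the paper's actual loss.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def interaction_optimization_loss(fused_reps, target_reps, alpha=1.0, beta=1.0):
    """Toy two-term objective in the spirit of the abstract:
    - penalize similarity among fused representations (anti-homogenization),
    - reward similarity between each fused representation and its target
      modality (a crude stand-in for maximizing mutual information).
    Lower loss means more diverse, better-aligned fused representations."""
    n = len(fused_reps)
    # Homogenization penalty: mean pairwise similarity among fused reps.
    pairwise = [cosine(fused_reps[i], fused_reps[j])
                for i in range(n) for j in range(i + 1, n)]
    homogenization = sum(pairwise) / max(len(pairwise), 1)
    # Alignment reward: mean similarity of each fused rep to its target.
    alignment = sum(cosine(f, t) for f, t in zip(fused_reps, target_reps)) / n
    return alpha * homogenization - beta * alignment
```

In a real model these terms would be differentiable losses over batches of learned representations; the sketch only shows how the two pressures trade off.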

Original language: English
Article number: 113376
Journal: Knowledge-Based Systems
Volume: 316
DOIs
Publication status: Published - 12 May 2025

Keywords

  • Cross-modal interaction optimization
  • Distribution matching
  • Multimodal sentiment analysis
  • Route network
