CSGI-Net: A Cross-Sample Graph Interaction Network for Multimodal Sentiment Analysis

  • Erlin Tian
  • Shuai Zhao*
  • Zuhe Li
  • Haoran Chen
  • Yifan Gao
  • Yushan Pan

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

As multimodal data becomes widely used in sentiment analysis, effectively integrating information from different modalities to improve accuracy and robustness has become a critical issue. Current Transformer-based fusion methods have improved inter-modal interaction and alignment to some extent, but two challenges still keep multimodal models from fully using modality-specific information: intra-modal feature complexity is neglected, and optimization is imbalanced across modalities. To address these challenges, we propose a novel multimodal sentiment analysis model, the Cross-Sample Graph Interaction Network (CSGI-Net). Specifically, CSGI-Net lets each sample interact with and learn from its similar samples within the same modality, thereby capturing the emotional characteristics that similar samples share. During training, CSGI-Net quantifies the optimization differences between modalities and dynamically adjusts the optimization amplitude based on these differences, giving under-optimized modalities more opportunity to improve. Experimental results demonstrate that CSGI-Net achieves superior performance on two major multimodal sentiment analysis datasets: CMU-MOSI and CMU-MOSEI.
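
The two mechanisms named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the abstract does not specify how similar samples are selected, how their features are aggregated, or how the optimization amplitude is computed. The sketch assumes cosine-similarity k-nearest-neighbour graphs, a single mean-aggregation step (a simplified, unweighted form of graph convolution), and a hypothetical tanh-based rule that damps updates for modalities scoring above the cross-modal mean. All function names, parameters, and toy values are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def knn_adjacency(features, k):
    """Assumed graph construction: link each sample to its k most similar
    samples of the same modality, plus a self-loop."""
    n = len(features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sims = [(cosine(features[i], features[j]), j) for j in range(n) if j != i]
        sims.sort(reverse=True)
        adj[i][i] = 1.0
        for _, j in sims[:k]:
            adj[i][j] = 1.0
    return adj


def propagate(features, adj):
    """One mean-aggregation step: each sample becomes the average of itself
    and its graph neighbours (no learned weights in this sketch)."""
    n, dim = len(features), len(features[0])
    out = []
    for i in range(n):
        degree = sum(adj[i])  # at least 1.0 because of the self-loop
        out.append([
            sum(adj[i][j] * features[j][d] for j in range(n)) / degree
            for d in range(dim)
        ])
    return out


def modulation_coefficients(scores, alpha=1.0):
    """Hypothetical balancing rule: a modality whose score exceeds the mean
    has its update scaled below 1.0; the others keep a coefficient of 1.0."""
    mean = sum(scores.values()) / len(scores)
    return {
        name: 1.0 - math.tanh(alpha * max(0.0, score - mean))
        for name, score in scores.items()
    }


# Toy data: four samples of one modality, forming two obvious similarity groups.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_adj = knn_adjacency(toy_features, k=1)
toy_smoothed = propagate(toy_features, toy_adj)

# Toy per-modality scores (made-up numbers, not results from the paper).
toy_coeffs = modulation_coefficients({"text": 0.9, "audio": 0.5, "video": 0.4})
```

In a real training loop, coefficients like these would typically multiply each modality encoder's gradients or learning rate, so that the modality currently ahead is slowed while the others continue to update at full strength.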

Original language: English
Article number: 3493
Journal: Electronics (Switzerland)
Volume: 14
Issue number: 17
Publication status: Published - Sept 2025

Keywords

  • cross-sample graph interaction
  • graph convolutional networks
  • multimodal fusion
  • multimodal sentiment analysis
