Hierarchical denoising representation disentanglement and dual-channel cross-modal-context interaction for multimodal sentiment analysis

Zuhe Li, Zhenwei Huang, Yushan Pan*, Jun Yu, Weihua Liu, Haoran Chen, Yiming Luo, Di Wu, Hao Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Multimodal sentiment analysis aims to extract sentiment cues from multiple modalities, such as textual, acoustic, and visual data, and to combine them to determine the sentiment polarity of the data. Despite significant progress in the field, three challenges persist: handling noisy features in modal representations, closing the substantial gaps in sentiment information among modal representations, and exploring contextual information that expresses different sentiments across modalities. To tackle these challenges, this paper proposes a new Multimodal Sentiment Analysis (MSA) framework. First, we introduce the Hierarchical Denoising Representation Disentanglement module (HDRD), which applies hierarchical disentanglement to extract both common and private sentiment information while removing interfering noise from modal representations. Second, to address the uneven distribution of sentiment information among modalities, our Inter-Modal Representation Enhancement module (IMRE) enhances non-textual representations by extracting, from the textual representations, the sentiment information related to them. Third, we introduce a new interaction mechanism, the Dual-Channel Cross-Modal Context Interaction module (DCCMCI). This module not only mines correlated contextual sentiment information within modalities but also explores positively and negatively correlated contextual sentiment information between modalities. Extensive experiments on two benchmark datasets, MOSI and MOSEI, indicate that the proposed method achieves state-of-the-art performance.

Original language: English
Article number: 124236
Journal: Expert Systems with Applications
Volume: 252
Publication status: Published - 15 Oct 2024

Keywords

  • Cross-modal context interaction
  • Hierarchical disentanglement
  • Inter-modal enhancement
  • Multimodal sentiment analysis
