BC-PMJRS: A Brain Computing-inspired Predefined Multimodal Joint Representation Spaces for enhanced cross-modal learning

Jiahao Qin; Feng Liu; Lu Zong

doi:10.1016/j.neunet.2025.107449

Jiahao Qin, Feng Liu^*, Lu Zong

^*Corresponding author for this work

Department of Financial and Actuarial Mathematics

Research output: Contribution to journal › Article › peer-review

Abstract

Multimodal learning faces two key challenges: effectively fusing complex information from different modalities, and designing efficient mechanisms for cross-modal interactions. Inspired by neural plasticity and information processing principles in the human brain, this paper proposes BC-PMJRS, a Brain Computing-inspired Predefined Multimodal Joint Representation Spaces method to enhance cross-modal learning. The method learns the joint representation space through two complementary optimization objectives: (1) minimizing mutual information between representations of different modalities to reduce redundancy and (2) maximizing mutual information between joint representations and sentiment labels to improve task-specific discrimination. These objectives are balanced dynamically using an adaptive optimization strategy inspired by long-term potentiation (LTP) and long-term depression (LTD) mechanisms. Furthermore, we significantly reduce the computational complexity of modal interactions by leveraging a global–local cross-modal interaction mechanism, analogous to selective attention in the brain. Experimental results on the IEMOCAP, MOSI, and MOSEI datasets demonstrate that BC-PMJRS outperforms state-of-the-art models in both complete and incomplete modality settings, achieving up to a 1.9% improvement in weighted-F1 on IEMOCAP, a 2.8% gain in 7-class accuracy on MOSI, and a 2.9% increase in 7-class accuracy on MOSEI. These substantial improvements across multiple datasets demonstrate that incorporating brain-inspired mechanisms, particularly the dynamic balance of information redundancy and task relevance through neural plasticity principles, effectively enhances multimodal learning. This work bridges neuroscience principles with multimodal machine learning, offering new insights for developing more effective and biologically plausible models.
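Read literally, the two optimization objectives in the abstract can be combined into a single training loss. The following is only a sketch of that reading, not the paper's actual formulation. The symbols are assumptions introduced here for illustration: $z_m$ is the representation of modality $m$, $z$ is the joint representation, $y$ is the sentiment label, $I(\cdot\,;\cdot)$ is mutual information, and $\alpha_t, \beta_t$ are hypothetical non-negative weights at training step $t$.

```latex
% Hypothetical combined objective (notation assumed, not taken from the paper)
\[
\mathcal{L}_t
  = \alpha_t \sum_{m < n} I(z_m ; z_n)
  \;-\; \beta_t \, I(z ; y),
\qquad \alpha_t,\ \beta_t \ge 0 .
\]
```

Minimizing the first term reduces redundancy between modality representations. The minus sign on the second term means that minimizing $\mathcal{L}_t$ maximizes the mutual information between the joint representation and the label. The step-dependent weights stand in for the adaptive, LTP/LTD-inspired balancing; the abstract does not state how they are updated, so no update rule is given here.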

Original language	English
Article number	107449
Journal	Neural Networks
Volume	188
DOIs	https://doi.org/10.1016/j.neunet.2025.107449
Publication status	Published - Aug 2025

Keywords

Brain-inspired computing
Global–local cross-modal interaction
Joint representation learning
Multimodal sentiment analysis
Mutual information optimization
Neural plasticity

Cite this

@article{a8bb1bb894ce40c0845c81c96f64f1f0,

title = "{BC-PMJRS}: A Brain Computing-inspired Predefined Multimodal Joint Representation Spaces for enhanced cross-modal learning",

abstract = "Multimodal learning faces two key challenges: effectively fusing complex information from different modalities, and designing efficient mechanisms for cross-modal interactions. Inspired by neural plasticity and information processing principles in the human brain, this paper proposes BC-PMJRS, a Brain Computing-inspired Predefined Multimodal Joint Representation Spaces method to enhance cross-modal learning. The method learns the joint representation space through two complementary optimization objectives: (1) minimizing mutual information between representations of different modalities to reduce redundancy and (2) maximizing mutual information between joint representations and sentiment labels to improve task-specific discrimination. These objectives are balanced dynamically using an adaptive optimization strategy inspired by long-term potentiation (LTP) and long-term depression (LTD) mechanisms. Furthermore, we significantly reduce the computational complexity of modal interactions by leveraging a global–local cross-modal interaction mechanism, analogous to selective attention in the brain. Experimental results on the IEMOCAP, MOSI, and MOSEI datasets demonstrate that BC-PMJRS outperforms state-of-the-art models in both complete and incomplete modality settings, achieving up to a 1.9% improvement in weighted-F1 on IEMOCAP, a 2.8% gain in 7-class accuracy on MOSI, and a 2.9% increase in 7-class accuracy on MOSEI. These substantial improvements across multiple datasets demonstrate that incorporating brain-inspired mechanisms, particularly the dynamic balance of information redundancy and task relevance through neural plasticity principles, effectively enhances multimodal learning. This work bridges neuroscience principles with multimodal machine learning, offering new insights for developing more effective and biologically plausible models.",

keywords = "Brain-inspired computing, Global–local cross-modal interaction, Joint representation learning, Multimodal sentiment analysis, Mutual information optimization, Neural plasticity",

author = "Jiahao Qin and Feng Liu and Lu Zong",

note = "Publisher Copyright: {\textcopyright} 2025 Elsevier Ltd",

year = "2025",

month = aug,

doi = "10.1016/j.neunet.2025.107449",

language = "English",

volume = "188",

journal = "Neural Networks",

issn = "0893-6080",

}

TY - JOUR

T1 - BC-PMJRS

T2 - A Brain Computing-inspired Predefined Multimodal Joint Representation Spaces for enhanced cross-modal learning

AU - Qin, Jiahao

AU - Liu, Feng

AU - Zong, Lu

PY - 2025/8

Y1 - 2025/8

N2 - Multimodal learning faces two key challenges: effectively fusing complex information from different modalities, and designing efficient mechanisms for cross-modal interactions. Inspired by neural plasticity and information processing principles in the human brain, this paper proposes BC-PMJRS, a Brain Computing-inspired Predefined Multimodal Joint Representation Spaces method to enhance cross-modal learning. The method learns the joint representation space through two complementary optimization objectives: (1) minimizing mutual information between representations of different modalities to reduce redundancy and (2) maximizing mutual information between joint representations and sentiment labels to improve task-specific discrimination. These objectives are balanced dynamically using an adaptive optimization strategy inspired by long-term potentiation (LTP) and long-term depression (LTD) mechanisms. Furthermore, we significantly reduce the computational complexity of modal interactions by leveraging a global–local cross-modal interaction mechanism, analogous to selective attention in the brain. Experimental results on the IEMOCAP, MOSI, and MOSEI datasets demonstrate that BC-PMJRS outperforms state-of-the-art models in both complete and incomplete modality settings, achieving up to a 1.9% improvement in weighted-F1 on IEMOCAP, a 2.8% gain in 7-class accuracy on MOSI, and a 2.9% increase in 7-class accuracy on MOSEI. These substantial improvements across multiple datasets demonstrate that incorporating brain-inspired mechanisms, particularly the dynamic balance of information redundancy and task relevance through neural plasticity principles, effectively enhances multimodal learning. This work bridges neuroscience principles with multimodal machine learning, offering new insights for developing more effective and biologically plausible models.

KW - Brain-inspired computing

KW - Global–local cross-modal interaction

KW - Joint representation learning

KW - Multimodal sentiment analysis

KW - Mutual information optimization

KW - Neural plasticity

UR - http://www.scopus.com/inward/record.url?scp=105002338191&partnerID=8YFLogxK

U2 - 10.1016/j.neunet.2025.107449

DO - 10.1016/j.neunet.2025.107449

M3 - Article

C2 - 40222152

AN - SCOPUS:105002338191

SN - 0893-6080

VL - 188

JO - Neural Networks

JF - Neural Networks

M1 - 107449

ER -
