Multimodal contrastive learning for radiology report generation

Xing Wu*, Jingwen Li, Jianjia Wang, Quan Qian

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

7 Citations (Scopus)

Abstract

Automated radiology report generation can both lighten the workload of clinicians and improve the efficiency of disease diagnosis. However, generating radiology reports that are semantically coherent and also highly consistent with the medical images is challenging. To address this challenge, we propose a Multimodal Recursive model with Contrastive Learning (MRCL). MRCL incorporates both visual and semantic features to generate the “Impression” and “Findings” sections of radiology reports through a recursive network, and uses a contrastive pre-training method to improve the expressiveness of both the visual and the textual representations. Extensive experiments and analyses demonstrate the efficacy of MRCL, which generates semantically coherent radiology reports and outperforms state-of-the-art methods.
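The abstract does not say which contrastive objective the pre-training uses. As an illustration only, the sketch below shows one common way to align paired visual and textual representations: a symmetric InfoNCE loss over a batch of image and report embeddings. The function name `symmetric_info_nce`, the `temperature` value, and the assumption that row i of each matrix comes from the same study are all hypothetical and are not taken from the paper.

```python
import numpy as np


def symmetric_info_nce(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    Assumption (not from the paper): row i of image_emb and row i of
    text_emb describe the same study, so the matching pairs lie on the
    diagonal of the similarity matrix.
    """
    # L2-normalise so that dot products are cosine similarities.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarities scaled by the temperature, shape (batch, batch).
    logits = img @ txt.T / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        shifted = scores - scores.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        # The correct "class" for row i is column i.
        return -np.mean(np.diag(log_probs))

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = rng.normal(size=(8, 16))
    reports = images + 0.1 * rng.normal(size=(8, 16))  # roughly aligned pairs
    print("aligned loss:   ", symmetric_info_nce(images, reports))
    print("misaligned loss:", symmetric_info_nce(images, np.roll(reports, 1, axis=0)))
```

Under these assumptions, the loss is low when each image embedding is closest to its own report embedding and rises when the pairing is shuffled, which is the property a contrastive pre-training stage relies on.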

Original language: English
Pages (from-to): 11185-11194
Number of pages: 10
Journal: Journal of Ambient Intelligence and Humanized Computing
Volume: 14
Issue number: 8
Publication status: Published - Aug 2023
Externally published: Yes

Keywords

  • Contrastive learning
  • Multimodal recursive model
  • Radiology report generation
  • Semantic representation
  • Visual representation
