TY - JOUR
T1 - Multimodal contrastive learning for radiology report generation
AU - Wu, Xing
AU - Li, Jingwen
AU - Wang, Jianjia
AU - Qian, Quan
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2023/8
Y1 - 2023/8
N2 - Automated radiology report generation can not only lighten the workload of clinicians but also improve the efficiency of disease diagnosis. However, it is a challenging task to generate semantically coherent radiology reports that are also highly consistent with medical images. To meet the challenge, we propose a Multimodal Recursive model with Contrastive Learning (MRCL). The proposed MRCL method incorporates both visual and semantic features to generate “Impression” and “Findings” of radiology reports through a recursive network, in which a contrastive pre-training method is proposed to improve the expressiveness of both visual and textual representations. Extensive experiments and analyses prove the efficacy of the proposed MRCL, which can not only generate semantically coherent radiology reports but also outperform state-of-the-art methods.
KW - Contrastive learning
KW - Multimodal recursive model
KW - Radiology report generation
KW - Semantic representation
KW - Visual representation
UR - http://www.scopus.com/inward/record.url?scp=85137807778&partnerID=8YFLogxK
U2 - 10.1007/s12652-022-04398-4
DO - 10.1007/s12652-022-04398-4
M3 - Article
AN - SCOPUS:85137807778
SN - 1868-5137
VL - 14
SP - 11185
EP - 11194
JO - Journal of Ambient Intelligence and Humanized Computing
JF - Journal of Ambient Intelligence and Humanized Computing
IS - 8
ER -