Vision transformer promotes cancer diagnosis: A comprehensive review

Xiaoyan Jiang*, Shuihua Wang, Yudong Zhang

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

Abstract

Background: Approaches based on vision transformers (ViTs) are advancing medical artificial intelligence (AI) and cancer diagnosis, and many researchers have recently developed ViT-based AI methods for cancer diagnosis. In this paper, 98 pertinent articles published since 2020 were carefully selected from digital databases, including Google Scholar, Elsevier, and Springer Link, to review the research progress of ViT-based AI methods for cancer imaging. Method: The basic structure of ViT is introduced, and its modules, namely patch embedding, positional embedding, the transformer encoder, multi-head self-attention (MSA), layer normalization (LN), residual connections, and the multilayer perceptron (MLP), are elaborated. A comprehensive review of improved ViT models in the medical field is presented, and the application of ViT technology to cancer analysis based on medical images is reviewed. Results: ViT has achieved great success in cancer diagnosis based on medical images, showing advantages in image classification, reconstruction, detection, segmentation, registration, fusion, and other tasks. Among these, the most common tasks are cancer image classification and segmentation. There is still considerable room for improvement in multi-task learning, multi-modal learning, model generality, generalization ability, and explainability, and ViT also faces a trade-off between model scale and performance. Conclusion: ViT models trained for cancer diagnosis still have room to improve. ViT models that use self-supervised and semi-supervised learning mechanisms are a promising research direction. Lightweight attention module design, ViTs based on mobile networks, and the development of 3DViT will make cancer diagnosis based on medical images more accurate and efficient.
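To make the modules named in the Method part of the abstract concrete, the following is a minimal NumPy sketch of a ViT forward pass: patch embedding, positional embedding, and a pre-norm encoder block combining MSA, LN, residual connections, and an MLP. It is an illustration under stated assumptions, not the architecture of any model covered by the review. All function names, parameter names, and sizes (`patchify`, `init_params`, a 64×64 input, 16×16 patches, embedding width 32) are hypothetical, the weights are random, and biases and the learned LN scale and shift are omitted for brevity.

```python
import numpy as np


def patchify(image, patch):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    gh, gw = h // patch, w // patch
    x = image[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(gh * gw, patch * patch * c)


def layer_norm(x, eps=1e-6):
    """LN over the feature axis (learned scale and shift omitted)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def gelu(x):
    """Tanh approximation of GELU."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def msa(x, w_qkv, w_out, heads):
    """Multi-head self-attention over a (tokens, dim) sequence."""
    n, d = x.shape
    dh = d // heads
    q, k, v = np.split(x @ w_qkv, 3, axis=-1)
    # Reshape each projection to (heads, tokens, head_dim).
    q, k, v = (t.reshape(n, heads, dh).transpose(1, 0, 2) for t in (q, k, v))
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))
    out = (attn @ v).transpose(1, 0, 2).reshape(n, d)
    return out @ w_out


def encoder_block(x, p, heads):
    """Pre-norm encoder block: LN -> MSA -> residual, LN -> MLP -> residual."""
    x = x + msa(layer_norm(x), p["w_qkv"], p["w_out"], heads)
    x = x + gelu(layer_norm(x) @ p["w1"]) @ p["w2"]
    return x


def vit_forward(image, params, patch=16, heads=4):
    tokens = patchify(image, patch) @ params["w_embed"]  # patch embedding
    tokens = np.vstack([params["cls"], tokens])          # prepend class token
    tokens = tokens + params["pos"]                      # positional embedding
    for block in params["blocks"]:
        tokens = encoder_block(tokens, block, heads)
    return layer_norm(tokens[0]) @ params["w_head"]      # classify from class token


def init_params(rng, image_shape=(64, 64, 3), patch=16, dim=32, depth=2, classes=2):
    """Random weights for illustration only (hypothetical sizes)."""
    h, w, c = image_shape
    n = (h // patch) * (w // patch)
    s = 0.02
    blocks = [
        {
            "w_qkv": rng.normal(0, s, (dim, 3 * dim)),
            "w_out": rng.normal(0, s, (dim, dim)),
            "w1": rng.normal(0, s, (dim, 4 * dim)),
            "w2": rng.normal(0, s, (4 * dim, dim)),
        }
        for _ in range(depth)
    ]
    return {
        "w_embed": rng.normal(0, s, (patch * patch * c, dim)),
        "cls": rng.normal(0, s, (1, dim)),
        "pos": rng.normal(0, s, (n + 1, dim)),
        "w_head": rng.normal(0, s, (dim, classes)),
        "blocks": blocks,
    }


rng = np.random.default_rng(0)
params = init_params(rng)
logits = vit_forward(rng.random((64, 64, 3)), params)
print(logits.shape)
```

With these assumed sizes, a 64×64×3 image yields 16 patch tokens plus one class token, and the head maps the class token to two logits (for example, a benign versus malignant decision). A trained model would learn all of these weights rather than sample them.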

Original language: English
Article number: 124113
Journal: Expert Systems with Applications
Volume: 252
Publication status: Published - 15 Oct 2024

Keywords

  • Breast cancer
  • Cancer diagnosis
  • Colorectal cancer
  • Lung cancer
  • Prostate cancer
  • Vision transformer
