TY - JOUR
T1 - Vision transformer promotes cancer diagnosis
T2 - A comprehensive review
AU - Jiang, Xiaoyan
AU - Wang, Shuihua
AU - Zhang, Yudong
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2024/10/15
Y1 - 2024/10/15
AB - Background: Approaches based on vision transformers (ViTs) are advancing medical artificial intelligence (AI) and cancer diagnosis, and many researchers have recently developed ViT-based AI methods for cancer diagnosis. In this paper, 98 pertinent articles published since 2020 were selected from digital databases, including Google Scholar, Elsevier, and Springer Link, to review the research progress of ViT-based AI methods for cancer imaging. Method: The basic structure of ViT is introduced, and its modules, namely patch embedding, positional embedding, the transformer encoder, multi-head self-attention (MSA), layer normalization (LN), residual connections, and the multilayer perceptron (MLP), are elaborated. A comprehensive review of improved ViT models in the medical field is presented, and the application of ViT to cancer analysis based on medical images is reviewed. Results: ViT has achieved great success in cancer diagnosis based on medical images, showing advantages in image classification, reconstruction, detection, segmentation, registration, fusion, and other tasks. Among these, the most common tasks are cancer image classification and segmentation. There is still considerable room for improvement in multi-task learning, multi-modal learning, model generality, generalization ability, and explainability, and ViT also faces a trade-off between model scale and performance. Conclusion: ViT models for cancer diagnosis still have potential for improvement. ViT models trained with self-supervised and semi-supervised learning are a promising research direction. Lightweight attention module design, ViTs based on mobile networks, and the development of 3DViT will make cancer diagnosis based on medical images more accurate and efficient.
KW - Breast cancer
KW - Cancer diagnosis
KW - Colorectal cancer
KW - Lung cancer
KW - Prostate cancer
KW - Vision transformer
UR - http://www.scopus.com/inward/record.url?scp=85191815479&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2024.124113
DO - 10.1016/j.eswa.2024.124113
M3 - Review article
AN - SCOPUS:85191815479
SN - 0957-4174
VL - 252
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 124113
ER -