TY - JOUR
T1 - Deep learning based ultrasound analysis facilitates precise distinction between parotid pleomorphic adenoma and Warthin tumor
AU - Liu, Xi Hui
AU - Miao, Yi Yi
AU - Qian, Lang
AU - Shi, Zhao Ting
AU - Wang, Yu
AU - Su, Jiong Long
AU - Chang, Cai
AU - Chen, Jia Ying
AU - Chen, Jian Gang
AU - Li, Jia Wei
N1 - Publisher Copyright: Copyright © 2024 Liu, Miao, Qian, Shi, Wang, Su, Chang, Chen, Chen and Li.
PY - 2024
Y1 - 2024
N2 - Background: Pleomorphic adenoma (PA), often with the benign-like imaging appearances similar to Warthin tumor (WT), however, is a potentially malignant tumor with a high recurrence rate. It is worse that pathological fine-needle aspiration cytology (FNAC) is difficult to distinguish PA and WT for inexperienced pathologists. This study employed deep learning (DL) technology, which effectively utilized ultrasound images, to provide a reliable approach for discriminating PA from WT. Methods: 488 surgically confirmed patients, including 266 with PA and 222 with WT, were enrolled in this study. Two experienced ultrasound physicians independently evaluated all images to differentiate between PA and WT. The diagnostic performance of preoperative FNAC was also evaluated. During the DL study, all ultrasound images were randomly divided into training (70%), validation (20%), and test (10%) sets. Furthermore, ultrasound images that could not be diagnosed by FNAC were also randomly allocated to training (60%), validation (20%), and test (20%) sets. Five DL models were developed to classify ultrasound images as PA or WT. The robustness of these models was assessed using five-fold cross-validation. The Gradient-weighted Class Activation Mapping (Grad-CAM) technique was employed to visualize the region of interest in the DL models. Results: In Grad-CAM analysis, the DL models accurately identified the mass as the region of interest. The area under the receiver operating characteristic curve (AUROC) of the two ultrasound physicians were 0.351 and 0.598, and FNAC achieved an AUROC of only 0.721. Meanwhile, for DL models, the AUROC value for discriminating between PA and WT in the test set was from 0.828 to 0.908. ResNet50 demonstrated the optimal performance with an AUROC of 0.908, an accuracy of 0.833, a sensitivity of 0.736, and a specificity of 0.904. In the test set of cases that FNAC failed to provide a diagnosis, DenseNet121 demonstrated the optimal performance with an AUROC of 0.897, an accuracy of 0.806, a sensitivity of 0.789, and a specificity of 0.824. Conclusion: For the discrimination of PA and WT, DL models are superior to ultrasound and FNAC, thereby facilitating surgeons in making informed decisions regarding the most appropriate surgical approach.
AB - Background: Pleomorphic adenoma (PA), often with the benign-like imaging appearances similar to Warthin tumor (WT), however, is a potentially malignant tumor with a high recurrence rate. It is worse that pathological fine-needle aspiration cytology (FNAC) is difficult to distinguish PA and WT for inexperienced pathologists. This study employed deep learning (DL) technology, which effectively utilized ultrasound images, to provide a reliable approach for discriminating PA from WT. Methods: 488 surgically confirmed patients, including 266 with PA and 222 with WT, were enrolled in this study. Two experienced ultrasound physicians independently evaluated all images to differentiate between PA and WT. The diagnostic performance of preoperative FNAC was also evaluated. During the DL study, all ultrasound images were randomly divided into training (70%), validation (20%), and test (10%) sets. Furthermore, ultrasound images that could not be diagnosed by FNAC were also randomly allocated to training (60%), validation (20%), and test (20%) sets. Five DL models were developed to classify ultrasound images as PA or WT. The robustness of these models was assessed using five-fold cross-validation. The Gradient-weighted Class Activation Mapping (Grad-CAM) technique was employed to visualize the region of interest in the DL models. Results: In Grad-CAM analysis, the DL models accurately identified the mass as the region of interest. The area under the receiver operating characteristic curve (AUROC) of the two ultrasound physicians were 0.351 and 0.598, and FNAC achieved an AUROC of only 0.721. Meanwhile, for DL models, the AUROC value for discriminating between PA and WT in the test set was from 0.828 to 0.908. ResNet50 demonstrated the optimal performance with an AUROC of 0.908, an accuracy of 0.833, a sensitivity of 0.736, and a specificity of 0.904. In the test set of cases that FNAC failed to provide a diagnosis, DenseNet121 demonstrated the optimal performance with an AUROC of 0.897, an accuracy of 0.806, a sensitivity of 0.789, and a specificity of 0.824. Conclusion: For the discrimination of PA and WT, DL models are superior to ultrasound and FNAC, thereby facilitating surgeons in making informed decisions regarding the most appropriate surgical approach.
KW - Warthin tumor
KW - deep learning
KW - diagnosis
KW - pleomorphic adenoma
KW - ultrasound
UR - http://www.scopus.com/inward/record.url?scp=85187195178&partnerID=8YFLogxK
U2 - 10.3389/fonc.2024.1337631
DO - 10.3389/fonc.2024.1337631
M3 - Article
AN - SCOPUS:85187195178
SN - 2234-943X
VL - 14
JO - Frontiers in Oncology
JF - Frontiers in Oncology
M1 - 1337631
ER -