TY - JOUR
T1 - Hematoma Expansion Prediction Based on SMOTE And XGBoost Algorithm
AU - Li, Yan
AU - Du, Chaonan
AU - Ge, Sikai
AU - Zhang, Ruonan
AU - Shao, Yiming
AU - Chen, Keyu
AU - Li, Zhepeng
AU - Ma, Fei
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/6
Y1 - 2024/6
N2 - Hematoma expansion (HE) is a high risky symptom with high rate of occurrence for patients who have undergone spontaneous intracerebral hemorrhage (ICH) after a major accident or illness. Correct prediction of the occurrence of HE in advance is critical to help the doctors to determine the next step medical treatment. Most existing studies focus only on the occurrence of HE within 6 h after the occurrence of ICH, while in reality a considerable number of patients have HE after the first 6 h but within 24 h. In this study, based on the medical doctors recommendation, we focus on prediction of the occurrence of HE within 24 h, as well as the occurrence of HE every 6 h within 24 h. Based on the demographics and computer tomography (CT) image extraction information, we used the XGBoost method to predict the occurrence of HE within 24 h. In this study, to solve the issue of highly imbalanced data set, which is a frequent case in medical data analysis, we used the SMOTE algorithm for data augmentation. To evaluate our method, we used a data set consisting of 582 patients records, and compared the results of proposed method as well as few machine learning methods. Our experiments show that XGBoost achieved the best prediction performance on the balanced dataset processed by the SMOTE algorithm with an accuracy of 0.82 and F1-score of 0.82. Moreover, our proposed method predicts the occurrence of HE within 6, 12, 18 and 24 h at the accuracy of 0.89, 0.82, 0.87 and 0.94, indicating that the HE occurrence within 24 h can be predicted accurately by the proposed method.
AB - Hematoma expansion (HE) is a high risky symptom with high rate of occurrence for patients who have undergone spontaneous intracerebral hemorrhage (ICH) after a major accident or illness. Correct prediction of the occurrence of HE in advance is critical to help the doctors to determine the next step medical treatment. Most existing studies focus only on the occurrence of HE within 6 h after the occurrence of ICH, while in reality a considerable number of patients have HE after the first 6 h but within 24 h. In this study, based on the medical doctors recommendation, we focus on prediction of the occurrence of HE within 24 h, as well as the occurrence of HE every 6 h within 24 h. Based on the demographics and computer tomography (CT) image extraction information, we used the XGBoost method to predict the occurrence of HE within 24 h. In this study, to solve the issue of highly imbalanced data set, which is a frequent case in medical data analysis, we used the SMOTE algorithm for data augmentation. To evaluate our method, we used a data set consisting of 582 patients records, and compared the results of proposed method as well as few machine learning methods. Our experiments show that XGBoost achieved the best prediction performance on the balanced dataset processed by the SMOTE algorithm with an accuracy of 0.82 and F1-score of 0.82. Moreover, our proposed method predicts the occurrence of HE within 6, 12, 18 and 24 h at the accuracy of 0.89, 0.82, 0.87 and 0.94, indicating that the HE occurrence within 24 h can be predicted accurately by the proposed method.
KW - Hematoma expansion
KW - Machine learning prediction
KW - SMOTE
KW - Unbalanced dataset
KW - XGBoost
UR - http://www.scopus.com/inward/record.url?scp=85196354869&partnerID=8YFLogxK
U2 - 10.1186/s12911-024-02561-9
DO - 10.1186/s12911-024-02561-9
M3 - Article
C2 - 38898499
AN - SCOPUS:85196354869
SN - 1472-6947
VL - 24
JO - BMC Medical Informatics and Decision Making
JF - BMC Medical Informatics and Decision Making
IS - 1
M1 - 172
ER -