TY - GEN
T1 - Prediction of days in hospital for children using random forest
AU - Wang, Chenguang
AU - Dong, Xueling
AU - Yu, Limin
AU - Ye, Lishan
AU - Zhuang, Weifen
AU - Ma, Fei
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
N2 - In this study, a method was developed to predict the number of hospitalization days of infant patients. The random forest algorithm, together with a data set consisting of records extracted from a hospital information system, was used to develop a model that predicts the days in hospital. When a randomly selected half of the records was used as the training set and the other half as the test set, the random forest method achieved good predictive accuracy, with an RMSE of 0.314, an R2 of 0.706, an |R| of 0.545, and an Acc±1 of 71%, which is better than the results obtained by the AdaBoost and Bagging methods. Experiments on three groups of records (all records, records with 14 or fewer days in hospital, and records with more than 14 days in hospital) show that the developed method predicted better on the group with more than 14 days in hospital than on the other groups. An analysis of the importance of three different types of feature sets to prediction accuracy reveals that the feature set relating to personal information contributes more to the prediction than the other types of features.
AB - In this study, a method was developed to predict the number of hospitalization days of infant patients. The random forest algorithm, together with a data set consisting of records extracted from a hospital information system, was used to develop a model that predicts the days in hospital. When a randomly selected half of the records was used as the training set and the other half as the test set, the random forest method achieved good predictive accuracy, with an RMSE of 0.314, an R2 of 0.706, an |R| of 0.545, and an Acc±1 of 71%, which is better than the results obtained by the AdaBoost and Bagging methods. Experiments on three groups of records (all records, records with 14 or fewer days in hospital, and records with more than 14 days in hospital) show that the developed method predicted better on the group with more than 14 days in hospital than on the other groups. An analysis of the importance of three different types of feature sets to prediction accuracy reveals that the feature set relating to personal information contributes more to the prediction than the other types of features.
UR - http://www.scopus.com/inward/record.url?scp=85047466983&partnerID=8YFLogxK
U2 - 10.1109/CISP-BMEI.2017.8302287
DO - 10.1109/CISP-BMEI.2017.8302287
M3 - Conference Proceeding
AN - SCOPUS:85047466983
T3 - Proceedings - 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2017
SP - 1
EP - 6
BT - Proceedings - 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2017
A2 - Li, Qingli
A2 - Wang, Lipo
A2 - Zhou, Mei
A2 - Sun, Li
A2 - Qiu, Song
A2 - Liu, Hongying
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2017
Y2 - 14 October 2017 through 16 October 2017
ER -