Decision tree application to classification problems with boosting algorithm

Long Zhao, Sanghyuk Lee, Seon Phil Jeong*

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

34 Citations (Scopus)

Abstract

A personal credit evaluation algorithm is proposed by the design of a decision tree with a boosting algorithm, and the classification is carried out. By comparison with the conventional decision tree algorithm, it is shown that the boosting algorithm acts to speed up the processing time. The Classification and Regression Tree (CART) algorithm with the boosting algorithm showed 90.95% accuracy, slightly higher than without boosting, 90.31%. To avoid overfitting of the model on the training set due to unreasonable data set division, we consider cross-validation and illustrate the results with simulation; hypermeters of the model have been applied and the model fitting effect is verified. The proposed decision tree model is fitted optimally with the help of a confusion matrix. In this paper, relevant evaluation indicators are also introduced to evaluate the performance of the proposed model. For the comparison with the conventional methods, accuracy rate, error rate, precision, recall, etc. are also illustrated; we comprehensively evaluate the model performance based on the model accuracy after the 10-fold cross-validation. The results show that the boosting algorithm improves the performance of the model in accuracy and precision when CART is applied, but the model fitting time takes much longer, around 2 min. With the obtained result, it is verified that the performance of the decision tree model is improved under the boosting algorithm. At the same time, we test the performance of the proposed verification model with model fitting, and it could be applied to the prediction model for customers’ decisions on subscription to the fixed deposit business.

Original languageEnglish
Article number1903
JournalElectronics (Switzerland)
Volume10
Issue number16
DOIs
Publication statusPublished - 2 Aug 2021

Keywords

  • Boosting algorithm
  • Cross-validation
  • Decision tree
  • Receiver operating curve

Cite this