Diabetes Analysis with a Dataset Using Machine Learning

Victor Chang*, Saiteja Javvaji, Qianwen Ariel Xu, Karl Hall, Steven Guan

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Chapter › peer-review


Diabetes is a disease that affects the body's capacity to process blood glucose, commonly referred to as blood sugar. At the end of 2019, a new public health problem (COVID-19) emerged, and it has greatly harmed people with diabetes. Therefore, we intend to use data mining algorithms to help prevent deaths and improve quality of life through the prediction of diabetes. In this paper, four different algorithms are used to analyze the diabetes dataset from DAT260x Lab01: Logistic, Decision Tree Classifier, XGBoost and SVC. The models are evaluated to determine which algorithm is most effective. The paper first provides a brief overview of both the dataset and the work carried out on the subject. Next, the dataset and its features are discussed. In addition, the paper explains the four algorithms and the virtual environments used to identify the variables that have the largest impact on the raw data. The findings are obtained by evaluating the confusion matrix for each of the selected algorithms. Finally, the paper outlines the observations and conclusions drawn from the results.
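As a minimal sketch of the evaluation step mentioned in the abstract, the snippet below computes a binary confusion matrix and a few derived metrics for two sets of predictions. The labels, the model names (`model_a`, `model_b`) and the helper functions are hypothetical and purely illustrative; they are not the chapter's data, models or results.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels encoded as 0/1."""
    counts = Counter(zip(y_true, y_pred))
    return counts[(0, 0)], counts[(0, 1)], counts[(1, 0)], counts[(1, 1)]


def summarize(y_true, y_pred):
    """Derive accuracy, precision and recall from the confusion matrix."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical ground truth and predictions (assumption: 1 = diabetic, 0 = not).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "model_a": [1, 0, 1, 0, 0, 1, 1, 0],
    "model_b": [1, 0, 0, 0, 0, 0, 1, 0],
}

for name, y_pred in predictions.items():
    print(name, confusion_matrix(y_true, y_pred), summarize(y_true, y_pred))
```

In this toy example both sets of predictions reach the same accuracy but differ in precision and recall, which is one reason a confusion matrix can be more informative than accuracy alone when comparing classifiers.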

Original language: English
Title of host publication: Studies in Computational Intelligence
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 28
Publication status: Published - 2022

Publication series

Name: Studies in Computational Intelligence
ISSN (Print): 1860-949X
ISSN (Electronic): 1860-9503


  • Decision tree classifier
  • Logistic algorithm
  • Machine learning
  • SVC
  • Xgboost algorithm

