Abstract
Type 2 diabetes is a global health problem and one of the leading causes of death. Recent findings indicate that the disease can be divided into several sub-clusters, which is a step towards precision medicine. In this paper, we analyze and compare two approaches to data reduction, namely clustering with and without principal component analysis (PCA), applied to standardized and normalized data. Data preparation was performed first. The clustering model was then developed and validated using the elbow method and silhouette width plots. Normalized data reduced to two principal components (PC = 2) gave the best cluster visualization, the lowest within-cluster sum of squares (WCSS) score (195.41), and the highest silhouette score (0.3491), compared with standardized data and standardized data with PC = 2, for which the reported values are 23518.82 (WCSS score) and 0.1976 (silhouette score). We conclude that integrating PCA with k-means clustering lowers the WCSS score and raises the silhouette score.
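The pipeline summarized in the abstract (rescale, reduce to two principal components, run k-means, then inspect WCSS and silhouette width over a range of k) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic, the four features and their group means are made up, and the helper names (`min_max_normalize`, `pca_project`, `kmeans`, `wcss`, `silhouette`) are hypothetical. It uses only the Python standard library, with PCA approximated by power iteration.

```python
import math
import random

def min_max_normalize(rows):
    """Rescale every column to [0, 1] (the 'normalized' variant)."""
    cols = list(zip(*rows))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(r, lo, hi)] for r in rows]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pca_project(rows, n_components, rng, iters=300):
    """Project onto the leading principal components (power iteration + deflation)."""
    n, d = len(rows), len(rows[0])
    means = [sum(c) / n for c in zip(*rows)]
    centered = [[v - m for v, m in zip(r, means)] for r in rows]
    cov = [[sum(x[i] * x[j] for x in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(n_components):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = [dot(row, v) for row in cov]
            norm = math.sqrt(dot(w, w))
            v = [a / norm for a in w]
        eigenvalue = dot(v, [dot(row, v) for row in cov])
        components.append(v)
        # Deflate: remove the variance explained by this component.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[dot(x, c) for c in components] for x in centered]

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]

def kmeans(points, k, rng, iters=100):
    """Plain Lloyd's algorithm with random initial centers; returns (labels, centers)."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            new_centers.append([sum(col) / len(members) for col in zip(*members)]
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers), centers

def wcss(points, labels, centers):
    """Within-cluster sum of squares: squared distance of each point to its center."""
    return sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))

def silhouette(points, labels):
    """Mean silhouette width; points in singleton clusters contribute 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1 or len(clusters) < 2:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != l)
        total += (b - a) / max(a, b)
    return total / len(points)

rng = random.Random(0)
# Synthetic stand-in data (NOT the paper's data): three groups, four made-up
# features on deliberately different scales.
group_means = [(5.5, 22.0, 40.0, 110.0), (7.5, 30.0, 55.0, 140.0),
               (9.5, 38.0, 65.0, 170.0)]
spreads = (0.4, 2.0, 5.0, 10.0)
data = [[rng.gauss(m, s) for m, s in zip(means, spreads)]
        for means in group_means for _ in range(30)]

projected = pca_project(min_max_normalize(data), n_components=2, rng=rng)
results = {}
for k in range(2, 7):
    labels, centers = kmeans(projected, k, rng)
    results[k] = (wcss(projected, labels, centers), silhouette(projected, labels))
    print(f"k={k}  WCSS={results[k][0]:.3f}  silhouette={results[k][1]:.3f}")
```

Plotting the WCSS column against k gives the elbow curve, and the silhouette column gives the silhouette width curve. One point worth keeping in mind when reading such numbers: WCSS is expressed in squared units of the input space, so it shrinks whenever features are rescaled to [0, 1] or dimensions are dropped, whereas the silhouette score is unaffected by a uniform rescaling and is therefore the easier of the two to compare across preprocessing variants.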
| Original language | English |
|---|---|
| Article number | 070001 |
| Journal | AIP Conference Proceedings |
| Volume | 3056 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - 7 Apr 2025 |
| Externally published | Yes |
| Event | 2022 Sustainable and Integrated Engineering International Conference, SIE 2022, Langkawi Island, Malaysia. Duration: 12 Dec 2022 → 13 Dec 2022 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs):
- SDG 3: Good Health and Well-being
Title: Integration of Principal Component Analysis and K-Means Clustering for Type 2 Diabetes Sub-clustering Model