TY - JOUR
T1 - Empowering multi-class medical data classification by Group-of-Single-Class-predictors and transfer optimization
T2 - Cases of structured dataset by machine learning and radiological images by deep learning
AU - Li, Tengyue
AU - Fong, Simon
AU - Mohammed, Sabah
AU - Fiaidhi, Jinan
AU - Guan, Steven
AU - Chang, Victor
N1 - Publisher Copyright: © 2022 Elsevier B.V.
PY - 2022/8
Y1 - 2022/8
N2 - In the medical domain, data are often collected over time, evolving from simple to refined categories. The structure underlying today's complex medical datasets can be decomposed into the cruder forms it had when data collection started. For instance, a cancer dataset in its simplest, and perhaps earliest, form is labeled either benign or malignant. As medical knowledge advances and/or more data become available, the dataset progresses from binary to multi-class as labels for sub-categories of the disease are added. In machine learning, inducing a multi-class model requires more computational power than inducing a binary one. Multi-class models are optimized for the highest possible accuracy, which is necessary for life-and-death decision making, but this optimization takes an extremely long training time. In this paper, a novel strategy called Group-of-Single-Class prediction (GOSC), coupled with majority voting and model transfer, is proposed for achieving maximum accuracy using only a fraction of the model training time. Its main advantage is an optimized multi-class classification model with accuracy close to the absolute maximum, while saving up to 70% of the training time. The approach was tested in experiments with machine learning on liver dataset classification and with deep learning on COVID-19 lung CT images. Preliminary results suggest the feasibility of this new approach.
KW - Algorithm
KW - Classification model training
KW - Deep learning
KW - Machine learning
KW - Medical dataset
KW - Multi-class classification
KW - Parameter optimization
KW - Radiological images recognition
UR - http://www.scopus.com/inward/record.url?scp=85126595008&partnerID=8YFLogxK
U2 - 10.1016/j.future.2022.02.022
DO - 10.1016/j.future.2022.02.022
M3 - Article
AN - SCOPUS:85126595008
SN - 0167-739X
VL - 133
SP - 10
EP - 22
JO - Future Generation Computer Systems
JF - Future Generation Computer Systems
ER -