TY - JOUR
T1 - Quantitative analysis of breast cancer diagnosis using a probabilistic modelling approach
AU - Liu, Shuo
AU - Zeng, Jinshu
AU - Gong, Huizhou
AU - Yang, Hongqin
AU - Zhai, Jia
AU - Cao, Yi
AU - Liu, Junxiu
AU - Luo, Yuling
AU - Li, Yuhua
AU - Maguire, Liam
AU - Ding, Xuemei
N1 - Publisher Copyright: © 2017 Elsevier Ltd
PY - 2018/1/1
Y1 - 2018/1/1
AB - Background: Breast cancer is the most prevalent cancer in women in most countries of the world. Many computer-aided diagnostic methods have been proposed, but few studies have quantitatively examined the probabilistic dependencies among breast cancer data features or identified the contribution of each feature to breast cancer diagnosis. Methods: This study aims to fill this gap using a Bayesian network (BN) modelling approach. A K2 learning algorithm and statistical computation methods are used to construct the BN structure and assess the resulting BN model. The data comprise a clinical ultrasound dataset from a local Chinese hospital and a fine-needle aspiration cytology (FNAC) dataset from the UCI Machine Learning Repository. Results: For the ultrasound data, our study suggested that cell shape is the most significant feature for breast cancer diagnosis, and that the resistance index shows a strong probabilistic dependency on blood signals. For the FNAC data, bare nuclei are the most important feature for discriminating malignant from benign breast tumours, and uniformity of cell size and uniformity of cell shape are tightly interdependent. Contributions: The BN modelling approach can support clinicians in making diagnostic decisions based on the significant features identified by the model, especially when some other features are missing for specific patients. The approach is also applicable to other healthcare data analytics and to data modelling for disease diagnosis.
KW - Bayesian network
KW - Breast cancer diagnosis
KW - Clinical decision support
KW - Data modelling
KW - Diagnostic contribution
KW - Quantitative analysis
UR - http://www.scopus.com/inward/record.url?scp=85036476792&partnerID=8YFLogxK
U2 - 10.1016/j.compbiomed.2017.11.014
DO - 10.1016/j.compbiomed.2017.11.014
M3 - Article
C2 - 29202321
AN - SCOPUS:85036476792
SN - 0010-4825
VL - 92
SP - 168
EP - 175
JO - Computers in Biology and Medicine
JF - Computers in Biology and Medicine
ER -