TY - JOUR
T1 - Multiobjective feature selection for microarray data via distributed parallel algorithms
AU - Cao, Bin
AU - Zhao, Jianwei
AU - Yang, Po
AU - Yang, Peng
AU - Liu, Xin
AU - Qi, Jun
AU - Simpson, Andrew
AU - Elhoseny, Mohamed
AU - Mehmood, Irfan
AU - Muhammad, Khan
N1 - Publisher Copyright: © 2019
PY - 2019/11
Y1 - 2019/11
N2 - Many real-world problems are large in scale and hence difficult to address. Due to the large number of features in microarray datasets, feature selection and classification are even more challenging for such datasets. Not all of these numerous features contribute to the classification task, and some even impede performance. Through feature selection, a feature subset that contains only a small quantity of essential features can be generated to increase the classification accuracy and significantly reduce the time consumption. In this paper, we construct a multiobjective feature selection model that simultaneously considers the classification error, the feature number and the feature redundancy. For this model, we propose several distributed parallel algorithms based on different encodings and an adaptive strategy. Additionally, to reduce the time consumption, various tactics are employed, including a feature number constraint, distributed parallelism and sample-wise parallelism. For a batch of microarray datasets, the proposed algorithms are superior to several state-of-the-art multiobjective evolutionary algorithms in terms of both effectiveness and efficiency.
AB - Many real-world problems are large in scale and hence difficult to address. Due to the large number of features in microarray datasets, feature selection and classification are even more challenging for such datasets. Not all of these numerous features contribute to the classification task, and some even impede performance. Through feature selection, a feature subset that contains only a small quantity of essential features can be generated to increase the classification accuracy and significantly reduce the time consumption. In this paper, we construct a multiobjective feature selection model that simultaneously considers the classification error, the feature number and the feature redundancy. For this model, we propose several distributed parallel algorithms based on different encodings and an adaptive strategy. Additionally, to reduce the time consumption, various tactics are employed, including a feature number constraint, distributed parallelism and sample-wise parallelism. For a batch of microarray datasets, the proposed algorithms are superior to several state-of-the-art multiobjective evolutionary algorithms in terms of both effectiveness and efficiency.
KW - Distributed parallelism
KW - Feature redundancy
KW - High dimension
KW - Microarray dataset
KW - Multiobjective feature selection
UR - http://www.scopus.com/inward/record.url?scp=85066756236&partnerID=8YFLogxK
U2 - 10.1016/j.future.2019.02.030
DO - 10.1016/j.future.2019.02.030
M3 - Article
AN - SCOPUS:85066756236
SN - 0167-739X
VL - 100
SP - 952
EP - 981
JO - Future Generation Computer Systems
JF - Future Generation Computer Systems
ER -