TY - JOUR
T1 - Big Feature Data Analytics
T2 - Split and Combine Linear Discriminant Analysis (SC-LDA) for Integration Towards Decision Making Analytics
AU - Seng, Jasmine Kah Phooi
AU - Ang, Kenneth Li Minn
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2017/7/17
Y1 - 2017/7/17
AB - This paper introduces a novel big feature data analytics scheme for integration toward data analytics with decision making. In this scheme, a split and combine approach for a linear discriminant analysis (LDA) algorithm, termed SC-LDA, is proposed. SC-LDA replaces the full eigenvector decomposition of LDA with much cheaper eigenvector decompositions on smaller sub-matrices, and then recombines the intermediate results to obtain the same exact reconstruction as the original algorithm. The splitting or decomposition can be further applied recursively to obtain a multi-stage SC-LDA algorithm. The smaller sub-matrices can then be computed in parallel to reduce the time complexity for big data applications. The approach is discussed for an LDA algorithm variation (LDA/QR), which is suitable for the analytics of Big Feature data sets. The data vectors projected into the LDA subspace can then be integrated toward the decision-making process involving classification. Experiments are conducted on real-world data sets to confirm that our approach allows the LDA problem to be divided into size-reduced sub-problems that can be solved in parallel while giving the same exact reconstruction as the original LDA/QR.
KW - Big data
KW - classification
KW - computational complexity
KW - feature extraction
KW - linear discriminant analysis
UR - http://www.scopus.com/inward/record.url?scp=85028909453&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2017.2726543
DO - 10.1109/ACCESS.2017.2726543
M3 - Article
AN - SCOPUS:85028909453
SN - 2169-3536
VL - 5
SP - 14056
EP - 14065
JO - IEEE Access
JF - IEEE Access
M1 - 7982953
ER -