TY - JOUR
T1 - Learning Latent Features with Infinite Nonnegative Binary Matrix Trifactorization
AU - Yang, Xi
AU - Huang, Kaizhu
AU - Zhang, Rui
AU - Hussain, Amir
N1 - Funding Information:
This work is supported in part by the National Natural Science Foundation of China under Grant 61473236, in part by the Natural Science Fund for Colleges and Universities in Jiangsu Province under Grant 17KJD520010, in part by the Suzhou Science and Technology Program under Grant SYG201712 and Grant SZS201613, in part by the Jiangsu University Natural Science Research Programme under Grant 17KJB520041, in part by the Key Program Special Fund in XJTLU (KSF-A-01), and in part by the UK Engineering and Physical Sciences Research Council under Grant EP/M026981/1.
Publisher Copyright:
© 2017 IEEE.
PY - 2018/12
Y1 - 2018/12
N2 - Nonnegative matrix factorization (NMF) has been widely exploited in many computational intelligence and pattern recognition problems. In particular, it can be used to extract latent features from data. However, previous NMF models often assume a fixed number of features, which is normally tuned and searched using a trial-and-error approach. Learning binary features is also difficult, since the binary matrix poses a more challenging optimization problem. In this paper, we propose a new Bayesian model, termed the infinite nonnegative binary matrix trifactorization (iNBMT) model. Based on the Indian buffet process (IBP), it can automatically learn both the latent binary features and the number of features. It exploits a trifactorization process that decomposes the nonnegative matrix into a product of three components: two binary matrices and a nonnegative real matrix. In contrast to traditional bifactorization, trifactorization can better reveal latent structures among samples and features. Specifically, an IBP prior is imposed on the two infinite binary matrices, while a truncated Gaussian distribution is assumed on the weight matrix. To optimize the model, we develop a modified variational-Bayesian algorithm, with iteration complexity one order lower than the recently proposed maximization-expectation-IBP model [1] and the correlated IBP-IBP model [2]. A series of simulation experiments are carried out, both qualitatively and quantitatively, using benchmark feature extraction, reconstruction, and clustering tasks. Comparative results show that our proposed iNBMT model significantly outperforms state-of-the-art algorithms on a range of synthetic and real-world data. The new Bayesian model can thus serve as a benchmark technique for the computational intelligence research community.
AB - Nonnegative matrix factorization (NMF) has been widely exploited in many computational intelligence and pattern recognition problems. In particular, it can be used to extract latent features from data. However, previous NMF models often assume a fixed number of features, which is normally tuned and searched using a trial-and-error approach. Learning binary features is also difficult, since the binary matrix poses a more challenging optimization problem. In this paper, we propose a new Bayesian model, termed the infinite nonnegative binary matrix trifactorization (iNBMT) model. Based on the Indian buffet process (IBP), it can automatically learn both the latent binary features and the number of features. It exploits a trifactorization process that decomposes the nonnegative matrix into a product of three components: two binary matrices and a nonnegative real matrix. In contrast to traditional bifactorization, trifactorization can better reveal latent structures among samples and features. Specifically, an IBP prior is imposed on the two infinite binary matrices, while a truncated Gaussian distribution is assumed on the weight matrix. To optimize the model, we develop a modified variational-Bayesian algorithm, with iteration complexity one order lower than the recently proposed maximization-expectation-IBP model [1] and the correlated IBP-IBP model [2]. A series of simulation experiments are carried out, both qualitatively and quantitatively, using benchmark feature extraction, reconstruction, and clustering tasks. Comparative results show that our proposed iNBMT model significantly outperforms state-of-the-art algorithms on a range of synthetic and real-world data. The new Bayesian model can thus serve as a benchmark technique for the computational intelligence research community.
KW - Indian Buffet Process prior
KW - Infinite latent feature model
KW - Infinite non-negative binary matrix tri-factorization
UR - http://www.scopus.com/inward/record.url?scp=85082631617&partnerID=8YFLogxK
U2 - 10.1109/TETCI.2018.2806934
DO - 10.1109/TETCI.2018.2806934
M3 - Article
AN - SCOPUS:85082631617
SN - 2471-285X
VL - 2
SP - 450
EP - 463
JO - IEEE Transactions on Emerging Topics in Computational Intelligence
JF - IEEE Transactions on Emerging Topics in Computational Intelligence
IS - 6
M1 - 8320965
ER -