TY - JOUR
T1 - scASGC
T2 - An adaptive simplified graph convolution model for clustering single-cell RNA-seq data
AU - Wang, Shudong
AU - Zhang, Yu
AU - Zhang, Yulin
AU - Wu, Wenhao
AU - Ye, Lan
AU - Li, Yun Yin
AU - Su, Jionglong
AU - Pang, Shanchen
N1 - Funding Information:
This work was supported by the National Key Research and Development Project of China (2021YFA1000102, 2021YFA1000103), Shandong Province Nature Science Foundation (ZR2020MH208, ZR2021MH104), the Young Taishan Scholars Program (tsqn201909178) and Key Program Special Fund at Xi’an Jiaotong-Liverpool University (KSF-A-22).
Publisher Copyright:
© 2023 Elsevier Ltd
PY - 2023/9
Y1 - 2023/9
AB - Single-cell RNA sequencing (scRNA-seq) is now a successful technique for identifying cellular heterogeneity, revealing novel cell subpopulations, and forecasting developmental trajectories. A crucial component of scRNA-seq data processing is the precise identification of cell subpopulations. Although many unsupervised clustering methods have been developed to cluster cell subpopulations, their performance is vulnerable to dropouts and high dimensionality. In addition, most existing methods are time-consuming and fail to adequately account for potential associations between cells. In this paper, we present scASGC, an unsupervised clustering method based on an adaptive simplified graph convolution model. The proposed method builds plausible cell graphs, aggregates neighbor information using a simplified graph convolution model, and adaptively determines the optimal number of convolution layers for each graph. Experiments on 12 public datasets show that scASGC outperforms both classical and state-of-the-art clustering methods. In addition, in a study of mouse intestinal muscle containing 15,983 cells, we identified distinct marker genes based on the clustering results of scASGC. The source code of scASGC is available at https://github.com/ZzzOctopus/scASGC.
KW - Bioinformatics
KW - Clustering
KW - Computational biology
KW - Graph convolution
KW - Machine learning
KW - scRNA-seq
UR - http://www.scopus.com/inward/record.url?scp=85162911749&partnerID=8YFLogxK
U2 - 10.1016/j.compbiomed.2023.107152
DO - 10.1016/j.compbiomed.2023.107152
M3 - Article
C2 - 37364529
AN - SCOPUS:85162911749
SN - 0010-4825
VL - 163
JO - Computers in Biology and Medicine
JF - Computers in Biology and Medicine
M1 - 107152
ER -