TY - JOUR
T1 - Cluster membership analysis with supervised learning and N-body simulations
AU - Bissekenov, A.
AU - Kalambay, M.
AU - Abdikamalov, E.
AU - Pang, X.
AU - Berczik, P.
AU - Shukirgaliyev, B.
N1 - Publisher Copyright: © The Authors 2024.
PY - 2024/9/1
Y1 - 2024/9/1
N2 - Context. Membership analysis is an important tool for studying star clusters. There are various approaches to membership determination, including supervised and unsupervised machine-learning (ML) methods. Aims. We perform membership analysis using the supervised ML approach. Methods. We trained and tested our ML models on two sets of star cluster data: snapshots from N-body simulations, and 21 different clusters from the Gaia Data Release 3 data. Results. We explored five different ML models: random forest (RF), decision trees, support vector machines, feed-forward neural networks, and K-nearest neighbors. We find that all models produce similar results, and the accuracy of RF is slightly better. We find that a balance of classes in the datasets is optional for successful learning. The classification accuracy strongly depends on the astrometric parameters. The addition of photometric parameters does not improve the performance. We find no strong correlation between the classification accuracy and the cluster age, mass, and half-mass radius. At the same time, models trained on clusters with a larger number of members generally produce better results.
AB - Context. Membership analysis is an important tool for studying star clusters. There are various approaches to membership determination, including supervised and unsupervised machine-learning (ML) methods. Aims. We perform membership analysis using the supervised ML approach. Methods. We trained and tested our ML models on two sets of star cluster data: snapshots from N-body simulations, and 21 different clusters from the Gaia Data Release 3 data. Results. We explored five different ML models: random forest (RF), decision trees, support vector machines, feed-forward neural networks, and K-nearest neighbors. We find that all models produce similar results, and the accuracy of RF is slightly better. We find that a balance of classes in the datasets is optional for successful learning. The classification accuracy strongly depends on the astrometric parameters. The addition of photometric parameters does not improve the performance. We find no strong correlation between the classification accuracy and the cluster age, mass, and half-mass radius. At the same time, models trained on clusters with a larger number of members generally produce better results.
KW - Galaxy: kinematics and dynamics
KW - methods: data analysis
KW - methods: numerical
KW - open clusters and associations: general
KW - solar neighborhood
UR - http://www.scopus.com/inward/record.url?scp=85204924074&partnerID=8YFLogxK
U2 - 10.1051/0004-6361/202449791
DO - 10.1051/0004-6361/202449791
M3 - Article
AN - SCOPUS:85204924074
SN - 0004-6361
VL - 689
JO - Astronomy and Astrophysics
JF - Astronomy and Astrophysics
M1 - A282
ER -