Mass and Age determination of the LAMOST data with different Machine Learning methods. (arXiv:2205.06144v1 [astro-ph.GA])

<a href="http://arxiv.org/find/astro-ph/1/au:+Li_Q/0/1/0/all/0/1">Qi-Da Li</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Wang_H/0/1/0/all/0/1">Hai-Feng Wang</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Luo_Y/0/1/0/all/0/1">Yang-Ping Luo</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Li_Q/0/1/0/all/0/1">Qing Li</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Deng_L/0/1/0/all/0/1">Li-Cai Deng</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Ting_Y/0/1/0/all/0/1">Yuan-Sen Ting</a>

We present a catalog of 948,216 stars with mass labels and a catalog of
163,105 red clump (RC) stars with both mass and age labels. The training
dataset is obtained by cross-matching LAMOST DR5 with high-resolution
asteroseismology data, and mass and age are predicted with the random forest
method or the convex hull algorithm. We extract the stellar parameters that
correlate strongly with mass and age. On the test dataset, the median relative
error of the prediction model is 0.03 for the mass of the large sample, and
0.04 and 0.07 for the mass and age of the red clump stars, respectively. We
also compare the predicted ages of red clump stars with recent works and find
that the final uncertainty of the RC sample could reach 18% for age and 9% for
mass, while the final precision of the mass for the large sample, which
contains different types of stars, could reach 13% without considering
systematics. These results imply that this method could be widely used in the
future. Moreover, we explore the performance of different machine learning
methods on our sample, including Bayesian linear regression (BYS), gradient
boosting decision tree (GBDT), multilayer perceptron (MLP), multiple linear
regression (MLR), random forest (RF), and support vector regression (SVR). We
find that the nonlinear models generally perform better than the linear
models, and that the GBDT and RF methods perform relatively better.
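
The abstract reports model quality as a median relative error. As a minimal sketch of how such a figure could be computed, the snippet below assumes the usual definition, the median of |predicted - true| / true over a test set; the abstract does not state the exact formula, so this definition is an assumption. The function name `median_relative_error` and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import median


def median_relative_error(true_values, predicted_values):
    """Median of |predicted - true| / |true| over paired samples.

    Assumed definition; the paper's exact formula is not given in the abstract.
    True values must be nonzero (stellar masses and ages are positive).
    """
    if len(true_values) != len(predicted_values):
        raise ValueError("inputs must have the same length")
    if not true_values:
        raise ValueError("inputs must be non-empty")
    return median(
        abs(p - t) / abs(t) for t, p in zip(true_values, predicted_values)
    )


# Hypothetical masses in solar units, made up for illustration only.
true_mass = [1.0, 1.2, 2.0, 0.8, 1.5]
pred_mass = [1.1, 1.2, 1.9, 0.84, 1.2]

mre = median_relative_error(true_mass, pred_mass)
print(f"median relative error: {mre:.3f}")
```

Because the median ignores the size of the largest residuals, a single badly predicted star (the last entry here, off by 20%) does not move the result, which is one reason a median is often preferred over a mean when the error distribution has outliers.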

