Machine Learning Applied to Star-Galaxy-QSO Classification and Stellar Effective Temperature Regression. (arXiv:1811.03740v1 [astro-ph.GA])
<a href="http://arxiv.org/find/astro-ph/1/au:+Bai_Y/0/1/0/all/0/1">Yu Bai</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Liu_J/0/1/0/all/0/1">JiFeng Liu</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Wang_S/0/1/0/all/0/1">Song Wang</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Yang_F/0/1/0/all/0/1">Fan Yang</a>

In modern astrophysics, machine learning has become increasingly popular
because of its powerful ability to make predictions or calculated suggestions
for large amounts of data. We describe an application of a supervised
machine-learning algorithm, random forests (RF), to star/galaxy/QSO
classification and stellar effective temperature regression, based on the
combination of LAMOST and SDSS spectroscopic data. This combination enables us
to obtain reliable predictions with one of the largest training samples ever
used. The training samples are built from a nine-color data set of about three
million objects for the classification and a seven-color data set of over one
million stars for the regression. The performance of the classification and
regression is examined with validation and blind tests on objects in the RAVE,
6dFGS, UVQS and APOGEE surveys. We demonstrate that RF is an effective
algorithm, with classification accuracies higher than 99% for stars and
galaxies, and higher than 94% for QSOs. These accuracies are higher than the
machine-learning results of previous studies. The total standard deviations of
the regression are smaller than 200 K, similar to those of some spectrum-based
methods. The machine-learning algorithm with broad-band photometry provides a
more efficient approach to dealing with massive amounts of astrophysical data
than traditional color cuts and SED fitting.
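
The two figures of merit quoted above, a per-class classification accuracy and a standard deviation of the temperature regression, can be illustrated with a minimal sketch. It rests on stated assumptions rather than on the authors' pipeline: the function names, the label strings, and the toy values are hypothetical, "accuracy" is taken to mean the fraction of objects of each true class that receive the correct label, and the scatter is taken as the population standard deviation of the predicted-minus-reference temperatures. The paper may define either quantity differently.

```python
from collections import defaultdict
from statistics import pstdev


def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of objects of each true class that were labelled correctly.

    Assumption: this is one plausible reading of "classification accuracy";
    it is not taken from the paper.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {cls: correct[cls] / total[cls] for cls in total}


def residual_scatter(reference_teff, predicted_teff):
    """Population standard deviation of (predicted - reference), in kelvin."""
    residuals = [p - r for r, p in zip(reference_teff, predicted_teff)]
    return pstdev(residuals)


# Made-up toy data, for illustration only (not survey results).
true_labels = ["STAR", "STAR", "GALAXY", "QSO", "QSO", "GALAXY"]
predicted_labels = ["STAR", "STAR", "GALAXY", "QSO", "STAR", "GALAXY"]
accuracy = per_class_accuracy(true_labels, predicted_labels)

reference_teff = [5800.0, 4500.0, 6200.0, 5000.0]
predicted_teff = [5900.0, 4400.0, 6300.0, 4900.0]
scatter = residual_scatter(reference_teff, predicted_teff)

print(accuracy)
print(scatter)
```

In a blind test of the kind described, the reference labels and temperatures would come from an independent spectroscopic survey rather than from the training sample.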
