Optimising Automatic Morphological Classification of Galaxies with Machine Learning and Deep Learning using Dark Energy Survey Imaging. (arXiv:1908.03610v1 [astro-ph.GA])
<a href="http://arxiv.org/find/astro-ph/1/au:+Cheng_T/0/1/0/all/0/1">Ting-Yun Cheng</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Conselice_C/0/1/0/all/0/1">Christopher J. Conselice</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Aragon_Salamanca_A/0/1/0/all/0/1">Alfonso Arag&#xf3;n-Salamanca</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Li_N/0/1/0/all/0/1">Nan Li</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Bluck_A/0/1/0/all/0/1">Asa F. L. Bluck</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Hartley_W/0/1/0/all/0/1">Will G. Hartley</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Annis_J/0/1/0/all/0/1">James Annis</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Brooks_D/0/1/0/all/0/1">David Brooks</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Doel_P/0/1/0/all/0/1">Peter Doel</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Garcia_Bellido_J/0/1/0/all/0/1">Juan Garc&#xed;a-Bellido</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+James_D/0/1/0/all/0/1">David J. James</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Kuehn_K/0/1/0/all/0/1">Kyler Kuehn</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Kuropatkin_N/0/1/0/all/0/1">Nikolay Kuropatkin</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Smith_M/0/1/0/all/0/1">Mathew Smith</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Sobreira_F/0/1/0/all/0/1">Flavia Sobreira</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Tarle_G/0/1/0/all/0/1">Gregory Tarle</a>

There are several supervised machine learning methods used for the
application of automated morphological classification of galaxies; however,
there has not yet been a clear comparison of these different methods using
imaging data, or an investigation for maximising their effectiveness. We carry
out a comparison between seven common machine learning methods for galaxy
classification (Convolutional Neural Network (CNN), K-nearest neighbour,
Logistic Regression, Support Vector Machine, and Neural Networks) by using Dark
Energy Survey (DES) data combined with visual classifications from the Galaxy
Zoo 1 project (GZ1). Our goal is to determine the optimal machine learning
methods when using imaging data for galaxy classification. We show that CNN is
the most successful method of these seven methods in our study. Using a sample
of $\sim$2,800 galaxies with visual classification from GZ1, we reach an
accuracy of $\sim$0.99 for the morphological classification of Ellipticals and
Spirals. Further investigation of the galaxies that our CNN misclassified with
high predicted probabilities reveals incorrect classifications provided by GZ1,
and shows that the galaxies with a low probability of being either spirals or
ellipticals are visually Lenticulars (S0), demonstrating that supervised
learning is able to rediscover that this class of galaxy is distinct from both
Es and Spirals. We confirm that $\sim$2.5% of the galaxies in our study are
misclassified by GZ1. After correcting these galaxies’ labels, we improve our
CNN performance to an average accuracy of over 0.99 (an accuracy of 0.994 is
our best result).
