Galaxy Zoo: Probabilistic Morphology through Bayesian CNNs and Active Learning. (arXiv:1905.07424v2 [astro-ph.GA] UPDATED)
<a href="http://arxiv.org/find/astro-ph/1/au:+Walmsley_M/0/1/0/all/0/1">Mike Walmsley</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Smith_L/0/1/0/all/0/1">Lewis Smith</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Lintott_C/0/1/0/all/0/1">Chris Lintott</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Gal_Y/0/1/0/all/0/1">Yarin Gal</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Bamford_S/0/1/0/all/0/1">Steven Bamford</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Dickinson_H/0/1/0/all/0/1">Hugh Dickinson</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Fortson_L/0/1/0/all/0/1">Lucy Fortson</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Kruk_S/0/1/0/all/0/1">Sandor Kruk</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Masters_K/0/1/0/all/0/1">Karen Masters</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Scarlata_C/0/1/0/all/0/1">Claudia Scarlata</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Simmons_B/0/1/0/all/0/1">Brooke Simmons</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Smethurst_R/0/1/0/all/0/1">Rebecca Smethurst</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Wright_D/0/1/0/all/0/1">Darryl Wright</a>

We use Bayesian convolutional neural networks and a novel generative model of
Galaxy Zoo volunteer responses to infer posteriors for the visual morphology of
galaxies. Bayesian CNNs can learn from galaxy images with uncertain labels and
then, for previously unlabelled galaxies, predict the probability of each
possible label. Our posteriors are well-calibrated (e.g. for predicting bars,
we achieve coverage errors of 11.8% within a vote fraction deviation of 0.2)
and hence are reliable for practical use. Further, using our posteriors, we
apply the active learning strategy BALD to request volunteer responses for the
subset of galaxies which, if labelled, would be most informative for training
our network. We show that training our Bayesian CNNs using active learning
requires up to 35-60% fewer labelled galaxies, depending on the morphological
feature being classified. By combining human and machine intelligence, Galaxy
Zoo will be able to classify surveys of any conceivable scale on a timescale of
weeks, providing massive and detailed morphology catalogues to support research
into galaxy evolution.
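
The BALD strategy mentioned above scores each unlabelled galaxy by the mutual information between its predicted label and the model parameters, approximated as the entropy of the mean prediction minus the mean entropy of the individual predictions. The following is a minimal sketch of that score for a single binary question (for example, bar or no bar), assuming the Bayesian CNN's posterior is approximated by several stochastic forward passes. It is not the authors' implementation: the function names, the array layout, and the toy probabilities are all hypothetical, and the paper's own model works with volunteer vote counts rather than a plain binary label.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli variable with success probability p."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def bald_scores(mc_probs):
    """Approximate BALD mutual information per galaxy.

    mc_probs: array of shape (n_passes, n_galaxies); entry [t, i] is the
    probability of the positive label for galaxy i from stochastic pass t
    (hypothetical layout, chosen for this sketch).
    """
    mc_probs = np.asarray(mc_probs, dtype=float)
    # Entropy of the averaged prediction: total predictive uncertainty.
    predictive_entropy = binary_entropy(mc_probs.mean(axis=0))
    # Average entropy of each pass: uncertainty that more labels cannot remove.
    expected_entropy = binary_entropy(mc_probs).mean(axis=0)
    return predictive_entropy - expected_entropy


def select_for_labelling(mc_probs, n_to_label):
    """Indices of the n_to_label galaxies with the highest BALD score."""
    scores = bald_scores(mc_probs)
    return np.argsort(scores)[::-1][:n_to_label]


# Toy data: 4 stochastic passes over 3 galaxies.
example_probs = np.array([
    [0.9, 0.5, 0.1],
    [0.9, 0.5, 0.9],
    [0.9, 0.5, 0.1],
    [0.9, 0.5, 0.9],
])
example_scores = bald_scores(example_probs)
example_choice = select_for_labelling(example_probs, n_to_label=1)
```

In the toy data, the first galaxy is confidently predicted and the second is uncertain but every pass agrees, so both score approximately zero. Only the third galaxy, where the passes disagree with each other, receives a high score and would be sent to volunteers first.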