Galaxy Zoo: Probabilistic Morphology through Bayesian CNNs and Active Learning. (arXiv:1905.07424v1 [astro-ph.GA])
<a href="http://arxiv.org/find/astro-ph/1/au:+Walmsley_M/0/1/0/all/0/1">Mike Walmsley</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Smith_L/0/1/0/all/0/1">Lewis Smith</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Lintott_C/0/1/0/all/0/1">Chris Lintott</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Gal_Y/0/1/0/all/0/1">Yarin Gal</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Bamford_S/0/1/0/all/0/1">Steven Bamford</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Dickinson_H/0/1/0/all/0/1">Hugh Dickinson</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Fortson_L/0/1/0/all/0/1">Lucy Fortson</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Kruk_S/0/1/0/all/0/1">Sandor Kruk</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Masters_K/0/1/0/all/0/1">Karen Masters</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Scarlata_C/0/1/0/all/0/1">Claudia Scarlata</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Simmons_B/0/1/0/all/0/1">Brooke Simmons</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Smethurst_R/0/1/0/all/0/1">Rebecca Smethurst</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Wright_D/0/1/0/all/0/1">Darryl Wright</a>

We use Bayesian convolutional neural networks and a novel generative model of
Galaxy Zoo volunteer responses to infer posteriors for the visual morphology of
galaxies. Bayesian CNNs can learn from galaxy images with uncertain labels and
then, for previously unlabelled galaxies, predict the probability of each
possible label. Our posteriors are well-calibrated (e.g. for predicting bars,
we achieve coverage errors of 10.6% within 5 responses and 2.9% within 10
responses) and hence are reliable for practical use. Further, using our
posteriors, we apply the active learning strategy BALD to request volunteer
responses for the subset of galaxies which, if labelled, would be most
informative for training our network. We show that training our Bayesian CNNs
using active learning requires up to 35-60% fewer labelled galaxies, depending
on the morphological feature being classified. By combining human and machine
intelligence, Galaxy Zoo will be able to classify surveys of any conceivable
scale on a timescale of weeks, providing massive and detailed morphology
catalogues to support research into galaxy evolution.
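
As a rough illustration of the BALD acquisition step mentioned above, the sketch below scores each unlabelled galaxy by the mutual information between its predicted label and the model parameters, then picks the highest-scoring galaxies for volunteers to label. It makes several simplifying assumptions that are not taken from the abstract: each galaxy has a single binary label (e.g. bar / no bar) rather than a count of volunteer responses, the model's posterior is approximated by repeated stochastic forward passes (such as Monte Carlo dropout), and the names `bald_scores`, `select_for_labelling` and `mc_probs` are hypothetical.

```python
import numpy as np


def binary_entropy(p):
    """Entropy in nats of a Bernoulli variable with success probability p."""
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))


def bald_scores(mc_probs):
    """Mutual information between label and model parameters, per galaxy.

    mc_probs: array-like of shape (n_galaxies, n_forward_passes), where each
    entry is the predicted probability of the feature from one stochastic
    forward pass (assumed here to approximate one posterior sample).
    """
    mc_probs = np.asarray(mc_probs, dtype=float)
    # Entropy of the mean prediction: total predictive uncertainty.
    predictive_entropy = binary_entropy(mc_probs.mean(axis=1))
    # Mean entropy of individual predictions: uncertainty each sample expects.
    expected_entropy = binary_entropy(mc_probs).mean(axis=1)
    # The difference is large when the forward passes disagree confidently.
    return predictive_entropy - expected_entropy


def select_for_labelling(mc_probs, n_to_label):
    """Indices of the n_to_label galaxies with the highest BALD score."""
    scores = bald_scores(mc_probs)
    return np.argsort(scores)[::-1][:n_to_label]


if __name__ == "__main__":
    example = [
        [0.50, 0.50, 0.50],  # passes agree the galaxy is ambiguous
        [0.05, 0.95, 0.50],  # passes disagree with each other
        [0.90, 0.90, 0.90],  # passes agree confidently
    ]
    print(bald_scores(example))
    print(select_for_labelling(example, 1))
```

In this toy input only the second galaxy has forward passes that disagree, so it receives the only clearly positive score; a galaxy that every pass agrees is ambiguous scores close to zero, because labelling it would not help distinguish between the sampled models.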

