Merger or Not: Accounting for Human Biases in Identifying Galactic Merger Signatures. (arXiv:2106.15618v1 [astro-ph.GA])
<a href="http://arxiv.org/find/astro-ph/1/au:+Lambrides_E/0/1/0/all/0/1">Erini Lambrides</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Watts_D/0/1/0/all/0/1">Duncan J. Watts</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Chiaberge_M/0/1/0/all/0/1">Marco Chiaberge</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Tchernyshyov_K/0/1/0/all/0/1">Kirill Tchernyshyov</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Kirkpatrick_A/0/1/0/all/0/1">Allison Kirkpatrick</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Meyer_E/0/1/0/all/0/1">Eileen T. Meyer</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Heckman_T/0/1/0/all/0/1">Timothy Heckman</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Simons_R/0/1/0/all/0/1">Raymond Simons</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Amram_O/0/1/0/all/0/1">Oz Amram</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Hall_K/0/1/0/all/0/1">Kirsten R. Hall</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Long_A/0/1/0/all/0/1">Arianna Long</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Norman_C/0/1/0/all/0/1">Colin Norman</a>

Significant galaxy mergers throughout cosmic time play a fundamental role in
theories of galaxy evolution. The widespread usage of human classifiers to
visually assess whether galaxies are in merging systems remains a fundamental
component of many morphology studies. Studies that employ human classifiers
usually construct a control sample, and rely on the assumption that the bias
introduced by using humans will be evenly applied to all samples. In this work,
we test this assumption and develop methods to correct for it. Using the
standard binomial statistical methods employed in many morphology studies, we
find that the merger fraction, error, and the significance of the difference
between two samples are dependent on the intrinsic merger fraction of any given
sample. We propose a method of quantifying merger biases of individual human
classifiers and incorporate these biases into a full probabilistic model to
determine the merger fraction and the probability of an individual galaxy being
in a merger. Using 14 simulated human responses and accuracies, we are able to
correctly label a galaxy as "merger" or "isolated" to within 1% of the
truth. Using 14 real human responses on a set of realistic mock galaxy
simulation snapshots, our model is able to recover the pre-coalesced merger
fraction to within 10%. Our method can not only increase the accuracy of
studies probing the merger state of galaxies at cosmic noon, but can also be
used to construct more accurate training sets in machine learning studies that
use human-classified datasets.
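
As a rough illustration of why classifier bias does not cancel between samples, the sketch below treats each human classifier as having a fixed true-positive rate and false-positive rate. This is a simplified two-rate model assumed here for illustration, not the full probabilistic model of the paper; the function names and the numerical rates are hypothetical. Under this assumption the observed merger fraction depends on the intrinsic fraction, and the votes of several classifiers can be combined into a per-galaxy merger probability with Bayes' rule.

```python
import math


def observed_fraction(f_true, tpr, fpr):
    """Merger fraction a biased classifier would report.

    Assumes (for illustration only) that the classifier flags a true merger
    with probability `tpr` and an isolated galaxy with probability `fpr`.
    """
    return f_true * tpr + (1.0 - f_true) * fpr


def corrected_fraction(f_obs, tpr, fpr):
    """Invert the relation above to estimate the intrinsic merger fraction."""
    if tpr <= fpr:
        raise ValueError("classifier must be better than chance (tpr > fpr)")
    estimate = (f_obs - fpr) / (tpr - fpr)
    # Clip to the physically meaningful range [0, 1].
    return min(1.0, max(0.0, estimate))


def merger_posterior(votes, tprs, fprs, prior):
    """Probability that one galaxy is a merger given independent votes.

    votes: booleans, True if that classifier said "merger".
    tprs, fprs: per-classifier rates, each strictly between 0 and 1.
    prior: assumed intrinsic merger fraction, strictly between 0 and 1.
    """
    log_merger = math.log(prior)
    log_isolated = math.log(1.0 - prior)
    for vote, tpr, fpr in zip(votes, tprs, fprs):
        log_merger += math.log(tpr if vote else 1.0 - tpr)
        log_isolated += math.log(fpr if vote else 1.0 - fpr)
    # Normalise in log space to avoid underflow with many classifiers.
    peak = max(log_merger, log_isolated)
    p_merger = math.exp(log_merger - peak)
    p_isolated = math.exp(log_isolated - peak)
    return p_merger / (p_merger + p_isolated)
```

With these assumed rates, two samples with different intrinsic merger fractions are shifted by different amounts, so the bias is not removed by comparing against a control sample.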
