pyUPMASK: an improved unsupervised clustering algorithm. (arXiv:2101.01660v1 [astro-ph.GA])

<a href="http://arxiv.org/find/astro-ph/1/au:+Pera_M/0/1/0/all/0/1">M. S. Pera</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Perren_G/0/1/0/all/0/1">G. I. Perren</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Moitinho_A/0/1/0/all/0/1">A. Moitinho</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Navone_H/0/1/0/all/0/1">H. D. Navone</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Vazquez_R/0/1/0/all/0/1">R. A. Vazquez</a>

Aims. We present pyUPMASK, an unsupervised clustering method for stellar
clusters that builds upon the original UPMASK package. Its general approach
makes it applicable to analyses that deal with binary classes of any kind,
as long as the fundamental hypotheses are met. The code is written
entirely in Python and is made available through a public repository.
Methods. The core of the algorithm follows the method developed in UPMASK but
introduces several key enhancements. These enhancements not only make pyUPMASK
more general, they also improve its performance considerably. Results. We
thoroughly tested the performance of pyUPMASK on 600 synthetic clusters,
affected by varying degrees of contamination by field stars. To assess the
performance we employed six different statistical metrics that measure the
accuracy of probabilistic classification. Conclusions. Our results show that
pyUPMASK outperforms UPMASK on every statistical performance metric, while
still managing to be many times faster.
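
As an illustration of what "metrics that measure the accuracy of probabilistic classification" can look like, the sketch below scores membership probabilities against known member/field labels, as would be available for synthetic clusters. The abstract does not name the six metrics used, so the Brier score and logarithmic loss shown here are assumptions chosen as common examples, and the toy labels and probabilities are hypothetical.

```python
import math


def brier_score(labels, probs):
    """Mean squared difference between each probability and its true label (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def log_loss(labels, probs, eps=1e-15):
    """Mean negative log-likelihood of the true labels; lower is better."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Hypothetical toy data: 1 = cluster member, 0 = field star.
true_members = [1, 1, 1, 0, 0, 0]
# Membership probabilities from two hypothetical methods.
probs_a = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]  # confident and mostly right
probs_b = [0.6, 0.5, 0.4, 0.5, 0.4, 0.6]  # close to uninformative

for name, probs in (("A", probs_a), ("B", probs_b)):
    print(name, brier_score(true_members, probs), log_loss(true_members, probs))
```

Both scores are lower for the method whose probabilities sit closer to the true labels, which is the sense in which one method can be compared against another over a set of synthetic clusters.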
