Probabilistic learning for pulsar classification. (arXiv:2205.05765v1 [astro-ph.IM])
Sambatra Andrianomena

In this work, we explore the possibility of using probabilistic learning to
identify pulsar candidates. We make use of a Deep Gaussian Process (DGP) and
Deep Kernel Learning (DKL). Trained on a balanced training set in order to
avoid the effect of class imbalance, the models achieve a relatively high
probability of differentiating the positive class from the negative one
(ROC-AUC $\sim 0.98$), which is very promising overall. We estimate the
predictive entropy of each model's predictions and find that DKL is more
confident than DGP in its predictions and provides better uncertainty
calibration. Upon investigating the effect of training with an imbalanced
dataset, results show that the performance of each model decreases as the
number of majority-class examples in the training set increases.
Interestingly, with the number of negative examples $10\times$ that of
positive examples, the models still provide reasonably well calibrated
uncertainty, i.e. an expected Uncertainty Calibration Error (UCE) of less
than $6\%$. We also show in this study how a convolutional neural network
based classifier trained via Bayesian Active Learning by Disagreement (BALD)
performs in the case of a relatively small training dataset. We find that,
with an optimized number of training examples, the model, being the most
confident in its predictions, generalizes relatively well and produces the
best uncertainty calibration, corresponding to UCE $= 3.118\%$.
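The abstract uses predictive entropy as its confidence measure. For a
classifier whose posterior is sampled by Monte Carlo (as with DGP/DKL draws
or MC-dropout forward passes), this is the entropy of the posterior-averaged
class probabilities. A minimal sketch in NumPy; the array shapes are an
assumption for illustration, not the paper's code:

    import numpy as np

    def predictive_entropy(probs):
        # probs: (n_mc, n_examples, n_classes) Monte Carlo class probabilities
        mean_probs = probs.mean(axis=0)   # posterior-averaged probabilities
        eps = 1e-12                       # guard against log(0)
        return -(mean_probs * np.log(mean_probs + eps)).sum(axis=1)

Lower entropy corresponds to the "more confident" behaviour the abstract
attributes to DKL.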

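The quoted UCE figures (below $6\%$, best $3.118\%$) are consistent with the
binned definition used in the uncertainty-calibration literature (e.g. Laves
et al. 2019), $\mathrm{UCE} = \sum_m \frac{|B_m|}{n}\,\left|\mathrm{err}(B_m)
- \mathrm{uncert}(B_m)\right|$, where predictions are binned by their
uncertainty. A hedged sketch, assuming uncertainties normalized to $[0, 1]$
and 0/1 error indicators; the paper may differ in binning details:

    import numpy as np

    def uncertainty_calibration_error(uncertainty, errors, n_bins=10):
        # uncertainty: per-example uncertainty in [0, 1]
        #              (e.g. predictive entropy divided by its maximum)
        # errors:      per-example 0/1 misclassification indicator
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        # digitize against interior edges puts u == 1.0 into the last bin
        bin_ids = np.digitize(uncertainty, edges[1:-1])
        uce = 0.0
        for b in range(n_bins):
            mask = bin_ids == b
            if mask.any():
                # |B_m|/n times the gap between mean error and mean uncertainty
                uce += mask.mean() * abs(errors[mask].mean()
                                         - uncertainty[mask].mean())
        return uce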
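BALD (Houlsby et al. 2011) scores unlabelled pool examples by the mutual
information between the predicted label and the model parameters,
$\mathbb{I}[y;\omega] = \mathbb{H}\!\left[\mathbb{E}_{\omega}\, p(y \mid x,
\omega)\right] - \mathbb{E}_{\omega}\,\mathbb{H}\!\left[p(y \mid x,
\omega)\right]$, and labels the highest-scoring ones. A sketch estimating the
score from Monte Carlo samples; the sampling mechanism (e.g. MC dropout over
the CNN) is an assumption:

    import numpy as np

    def bald_scores(probs):
        # probs: (n_mc, n_examples, n_classes) Monte Carlo class probabilities
        eps = 1e-12
        mean_probs = probs.mean(axis=0)
        # entropy of the mean prediction (total uncertainty)
        entropy_of_mean = -(mean_probs * np.log(mean_probs + eps)).sum(axis=1)
        # mean of per-sample entropies (aleatoric part)
        mean_of_entropy = -(probs * np.log(probs + eps)).sum(axis=2).mean(axis=0)
        return entropy_of_mean - mean_of_entropy  # high score = label next

An active-learning loop would label the top-$k$ pool examples by this score,
retrain, and repeat until the labelling budget is spent.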
