Catalog of quasars from the Kilo-Degree Survey Data Release 3. (arXiv:1812.03084v1 [astro-ph.IM])
<a href="http://arxiv.org/find/astro-ph/1/au:+Nakoneczny_S/0/1/0/all/0/1">S. Nakoneczny</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Bilicki_M/0/1/0/all/0/1">M. Bilicki</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Solarz_A/0/1/0/all/0/1">A. Solarz</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Pollo_A/0/1/0/all/0/1">A. Pollo</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Maddox_N/0/1/0/all/0/1">N. Maddox</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Spiniello_C/0/1/0/all/0/1">C. Spiniello</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Brescia_M/0/1/0/all/0/1">M. Brescia</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Napolitano_N/0/1/0/all/0/1">N.R. Napolitano</a>

We present a catalog of quasars selected from broad-band photometric ugri
data of the Kilo-Degree Survey Data Release 3 (KiDS DR3). The QSOs are
identified by a Random Forest supervised machine learning model trained on
SDSS DR14 spectroscopic data. We first remove from the input KiDS data entries
with excessively noisy, missing, or otherwise problematic measurements. Applying
a feature importance analysis, we then tune the algorithm and identify in the
KiDS multiband catalog the 17 most useful features for the classification,
namely magnitudes, colors, magnitude ratios, and the stellarity index. We use
the t-SNE algorithm to map the multi-dimensional photometric data onto 2D
planes and compare the coverage of the training and inference sets. We limit
the inference set to r<22 to avoid extrapolation beyond the feature space
covered by training, as the SDSS spectroscopic sample is considerably shallower
than KiDS. This gives 3.4 million objects in the final inference sample, from
which the Random Forest identified 190,000 quasar candidates. Accuracy of 97%,
purity of 91%, and completeness of 87%, as derived from a test set extracted
from SDSS and not used in the training, are confirmed by comparison with
external spectroscopic and photometric QSO catalogs overlapping with the KiDS
footprint. The robustness of our results is strengthened by number counts of
the quasar candidates in the r band, as well as by their mid-infrared colors
available from WISE. An analysis of parallaxes and proper motions of our QSO
candidates also found in Gaia DR2 suggests that a probability cut of p(QSO)>0.8
is optimal for purity, whereas p(QSO)>0.7 is preferable for better
completeness. Our study presents the first comprehensive quasar selection from
deep high-quality KiDS data and will serve as the basis for versatile studies
of the QSO population detected by this survey.
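
The trade-off between purity and completeness at different p(QSO) cuts can be illustrated with a minimal sketch. It assumes the common definitions purity = TP / (TP + FP) and completeness = TP / (TP + FN), which the abstract does not spell out; the function name `selection_metrics` and the toy probabilities and labels are hypothetical and are not taken from the catalog.

```python
def selection_metrics(p_qso, is_qso, threshold):
    """Return (accuracy, purity, completeness) for the cut p(QSO) > threshold.

    Assumed definitions (not quoted from the paper):
      purity       = TP / (TP + FP)
      completeness = TP / (TP + FN)
    """
    tp = fp = fn = tn = 0
    for p, truth in zip(p_qso, is_qso):
        selected = p > threshold
        if selected and truth:
            tp += 1
        elif selected and not truth:
            fp += 1
        elif not selected and truth:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else float("nan")
    purity = tp / (tp + fp) if (tp + fp) else float("nan")
    completeness = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, purity, completeness


# Hypothetical toy test set: classifier probabilities and true labels.
p_qso = [0.95, 0.85, 0.75, 0.75, 0.40, 0.10]
is_qso = [True, True, True, False, False, False]

for cut in (0.8, 0.7):
    acc, pur, comp = selection_metrics(p_qso, is_qso, cut)
    print(f"p(QSO)>{cut}: accuracy={acc:.2f} purity={pur:.2f} completeness={comp:.2f}")
```

In this toy case the stricter cut gives a purer but less complete selection, and the looser cut gives the reverse, which is the qualitative behavior the abstract describes for the 0.8 and 0.7 thresholds.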
