Substructure in the stellar halo near the Sun. I. Data-driven clustering in Integrals of Motion space. (arXiv:2201.02404v2 [astro-ph.GA] UPDATED)
<a href="http://arxiv.org/find/astro-ph/1/au:+Lovdal_S/0/1/0/all/0/1">S. Sofie Lövdal</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Ruiz_Lara_T/0/1/0/all/0/1">Tomás Ruiz-Lara</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Koppelman_H/0/1/0/all/0/1">Helmer H. Koppelman</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Matsuno_T/0/1/0/all/0/1">Tadafumi Matsuno</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Dodd_E/0/1/0/all/0/1">Emma Dodd</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Helmi_A/0/1/0/all/0/1">Amina Helmi</a>

Aims: Develop a data-driven and statistically based method for finding such
clumps in Integrals of Motion space for nearby halo stars and evaluating their
significance robustly. Methods: We use data from Gaia EDR3 extended with radial
velocities from ground-based spectroscopic surveys to construct a sample of
halo stars within 2.5 kpc from the Sun. We apply a hierarchical clustering
method that uses the single linkage algorithm in a 3D space defined by the
commonly used integrals of motion energy $E$, together with two components of
the angular momentum, $L_z$ and $L_\perp$. To evaluate the statistical
significance of the clusters found, we compare the density within an
ellipsoidal region centered on the cluster to that of random sets with similar
global dynamical properties. For each cluster, we select the level of the
hierarchical tree at which its statistical significance is maximal. We estimate the
proximity of a star to the cluster center using the Mahalanobis distance. We
also apply the HDBSCAN clustering algorithm in velocity space. Results: Our
procedure identifies 67 highly significant clusters ($> 3\sigma$), containing
12% of the sources in our halo set, and 232 subgroups or individual
streams in velocity space. In total, 13.8% of the stars in our data set can be
confidently associated with a significant cluster based on their Mahalanobis
distance. Inspection of our data set reveals a complex web of relationships
between the significant clusters, suggesting that they can be tentatively
grouped into at least 6 main structures, many of which can be associated with
previously identified halo substructures, and a number of independent
substructures. This preliminary conclusion is further explored in an
accompanying paper by Ruiz-Lara et al., where we also characterize the
substructures in terms of their stellar populations. Conclusions: We find…
(abridged version)
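
As a rough illustration of two ingredients named in the Methods, the sketch below builds a single-linkage merge sequence for points in a 3D space and then computes the Mahalanobis distance of a point to a cluster center. It is a toy, standard-library-only sketch and not the authors' pipeline: the coordinates are made-up stand-ins for ($E$, $L_z$, $L_\perp$), the function names are hypothetical, and taking the cluster covariance from the member points is an assumption. The significance test against random sets and the HDBSCAN step are not shown.

```python
import math
from itertools import combinations


def single_linkage_merges(points):
    """Return single-linkage merges as (distance, members_a, members_b), in merge order."""
    parent = list(range(len(points)))
    members = {i: [i] for i in range(len(points))}

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Visiting pairs by increasing distance and joining distinct groups
    # reproduces the single-linkage (nearest-neighbour) merge order.
    pairs = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    merges = []
    for d, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri == rj:
            continue
        merges.append((d, sorted(members[ri]), sorted(members[rj])))
        parent[rj] = ri
        members[ri] = members[ri] + members.pop(rj)
    return merges


def mean_and_covariance(points):
    """Sample mean and covariance matrix (n - 1 normalisation)."""
    n, dim = len(points), len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    cov = [[sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in points) / (n - 1)
            for b in range(dim)] for a in range(dim)]
    return mean, cov


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def mahalanobis(point, mean, cov):
    """sqrt((x - mu)^T C^{-1} (x - mu)), computed without forming C^{-1} explicitly."""
    diff = [p - m for p, m in zip(point, mean)]
    return math.sqrt(sum(d * s for d, s in zip(diff, solve(cov, diff))))


# Hypothetical toy data: two well-separated clumps in a 3D space.
clump_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
clump_b = [(x + 10.0, y + 10.0, z + 10.0) for x, y, z in clump_a]
points = clump_a + clump_b

merges = single_linkage_merges(points)
mean, cov = mean_and_covariance(clump_a)
d_member = mahalanobis(clump_a[0], mean, cov)
d_outsider = mahalanobis(clump_b[0], mean, cov)
```

In this toy case the last merge joins the two clumps at a much larger linkage distance than any earlier merge, and a point from the other clump lies at a far larger Mahalanobis distance than a member does. For real samples a library routine such as `scipy.cluster.hierarchy.linkage` with `method="single"` would be the more practical choice, and the axes would need to be scaled to comparable ranges before computing Euclidean distances.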

