How not to obtain the redshift distribution from probabilistic redshift estimates: Under what conditions is it not inappropriate to estimate the redshift distribution N(z) by stacking photo-z PDFs? (arXiv:2101.04675v1 [astro-ph.CO])
Alex I. Malz

The scientific impact of current and upcoming photometric galaxy surveys is
contingent on our ability to obtain redshift estimates for large numbers of
faint galaxies. In the absence of spectroscopically confirmed redshifts,
broad-band photometric redshift point estimates (photo-$z$s) have been
superseded by photo-$z$ probability density functions (PDFs) that encapsulate
their nontrivial uncertainties. Initial applications of photo-$z$ PDFs in weak
gravitational lensing studies of cosmology have obtained the redshift
distribution function $\mathcal{N}(z)$ by employing computationally
straightforward stacking methodologies that violate the laws of probability. In
response, mathematically self-consistent models of varying complexity have been
proposed in an effort to answer the question, “What is the right way to obtain
the redshift distribution function $\mathcal{N}(z)$ from a catalog of photo-$z$
PDFs?” This letter aims to motivate adoption of such principled methods by
addressing the contrapositive of the more common presentation of such models,
answering the question, “Under what conditions do traditional stacking methods
successfully recover the true redshift distribution function $\mathcal{N}(z)$?”
By placing stacking in a rigorous mathematical environment, we identify two
such conditions: those of perfectly informative data and perfectly informative
prior information. Stacking has maintained its foothold in the astronomical
community for so long because the conditions in question were only weakly
violated in the past. These conditions, however, will be strongly violated by
future galaxy surveys. We therefore conclude that stacking must be abandoned in
favor of mathematically supported methods in order to advance observational
cosmology.
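
To make the "stacking" estimator referenced above concrete, the following is a minimal, purely illustrative sketch: the redshift distribution $\mathcal{N}(z)$ is approximated by averaging the per-galaxy photo-$z$ PDFs on a shared redshift grid. The toy noise model, grid choices, and all variable names are assumptions introduced here for illustration and are not taken from the letter itself.

    import numpy as np

    # Illustrative sketch of the stacking estimator: average the per-galaxy
    # photo-z PDFs on a common redshift grid. Toy setup, not the letter's code.
    rng = np.random.default_rng(0)
    n_gal = 10_000
    z_grid = np.linspace(0.0, 3.0, 301)

    # Toy "true" redshifts drawn from a gamma-shaped distribution.
    z_true = rng.gamma(shape=2.0, scale=0.4, size=n_gal)

    # Toy photo-z PDFs: Gaussians centered on noisy point estimates, with width
    # growing as (1 + z), a common heuristic for photometric scatter.
    sigma = 0.05 * (1.0 + z_true)
    z_obs = z_true + rng.normal(0.0, sigma)
    pdfs = np.exp(-0.5 * ((z_grid[None, :] - z_obs[:, None]) / sigma[:, None]) ** 2)
    pdfs /= np.trapz(pdfs, z_grid, axis=1)[:, None]  # normalize each PDF

    # The stacked estimate of N(z): the average of the individual PDFs.
    n_z_stacked = pdfs.mean(axis=0)

    # Compare with the true redshift distribution (a normalized histogram).
    n_z_true, _ = np.histogram(z_true, bins=z_grid, density=True)

In this toy setup the stacked curve is effectively the true distribution convolved with the per-galaxy uncertainty, so it comes out broader than the true $\mathcal{N}(z)$; the two only coincide when the PDFs approach delta functions or when the implicit prior already matches the true distribution, which echoes the two conditions of perfectly informative data and perfectly informative prior information named in the abstract.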
