How not to obtain the redshift distribution from probabilistic redshift estimates: Under what conditions is it not inappropriate to estimate the redshift distribution N(z) by stacking photo-z PDFs? (arXiv:2101.04675v1 [astro-ph.CO])

<a href="http://arxiv.org/find/astro-ph/1/au:+Malz_A/0/1/0/all/0/1">Alex I. Malz</a>

The scientific impact of current and upcoming photometric galaxy surveys is

contingent on our ability to obtain redshift estimates for large numbers of

faint galaxies. In the absence of spectroscopically confirmed redshifts,

broad-band photometric redshift point estimates (photo-$z$s) have been

superseded by photo-$z$ probability density functions (PDFs) that encapsulate

their nontrivial uncertainties. Initial applications of photo-$z$ PDFs in weak

gravitational lensing studies of cosmology have obtained the redshift

distribution function $\mathcal{N}(z)$ by employing computationally

straightforward stacking methodologies that violate the laws of probability. In

response, mathematically self-consistent models of varying complexity have been

proposed in an effort to answer the question, “What is the right way to obtain

the redshift distribution function $\mathcal{N}(z)$ from a catalog of photo-$z$

PDFs?” This letter aims to motivate adoption of such principled methods by

addressing the contrapositive of the more common presentation of such models,

answering the question, “Under what conditions do traditional stacking methods

successfully recover the true redshift distribution function $\mathcal{N}(z)$?”

By placing stacking in a rigorous mathematical environment, we identify two

such conditions: those of perfectly informative data and perfectly informative

prior information. Stacking has maintained its foothold in the astronomical

community for so long because the conditions in question were only weakly

violated in the past. These conditions, however, will be strongly violated by

future galaxy surveys. We therefore conclude that stacking must be abandoned in

favor of mathematically supported methods in order to advance observational

cosmology.
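
As a rough illustration of the "perfectly informative data" condition, the toy sketch below stacks per-galaxy PDFs and compares the width of the stacked curve with the width of the true distribution. Everything in it is an assumption made for illustration, not the paper's method: the true redshifts are Gaussian, each photo-$z$ PDF is a Gaussian centered on a noisy estimate (as would follow from a flat prior), and the names `stacked_nz` and `grid_std` are hypothetical helpers.

```python
import numpy as np

def stacked_nz(z_grid, z_obs, sigma_err):
    """Average per-galaxy Gaussian photo-z PDFs evaluated on z_grid."""
    # Each row is one galaxy's PDF: a Gaussian centred on its noisy estimate.
    diff = z_grid[None, :] - z_obs[:, None]
    pdfs = np.exp(-0.5 * (diff / sigma_err) ** 2) / (sigma_err * np.sqrt(2.0 * np.pi))
    return pdfs.mean(axis=0)

def grid_std(z_grid, density):
    """Standard deviation of a density tabulated on a uniform grid."""
    weights = density / density.sum()
    mean = np.sum(weights * z_grid)
    return np.sqrt(np.sum(weights * (z_grid - mean) ** 2))

rng = np.random.default_rng(0)
true_sigma = 0.2                                  # assumed width of the true N(z)
z_true = rng.normal(1.0, true_sigma, size=2000)   # toy "true" redshifts
z_grid = np.linspace(-1.0, 3.0, 801)              # toy grid; negative z is unphysical

widths = {}
for sigma_err in (0.01, 0.2):
    # Noisy estimate per galaxy; its PDF has the same width as the noise.
    z_obs = z_true + rng.normal(0.0, sigma_err, size=z_true.size)
    widths[sigma_err] = grid_std(z_grid, stacked_nz(z_grid, z_obs, sigma_err))
    print(f"sigma_err={sigma_err}: stacked width {widths[sigma_err]:.3f}, true width {true_sigma}")
```

In this toy setup the stacked curve is broadened twice, once by the scatter of the estimates and once by the width of each PDF, so its variance is roughly the true variance plus twice the error variance. The broadening becomes negligible only as the per-galaxy errors shrink toward zero, which is the limit of perfectly informative data.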

