Bayesian cross validation for gravitational-wave searches in pulsar-timing array data. (arXiv:1904.05355v1 [astro-ph.IM])

<a href="http://arxiv.org/find/astro-ph/1/au:+Wang_H/0/1/0/all/0/1">Haochen Wang</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Taylor_S/0/1/0/all/0/1">Stephen R. Taylor</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Vallisneri_M/0/1/0/all/0/1">Michele Vallisneri</a>

Gravitational-wave data analysis demands sophisticated statistical noise
models in a bid to extract highly obscured signals from data. In Bayesian model
comparison, we choose among a landscape of models by comparing their marginal
likelihoods. However, this computation is numerically fraught and can be
sensitive to arbitrary choices in the specification of parameter priors. In
Bayesian cross validation, we characterize the fit and predictive power of a
model by computing the Bayesian posterior of its parameters in a training
dataset, and then use that posterior to compute the averaged likelihood of a
different testing dataset. The resulting cross-validation scores are
straightforward to compute; they are insensitive to prior tuning; and they
penalize unnecessarily complex models that overfit the training data at the
expense of predictive performance. In this article, we discuss cross validation
in the context of pulsar-timing-array data analysis, and we exemplify its
application to simulated pulsar data (where it successfully selects the correct
spectral index of a stochastic gravitational-wave background), and to a pulsar
dataset from the NANOGrav 11-year release (where it convincingly favors a model
that represents a transient feature in the interstellar medium). We argue that
cross validation offers a promising alternative to Bayesian model comparison,
and we discuss its use for gravitational-wave detection, by selecting or
refuting models that include a gravitational-wave component.
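
The cross-validation score described above (the test-set likelihood averaged over the training posterior) can be illustrated with a minimal sketch. This is not the pulsar-timing likelihood used in the article: it assumes a toy model in which each datum is drawn from a Gaussian with unknown mean and known noise level, with a zero-mean Gaussian prior on the mean, so that the training posterior is available in closed form. The function name `cv_log_score` and all of its parameters are hypothetical.

```python
import numpy as np


def cv_log_score(train, test, prior_sd, noise_sd=1.0, n_samples=4000, seed=0):
    """Log of the test-set likelihood averaged over the training posterior.

    Toy model (an assumption for illustration only):
    y_i ~ Normal(mu, noise_sd^2) with known noise_sd,
    and prior mu ~ Normal(0, prior_sd^2).
    """
    train = np.asarray(train, dtype=float)
    test = np.asarray(test, dtype=float)
    rng = np.random.default_rng(seed)

    # Conjugate Gaussian posterior for mu given the training data.
    post_var = 1.0 / (train.size / noise_sd**2 + 1.0 / prior_sd**2)
    post_mean = post_var * train.sum() / noise_sd**2
    mu = rng.normal(post_mean, np.sqrt(post_var), size=n_samples)

    # Log-likelihood of the whole test set for each posterior sample.
    resid = test[None, :] - mu[:, None]
    loglike = (-0.5 * np.sum(resid**2, axis=1) / noise_sd**2
               - test.size * np.log(noise_sd * np.sqrt(2.0 * np.pi)))

    # Numerically stable log of the Monte Carlo average of the likelihood.
    peak = loglike.max()
    return peak + np.log(np.mean(np.exp(loglike - peak)))


# Simulated data with true mean 1.0, split into training and testing sets.
data = np.random.default_rng(1).normal(1.0, 1.0, size=250)
train, test = data[:200], data[200:]

narrow = cv_log_score(train, test, prior_sd=10.0)
wide = cv_log_score(train, test, prior_sd=1000.0)
print(narrow, wide)
```

In this toy setting, widening the prior by a factor of 100 barely changes the score, because the training data dominate the posterior; a marginal likelihood computed for the same model would instead shift with the prior width. A test set that the model predicts poorly (for example, one offset from the training data) receives a much lower score.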
