Analytic marginalization of absorption line continua. (arXiv:1901.06416v1 [astro-ph.IM])
Kirill Tchernyshyov

Absorption line spectroscopy is a powerful way of measuring properties of
stars and the interstellar medium. Absorption spectra are often analyzed
manually, an approach that limits reproducibility and cannot practically
be applied to modern datasets consisting of thousands or even millions of
spectra. Simultaneous probabilistic modeling of absorption features and
continuum shape is a promising approach for automating this analysis. Existing
implementations of this approach use numerical methods such as Markov chain
Monte Carlo (MCMC) to marginalize over the continuum parameters. Numerical
marginalization over large numbers of continuum parameters is too slow to be
convenient for exploratory analysis, can increase the dimensionality of an
inference problem beyond the capacity of simple MCMC samplers, and is in
general impractical for the analysis of large datasets. When continua are
parameterized as functions that are linear in their parameters, such as
polynomials or splines, it is possible to reduce continuum parameter
marginalization to an integral over a multivariate normal distribution,
which has a known closed form. In addition to
speeding up probabilistic modeling, analytic marginalization makes it trivial
to marginalize over continuum parameterizations and to combine continuum
description marginalization with optimization for absorption line parameters.
These new possibilities allow automatic, probabilistically justified continuum
placement in analyses of large spectroscopic datasets. We compare how
accurately absorption line parameters can be recovered using different
continuum placement methods and find that marginalization is in many cases an
improvement over other methods. We implement analytic marginalization over
linear continuum parameters in the open-source package amlc.
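
To make the closed-form step concrete, here is a minimal Python sketch of
marginalizing over linear continuum coefficients. It is an illustration under
stated assumptions, not the amlc API: it assumes the continuum multiplies the
absorption profile, the noise is Gaussian and independent per pixel, and the
continuum coefficients have a Gaussian prior. The function name
marginal_log_likelihood, the helper absorption_profile, and all numerical
values are hypothetical. Under these assumptions the data are distributed as
N(M mu, M Lambda M^T + Sigma), where M is the absorption profile times the
continuum basis, so the integral over coefficients needs no sampling.

```python
import numpy as np


def marginal_log_likelihood(y, sigma, absorption, basis, prior_mean, prior_cov):
    """Log-likelihood of y with the linear continuum coefficients integrated out.

    Assumed model (illustrative only):
        y = absorption * (basis @ c) + noise,
        c ~ N(prior_mean, prior_cov), noise ~ N(0, diag(sigma**2)).
    """
    # Design matrix: the model is linear in the continuum coefficients c.
    M = absorption[:, None] * basis
    mean = M @ prior_mean
    # Marginal covariance of the data after integrating over c.
    cov = M @ prior_cov @ M.T + np.diag(sigma**2)
    resid = y - mean
    _, logdet = np.linalg.slogdet(cov)
    chi2 = resid @ np.linalg.solve(cov, resid)
    return -0.5 * (chi2 + logdet + y.size * np.log(2.0 * np.pi))


# Synthetic spectrum: quadratic continuum times one Gaussian-shaped line.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
basis = np.vander(x, 3, increasing=True)  # columns: 1, x, x**2
true_coeffs = np.array([1.0, 0.1, -0.05])
sigma = np.full(x.size, 0.02)


def absorption_profile(tau0):
    """Hypothetical line profile: exp(-tau) with a Gaussian optical depth."""
    return np.exp(-tau0 * np.exp(-0.5 * (x / 0.05) ** 2))


y = absorption_profile(1.0) * (basis @ true_coeffs) + rng.normal(0.0, sigma)

# Broad Gaussian prior on the continuum coefficients (an assumption).
prior_mean = np.zeros(3)
prior_cov = 100.0 * np.eye(3)

ll_true = marginal_log_likelihood(
    y, sigma, absorption_profile(1.0), basis, prior_mean, prior_cov
)
ll_none = marginal_log_likelihood(
    y, sigma, absorption_profile(0.0), basis, prior_mean, prior_cov
)
print(ll_true > ll_none)
```

Because the marginal likelihood depends only on the absorption line
parameters, it can be handed to an optimizer or a low-dimensional sampler.
This sketch builds the dense covariance matrix for clarity; for long spectra
one would likely use the Woodbury identity or a similar low-rank update
instead.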
