GPU-Accelerated Hierarchical Bayesian Inference with Application to Modeling Cosmic Populations: CUDAHM. (arXiv:2105.08026v1 [astro-ph.IM])

<a href="http://arxiv.org/find/astro-ph/1/au:+Szalai_Gindl_J/0/1/0/all/0/1">János M. Szalai-Gindl</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Loredo_T/0/1/0/all/0/1">Thomas J. Loredo</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Kelly_B/0/1/0/all/0/1">Brandon C. Kelly</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Csabai_I/0/1/0/all/0/1">István Csabai</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Budavari_T/0/1/0/all/0/1">Tamás Budavári</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Dobos_L/0/1/0/all/0/1">László Dobos</a>

We describe a computational framework for hierarchical Bayesian inference
with simple (typically single-plate) parametric graphical models that uses
graphics processing units (GPUs) to accelerate computations, enabling
deployment on very large datasets. Its C++ implementation, CUDAHM (CUDA for
Hierarchical Models), exploits conditional independence between instances of a
plate, facilitating massively parallel exploration of the replication parameter
space using the single-instruction, multiple-data (SIMD) architecture of GPUs. It
provides support for constructing Metropolis-within-Gibbs samplers that iterate
between GPU-accelerated robust adaptive Metropolis sampling of plate-level
parameters conditional on upper-level parameters, and Metropolis-Hastings
sampling of upper-level parameters on the host processor conditional on the GPU
results. CUDAHM is motivated by demographic problems in astronomy, where
density estimation and linear and nonlinear regression problems must be
addressed for populations of thousands to millions of objects whose features
are measured with possibly complex uncertainties. We describe a thinned latent
point process framework for modeling such demographic data. We demonstrate
accurate GPU-accelerated parametric conditional density deconvolution for
simulated populations of up to 300,000 objects in ~1 hour using a single NVIDIA
Tesla K40c GPU. Supplementary material provides details about the CUDAHM API
and the demonstration problem.
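
The alternating sampler described above can be illustrated with a minimal serial sketch. It is not the CUDAHM API, and everything in it is an assumption made for illustration: the toy model y_i ~ N(theta_i, sigma_i), theta_i ~ N(mu, tau) with known tau and a flat prior on mu, the hypothetical function names `log_normal` and `sample`, and the step sizes. Plain random-walk Metropolis stands in for robust adaptive Metropolis, and an ordinary Python loop stands in for the GPU threads.

```python
import math
import random


def log_normal(x, mean, sd):
    """Log density of a normal distribution, dropping the constant term."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd)


def sample(y, sigma, n_iter=2000, tau=1.0, step_theta=0.5, step_mu=0.2, seed=0):
    """Metropolis-within-Gibbs for a toy one-plate model (hypothetical helper).

    y[i] ~ N(theta[i], sigma[i]) and theta[i] ~ N(mu, tau), flat prior on mu.
    Returns the list of mu draws, one per iteration.
    """
    rng = random.Random(seed)
    theta = list(y)          # start each plate-level parameter at its datum
    mu = sum(y) / len(y)     # start the upper-level parameter at the data mean
    mu_draws = []
    for _ in range(n_iter):
        # Plate-level step: given mu, the theta[i] are conditionally
        # independent, so each update uses only y[i], sigma[i], and mu.
        # This loop is the part that could run as one GPU thread per object.
        for i in range(len(y)):
            prop = theta[i] + rng.gauss(0.0, step_theta)
            log_ratio = (
                log_normal(y[i], prop, sigma[i]) + log_normal(prop, mu, tau)
                - log_normal(y[i], theta[i], sigma[i]) - log_normal(theta[i], mu, tau)
            )
            # 1 - random() lies in (0, 1], which keeps log() well defined.
            if math.log(1.0 - rng.random()) < log_ratio:
                theta[i] = prop
        # Upper-level step: random-walk Metropolis update of mu conditional
        # on the current theta values (the "host" side of the iteration).
        prop = mu + rng.gauss(0.0, step_mu)
        log_ratio = sum(
            log_normal(t, prop, tau) - log_normal(t, mu, tau) for t in theta
        )
        if math.log(1.0 - rng.random()) < log_ratio:
            mu = prop
        mu_draws.append(mu)
    return mu_draws
```

Calling `sample(y, sigma)` with lists of measurements and their uncertainties returns draws of mu whose average, after discarding an initial burn-in, should sit close to the mean of `y` for this toy model. The inner loop never lets one theta[i] depend on another, which is the conditional independence that makes the plate-level step parallelizable.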
