The Breakthrough Listen Search for Intelligent Life: Public Data, Formats, Reduction and Archiving. (arXiv:1906.07391v1 [astro-ph.IM])
<a href="http://arxiv.org/find/astro-ph/1/au:+Lebofsky_M/0/1/0/all/0/1">Matthew Lebofsky</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Croft_S/0/1/0/all/0/1">Steve Croft</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Siemion_A/0/1/0/all/0/1">Andrew P.V. Siemion</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Price_D/0/1/0/all/0/1">Danny C. Price</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Enriquez_J/0/1/0/all/0/1">J. Emilio Enriquez</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Isaacson_H/0/1/0/all/0/1">Howard Isaacson</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+MacMahon_D/0/1/0/all/0/1">David H.E. MacMahon</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Anderson_D/0/1/0/all/0/1">David Anderson</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Brzycki_B/0/1/0/all/0/1">Bryan Brzycki</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Cobb_J/0/1/0/all/0/1">Jeff Cobb</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Czech_D/0/1/0/all/0/1">Daniel Czech</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+DeBoer_D/0/1/0/all/0/1">David DeBoer</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+DeMarines_J/0/1/0/all/0/1">Julia DeMarines</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Drew_J/0/1/0/all/0/1">Jamie Drew</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Foster_G/0/1/0/all/0/1">Griffin Foster</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Gajjar_V/0/1/0/all/0/1">Vishal Gajjar</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Gizani_N/0/1/0/all/0/1">Nectaria Gizani</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Hellbourg_G/0/1/0/all/0/1">Greg Hellbourg</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Korpela_E/0/1/0/all/0/1">Eric J. 
Korpela</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Lacki_B/0/1/0/all/0/1">Brian Lacki</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Sheikh_S/0/1/0/all/0/1">Sofia Sheikh</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Werthimer_D/0/1/0/all/0/1">Dan Werthimer</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Worden_P/0/1/0/all/0/1">Pete Worden</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Yu_A/0/1/0/all/0/1">Alex Yu</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Zhang_Y/0/1/0/all/0/1">Yunfan Gerry Zhang</a>

Breakthrough Listen is the most comprehensive and sensitive search for
extraterrestrial intelligence (SETI) to date, employing a collection of
international observational facilities including both radio and optical
telescopes. During the first three years of the Listen program, thousands of
targets have been observed with the Green Bank Telescope (GBT), Parkes
Telescope and Automated Planet Finder. At GBT and Parkes, observations have
covered frequencies from 700 MHz to 26 GHz, with raw data volumes averaging
over 1 PB / day. A pseudo-real-time software spectroscopy suite is used to
produce multi-resolution spectrograms amounting to approximately 400 GB hr^-1
GHz^-1 beam^-1. For certain targets, raw baseband voltage data are also
preserved. Observations with the Automated Planet Finder produce both
2-dimensional and 1-dimensional high-resolution (R~10^5) echelle spectral data.
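
To give a feel for the numbers quoted above, the following is a minimal back-of-envelope sketch, not part of the Listen pipeline. It scales the approximate spectrogram rate of 400 GB per hour per GHz per beam to an arbitrary observation, and converts a resolving power R ~ 10^5 into a resolution element using the standard definition R = wavelength / delta_wavelength. The function names and the example observation parameters are hypothetical, chosen only for illustration.

```python
# Approximate reduced-data rate quoted in the abstract:
# about 400 GB per hour, per GHz of bandwidth, per beam.
GB_PER_HOUR_PER_GHZ_PER_BEAM = 400.0


def reduced_volume_gb(hours, bandwidth_ghz, beams=1):
    """Estimate spectrogram volume in GB (hypothetical helper).

    Assumes the quoted rate scales linearly with observing time,
    recorded bandwidth, and number of beams.
    """
    return GB_PER_HOUR_PER_GHZ_PER_BEAM * hours * bandwidth_ghz * beams


def resolution_element(wavelength, resolving_power=1e5):
    """Smallest resolvable wavelength interval for a spectrograph.

    Uses R = wavelength / delta_wavelength; the result is in the same
    units as the input wavelength.
    """
    return wavelength / resolving_power


if __name__ == "__main__":
    # Illustrative values only: 6 hours, 4 GHz of bandwidth, one beam.
    print(reduced_volume_gb(hours=6, bandwidth_ghz=4, beams=1), "GB")
    # Illustrative optical wavelength of 5000 Angstrom at R ~ 10^5.
    print(resolution_element(5000.0), "Angstrom")
```

The linear scaling is an assumption; the actual volume depends on which spectral and time resolutions are retained for a given observation.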

Although the primary purpose of Listen data acquisition is SETI, a range
of secondary science has also been performed with these data, including studies
of fast radio bursts. Other current and potential research topics include
spectral line studies, searches for certain kinds of dark matter, probes of
interstellar scattering, pulsar searches, radio transient searches and
investigations of stellar activity. Listen data are also being used in the
development of algorithms, including machine learning approaches to modulation
scheme classification and outlier detection, that have wide applicability not
just for astronomical research but for a broad range of science and
engineering.

In this paper, we describe the hardware and software pipeline used for
collection, reduction, archival, and public dissemination of Listen data. We
describe the data formats and tools, and present Breakthrough Listen Data
Release 1.0 (BLDR 1.0), a defined set of publicly available raw and reduced
data totalling 1 PB.
