Applications of Machine Learning Algorithms In Processing Terahertz Spectroscopic Data. (arXiv:2009.01203v1 [astro-ph.IM])
<a href="http://arxiv.org/find/astro-ph/1/au:+Seo_Y/0/1/0/all/0/1">Young Min Seo</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Goldsmith_P/0/1/0/all/0/1">Paul F. Goldsmith</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Tolls_V/0/1/0/all/0/1">Volker Tolls</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Shipman_R/0/1/0/all/0/1">Russell Shipman</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Kulesa_C/0/1/0/all/0/1">Craig Kulesa</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Peters_W/0/1/0/all/0/1">William Peters</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Walker_C/0/1/0/all/0/1">Christopher Walker</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Melnick_G/0/1/0/all/0/1">Gary Melnick</a>

We present the data reduction software and the distribution of Level 1 and
Level 2 products of the Stratospheric Terahertz Observatory 2 (STO2). STO2, a
balloon-borne terahertz telescope, surveyed star-forming regions and the
Galactic plane and produced approximately 300,000 spectra. The data are largely
similar to spectra typically produced by single-dish radio telescopes. However,
a fraction of the data contained rapidly varying fringe/baseline features and
drift noise, which could not be adequately corrected using conventional data
reduction software. To process the entire science data set of the STO2 mission, we
adopted a new method for finding suitable off-source spectra to reduce
large-amplitude fringes, along with new algorithms including Asymmetric Least Squares
(ALS), Independent Component Analysis (ICA), and Density-Based Spatial
Clustering of Applications with Noise (DBSCAN). The STO2 data reduction
software efficiently reduced the amplitude of fringes from a few hundred K to 10
K and resulted in baselines of amplitude down to a few K. The Level 1 products
typically have noise of a few K in [CII] spectra and ~1 K in [NII] spectra.
We made spectral maps of star-forming regions and
the Galactic plane survey using a regridding algorithm that employs a Bessel-Gaussian
kernel. Level 1 and 2 products are available to the astronomical community
through the STO2 data server and the DataVerse. The software is also accessible
to the public through GitHub. The detailed addresses are given in Section 4 of
the paper, which covers data distribution.
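
To illustrate the kind of baseline correction that Asymmetric Least Squares performs, here is a minimal sketch of the widely used penalized-least-squares formulation. It is not taken from the STO2 software: the function name `als_baseline` and the parameter values (`lam`, `p`, `n_iter`) are assumptions chosen for illustration only.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def als_baseline(y, lam=1e5, p=0.01, n_iter=10):
    """Estimate a smooth baseline under emission lines (illustrative sketch).

    lam    : smoothness penalty (assumed value, not an STO2 setting)
    p      : weight given to points above the current baseline (assumed value)
    n_iter : number of reweighting iterations (assumed value)
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Second-order difference operator, shape (n - 2, n)
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(n_iter):
        W = sparse.diags(w)
        # Solve (W + lam * D^T D) z = W y for the baseline z
        z = spsolve((W + penalty).tocsc(), w * y)
        # Points above the baseline (likely line emission) get the small weight p
        w = np.where(y > z, p, 1.0 - p)
    return z
```

Subtracting the returned array from a spectrum leaves the emission lines on an approximately flat baseline. The asymmetry parameter `p` controls how strongly channels above the fit are ignored, and `lam` controls how stiff the baseline is, so both would need tuning for real data.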
