Compression of Solar Spectroscopic Observations: a Case Study of Mg II k Spectral Line Profiles Observed by NASA’s IRIS Satellite. (arXiv:2103.07373v2 [astro-ph.SR] UPDATED)
<a href="http://arxiv.org/find/astro-ph/1/au:+Sadykov_V/0/1/0/all/0/1">Viacheslav M Sadykov</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Kitiashvili_I/0/1/0/all/0/1">Irina N Kitiashvili</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Dalda_A/0/1/0/all/0/1">Alberto Sainz Dalda</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Oria_V/0/1/0/all/0/1">Vincent Oria</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Kosovichev_A/0/1/0/all/0/1">Alexander G Kosovichev</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Illarionov_E/0/1/0/all/0/1">Egor Illarionov</a>

In this study we extract the deep features and investigate the compression of
the Mg II k spectral line profiles observed in quiet Sun regions by NASA’s IRIS
satellite. The data set of line profiles used for the analysis was obtained on
April 20th, 2020, at the center of the solar disc, and contains almost 300,000
individual Mg II k line profiles after data cleaning. The data are separated
into training and test subsets. The training subset was used to train
autoencoders with embedding layers of varying size. An early stopping criterion
was applied on the test subset to prevent the models from overfitting. Our
results indicate that it is possible to compress the spectral line profiles
by a factor of more than 27 (a reduction of the data dimensionality from 110
to 4) with an average reconstruction error of 4 DN, which is comparable to the
variations in the line continuum. The mean squared
error and the reconstruction error of even statistical moments sharply decrease
when the dimensionality of the embedding layer increases from 1 to 4 and almost
stop decreasing for larger embedding sizes. The occasional improvements observed
in training for embedding sizes larger than 4 indicate that a better compact
embedding may be obtained with other training strategies and longer training
times. The features learned for the critical four-dimensional case can be
interpreted. In particular, three of these four features mainly control the
line width, line asymmetry, and line dip formation, respectively. The presented
results are the first attempt to obtain a compact embedding for spectroscopic
line profiles and confirm the value of this approach, in particular for feature
extraction, data compression, and denoising.
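
The figures quoted above (compression factor, reconstruction error in DN, and
errors in statistical moments) can be illustrated with a minimal sketch. It is
not the authors' pipeline: the Gaussian profile, the wavelength grid in
arbitrary units, the alternating +/-4 DN offsets standing in for an
autoencoder reconstruction, and all function names are assumptions made only
for illustration. The choice of moments (total intensity, centroid, variance)
is likewise assumed.

```python
import math


def compression_ratio(n_input, n_embedding):
    """Ratio of the input dimensionality to the embedding dimensionality."""
    return n_input / n_embedding


def mean_squared_error(original, reconstructed):
    """Mean of squared point-wise differences between two profiles."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def mean_absolute_error(original, reconstructed):
    """Mean of absolute point-wise differences (same units as the data, here DN)."""
    return sum(abs(o - r) for o, r in zip(original, reconstructed)) / len(original)


def profile_moments(wavelengths, intensities):
    """Intensity-weighted moments: total intensity, centroid, and variance."""
    total = sum(intensities)
    centroid = sum(w * i for w, i in zip(wavelengths, intensities)) / total
    variance = sum((w - centroid) ** 2 * i
                   for w, i in zip(wavelengths, intensities)) / total
    return total, centroid, variance


# Synthetic stand-in for one line profile sampled at 110 points (assumption).
n_points = 110
wavelengths = [-1.0 + 2.0 * k / (n_points - 1) for k in range(n_points)]
profile = [100.0 * math.exp(-0.5 * (w / 0.2) ** 2) + 10.0 for w in wavelengths]

# Hypothetical "reconstruction": the profile with alternating +/-4 DN offsets.
reconstruction = [p + (4.0 if k % 2 == 0 else -4.0) for k, p in enumerate(profile)]

ratio = compression_ratio(n_points, 4)                # 110 / 4 = 27.5
mse = mean_squared_error(profile, reconstruction)     # close to 16 DN^2
mae = mean_absolute_error(profile, reconstruction)    # close to 4 DN

print(f"compression ratio: {ratio:.1f}")
print(f"MSE: {mse:.2f} DN^2, mean absolute error: {mae:.2f} DN")
print("moments (total, centroid, variance):", profile_moments(wavelengths, profile))
```

Comparing `profile_moments` for the original and reconstructed profiles gives
one possible per-moment reconstruction error alongside the point-wise errors.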
