A distributed data warehouse system for astroparticle physics. (arXiv:1812.01906v1 [astro-ph.IM])
Minh-Duc Nguyen (1), Alexander Kryukov (1), Julia Dubenskaya (1), Elena Korosteleva (1), Stanislav Polyakov (1), Evgeny Postnikov (1), Igor Bychkov (2), Andrey Mikhailov (2), Alexey Shigarov (2), Oleg Fedorov (3), Yulia Kazarina (3), Dmitry Shipilov (3), Dmitry Zhurov (3) ((1) Lomonosov Moscow State University, Skobeltsyn Institute of Nuclear Physics, (2) Matrosov Institute for System Dynamics and Control Theory, Siberian Branch of Russian Academy of Sciences, (3) Applied Physics Institute, Irkutsk State University)

Building a distributed data warehouse system is a pressing problem in
astroparticle physics. Well-known experiments such as TAIGA and KASCADE-Grande
produce tens of terabytes of data from their instruments. Each experiment needs
an on-site data warehouse system that stores the collected data and supports
its efficient further distribution. Scientists also need a user-friendly
interface for accessing the collected data, with proper permissions, both
on-site and online. Online access is particularly useful when scientists need
to combine data from different experiments for analysis. In this work, we
describe an approach to implementing a distributed data warehouse system that
lets scientists acquire just the data they need from different experiments
over the Internet on demand. The implementation is based on CernVM-FS, with
additional components that we developed to search across all available data
sets and deliver subsets of them to users’ computers.
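
The search-then-deliver idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the catalog layout, the `FileRecord` fields, the `select_files` helper, the repository name under `/cvmfs`, and the sample records are all hypothetical and exist only for illustration. The sketch assumes that a metadata search returns the paths of matching files, and relies on the fact that a mounted CernVM-FS repository downloads a file's content only when that file is actually read.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath

# Hypothetical mount point; a real deployment would use its own repository name.
MOUNT_ROOT = PurePosixPath("/cvmfs/astroparticle.example.org")


@dataclass(frozen=True)
class FileRecord:
    """One entry of a hypothetical metadata catalog."""
    experiment: str
    observed_on: date
    relative_path: str


def select_files(catalog, experiments, start, end):
    """Return full paths of files from the given experiments whose
    observation date lies in the inclusive range [start, end]."""
    wanted = set(experiments)
    return [
        MOUNT_ROOT / rec.relative_path
        for rec in catalog
        if rec.experiment in wanted and start <= rec.observed_on <= end
    ]


# Made-up sample records, for illustration only.
CATALOG = [
    FileRecord("TAIGA", date(2018, 1, 10), "taiga/2018/01/run_0001.dat"),
    FileRecord("TAIGA", date(2018, 3, 2), "taiga/2018/03/run_0042.dat"),
    FileRecord("KASCADE-Grande", date(2018, 1, 15), "kascade/2018/01/run_0007.dat"),
]

if __name__ == "__main__":
    january = select_files(
        CATALOG, ["TAIGA", "KASCADE-Grande"], date(2018, 1, 1), date(2018, 1, 31)
    )
    for path in january:
        # Only files that are later opened would be transferred to the user.
        print(path)
```

Under these assumptions, the user's machine never receives the whole data set: the query narrows it down to a list of paths, and only the files that are then read through the mount are fetched over the network.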

