Abstracting the storage and retrieval of image data at the LSST. (arXiv:1812.08085v1 [astro-ph.IM])
<a href="http://arxiv.org/find/astro-ph/1/au:+Jenness_T/0/1/0/all/0/1">Tim Jenness</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Bosch_J/0/1/0/all/0/1">James F. Bosch</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Schellart_P/0/1/0/all/0/1">Pim Schellart</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Lim_K/0/1/0/all/0/1">Kian-Ta Lim</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Salnikov_A/0/1/0/all/0/1">Andrei Salnikov</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Gower_M/0/1/0/all/0/1">Michelle Gower</a>

Writing generic data processing pipelines requires that the algorithmic code
does not ever have to know about data formats of files, or the locations of
those files. At LSST we have a software system, known as “the Data Butler”, that
abstracts these details from the software developer. Scientists can specify the
dataset they want in terms they understand, such as filter, observation
identifier, date of observation, and instrument name, and the Butler translates
that to one or more files which are read and returned to them as a single
Python object. Conversely, once they have created a new dataset they can give
it back to the Butler, with a label describing its new status, and the Butler
can write it in whatever format it has been configured to use. All
configuration is in YAML and supports standard defaults whilst allowing
overrides.
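
The get/put round trip described above can be illustrated with a minimal sketch. It is not the LSST Data Butler API: `ToyButler`, `merge_config`, `FORMATTERS`, the dataset label `calibrated`, and the instrument name `ToyCam` are all hypothetical names chosen for illustration. A plain Python dict stands in for the YAML configuration so that the example needs only the standard library.

```python
import json
import pickle
import tempfile
from pathlib import Path

# Hypothetical formatters: format name -> (file suffix, serializer, deserializer).
FORMATTERS = {
    "json": (".json",
             lambda obj: json.dumps(obj).encode(),
             lambda raw: json.loads(raw.decode())),
    "pickle": (".pickle", pickle.dumps, pickle.loads),
}

# Standard defaults; a dict stands in for what would be a YAML file.
DEFAULT_CONFIG = {"root": ".", "formats": {"default": "json"}}


def merge_config(defaults, overrides):
    """Return a copy of defaults with overrides applied recursively."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


class ToyButler:
    """Toy stand-in that hides file locations and formats from the caller."""

    def __init__(self, overrides=None):
        self.config = merge_config(DEFAULT_CONFIG, overrides or {})
        self.root = Path(self.config["root"])

    def _locate(self, dataset_type, data_id):
        # Pick the configured format for this dataset label, else the default.
        formats = self.config["formats"]
        fmt = formats.get(dataset_type, formats["default"])
        suffix = FORMATTERS[fmt][0]
        # Build a deterministic file name from the sorted data ID keys.
        key = "_".join(f"{k}-{data_id[k]}" for k in sorted(data_id))
        return self.root / dataset_type / (key + suffix), fmt

    def put(self, obj, dataset_type, data_id):
        path, fmt = self._locate(dataset_type, data_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(FORMATTERS[fmt][1](obj))

    def get(self, dataset_type, data_id):
        path, fmt = self._locate(dataset_type, data_id)
        return FORMATTERS[fmt][2](path.read_bytes())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Override the storage root and the format used for one dataset label.
        butler = ToyButler({"root": root, "formats": {"calibrated": "pickle"}})
        data_id = {"instrument": "ToyCam", "visit": 42, "filter": "r"}
        butler.put({"pixels": [1, 2, 3]}, "calibrated", data_id)
        print(butler.get("calibrated", data_id))
```

In this sketch the calling code names a dataset only by its label and data ID; the file path and serialization format are decided entirely by the configuration, so changing the override changes the on-disk format without touching the caller.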

