Real-Time Value-Driven Data Augmentation in the Era of LSST. (arXiv:2003.08943v1 [astro-ph.IM])
<a href="http://arxiv.org/find/astro-ph/1/au:+Sravan_N/0/1/0/all/0/1">Niharika Sravan</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Milisavljevic_D/0/1/0/all/0/1">Dan Milisavljevic</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Reynolds_J/0/1/0/all/0/1">Jack M. Reynolds</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Lentner_G/0/1/0/all/0/1">Geoffrey Lentner</a>, <a href="http://arxiv.org/find/astro-ph/1/au:+Linvill_M/0/1/0/all/0/1">Mark Linvill</a> (Department of Physics and Astronomy, Purdue University)

The deluge of data from time-domain surveys is rendering traditional
human-guided data collection and inference techniques impractical. We propose a
novel approach for conducting data collection for science inference in the era
of massive large-scale surveys that uses value-based metrics to autonomously
strategize and coordinate follow-up in real time. We demonstrate the
underlying principles in the Recommender Engine For Intelligent Transient
Tracking (REFITT) that ingests live alerts from surveys and value-added inputs
from data brokers to predict the future behavior of transients and design
optimal data augmentation strategies given a set of scientific objectives. The
prototype presented in this paper is tested on simulated Rubin
Observatory Legacy Survey of Space and Time (LSST) core-collapse supernova (CC
SN) light curves from the PLAsTiCC dataset. CC SNe were selected for the
initial development phase as they are known to be difficult to classify, with
the expectation that any learning techniques developed for them should be at
least as effective for other transients. We demonstrate the behavior of REFITT
on a random LSST night given ~32,000 live CC SNe of interest. The system makes good
predictions for the photometric behavior of the events and uses them to plan
follow-up using a simple data-driven metric. We argue that machine-directed
follow-up maximizes the scientific potential of surveys and follow-up resources
by reducing downtime and bias in data collection.
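
As a rough illustration of value-driven follow-up planning, the sketch below ranks transients by how uncertain their forecast brightness is and keeps the most uncertain ones that a follow-up facility could still detect. This is not REFITT's actual metric, which the abstract does not specify: the `Forecast` fields, the `rank_for_followup` function, the limiting magnitude, and the example numbers are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class Forecast:
    """Hypothetical per-transient forecast for the next observing epoch."""
    name: str
    predicted_mag: float    # forecast brightness (smaller magnitude = brighter)
    mag_uncertainty: float  # 1-sigma spread of the forecast, in magnitudes


def rank_for_followup(forecasts, limiting_mag, budget):
    """Return up to `budget` target names, most valuable first.

    Assumed value metric: a new observation is worth more where the
    forecast is least certain. Targets predicted to be fainter than the
    facility's limiting magnitude are dropped before ranking.
    """
    observable = [f for f in forecasts if f.predicted_mag <= limiting_mag]
    ranked = sorted(observable, key=lambda f: f.mag_uncertainty, reverse=True)
    return [f.name for f in ranked[:budget]]


# Made-up example values, for illustration only.
forecasts = [
    Forecast("A", predicted_mag=20.0, mag_uncertainty=0.1),
    Forecast("B", predicted_mag=21.0, mag_uncertainty=0.5),
    Forecast("C", predicted_mag=25.0, mag_uncertainty=0.9),  # too faint to observe
    Forecast("D", predicted_mag=19.5, mag_uncertainty=0.3),
]
selected = rank_for_followup(forecasts, limiting_mag=22.0, budget=2)
print(selected)
```

A production system would presumably weigh many more factors (scientific objectives, cadence, facility availability); the point here is only that once forecasts carry uncertainties, target selection can be automated with a single sortable score.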
