Title: A Gateway to Astronomical Image Processing: Vera C. Rubin Observatory LSST Science Pipelines on AWS
The Legacy Survey of Space and Time (LSST), operated by the Vera C. Rubin Observatory, is a 10-year astronomical survey due to start operations in 2022 that will image half the sky every three nights. LSST will produce ~20 TB of raw data per night, which will be calibrated and analyzed in near real time. Given the volume of LSST data, the traditional subset-download-process paradigm of data reprocessing faces significant challenges. We describe here the first steps towards a gateway for astronomical science that would enable astronomers to analyze images and catalogs at scale. In this first step, we focus on executing the Rubin LSST Science Pipelines, a collection of image and catalog processing algorithms, on Amazon Web Services (AWS). We describe our initial impressions of the performance, scalability, and cost of deploying such a system in the cloud.
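To give a rough sense of why the download-based paradigm strains at this volume, the following back-of-envelope sketch estimates the sustained network rate needed to move one night of raw data within a day. Only the ~20 TB per night figure comes from the abstract. The decimal-terabyte convention, the 24-hour transfer window, and the function name are assumptions made for illustration, not figures or methods from the paper.

```python
# Back-of-envelope: sustained bandwidth needed to move one night of raw data.
# The ~20 TB/night figure is from the abstract; everything else is an assumption.

RAW_BYTES_PER_NIGHT = 20e12      # ~20 TB, assuming decimal terabytes (10^12 bytes)
SECONDS_PER_DAY = 24 * 60 * 60   # assumed transfer window: one full day


def required_rate_gbit_per_s(n_bytes: float, window_s: float) -> float:
    """Average rate in Gbit/s needed to transfer n_bytes within window_s seconds."""
    return n_bytes * 8 / window_s / 1e9


rate = required_rate_gbit_per_s(RAW_BYTES_PER_NIGHT, SECONDS_PER_DAY)
print(f"{rate:.2f} Gbit/s sustained to keep pace with one night of data per day")
```

Under these assumptions the rate comes out just under 2 Gbit/s, sustained around the clock and before any reprocessing begins. This is an illustration of scale only, not a measurement reported by the authors.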
Award ID(s):
1739419
NSF-PAR ID:
10287563
Journal Name:
Gateways 2020, Online
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Light echoes (LEs) are the reflections of astrophysical transients off interstellar dust. They are fascinating astronomical phenomena that enable studies of the scattering dust as well as of the original transients. LEs, however, are rare and extremely difficult to detect, as they appear as faint, diffuse, time-evolving features. The detection of LEs still largely relies on human inspection of images, a method that is infeasible in the era of large synoptic surveys. The Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) will generate an unprecedented amount of astronomical imaging data at high spatial resolution, with exquisite image quality, and over tens of thousands of square degrees of sky: an ideal survey for LEs. However, the Rubin data processing pipelines are optimized for the detection of point sources and will entirely miss LEs. Over the past several years, artificial intelligence (AI) object-detection frameworks have achieved and surpassed real-time, human-level performance. In this work, we leverage a data set from the Asteroid Terrestrial-impact Last Alert System telescope to test a popular AI object-detection framework, You Only Look Once, or YOLO, developed by the computer-vision community, to demonstrate the potential of AI for the detection of LEs in astronomical images. We find that an AI framework can reach human-level performance even with a size- and quality-limited data set. We explore and highlight challenges, including class imbalance and label incompleteness, and lay out a road map for the work required to build an end-to-end pipeline for the automated detection and study of LEs in high-throughput astronomical surveys.

     
  2. Abstract

    We present the Local Volume Complete Cluster Survey (LoVoCCS; we pronounce it as “low-vox” or “law-vox,” with stress on the second syllable), an NSF’s National Optical-Infrared Astronomy Research Laboratory survey program that uses the Dark Energy Camera to map the dark matter distribution and galaxy population in 107 nearby (0.03 < z < 0.12) X-ray luminous ([0.1–2.4 keV] L_X500 > 10^44 erg s^-1) galaxy clusters that are not obscured by the Milky Way. The survey will reach Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) Year 1–2 depth (for galaxies r = 24.5, i = 24.0, signal-to-noise ratio (S/N) > 20; u = 24.7, g = 25.3, z = 23.8, S/N > 10) and conclude in ∼2023 (coincident with the beginning of LSST science operations), and will serve as a zeroth-year template for LSST transient studies. We process the data using the LSST Science Pipelines that include state-of-the-art algorithms and analyze the results using our own pipelines, and therefore the catalogs and analysis tools will be compatible with the LSST. We demonstrate the use and performance of our pipeline using three X-ray luminous and observation-time complete LoVoCCS clusters: A3911, A3921, and A85. A3911 and A3921 have not been well studied previously by weak lensing, and we obtain similar lensing analysis results for A85 to previous studies. (We mainly use A3911 to show our pipeline and give more examples in the Appendix.)

     
  3. Abstract The Vera C. Rubin Observatory is expected to start the Legacy Survey of Space and Time (LSST) in early to mid-2025. This multiband wide-field synoptic survey will transform our view of the solar system, with the discovery and monitoring of over five million small bodies. The final survey strategy chosen for LSST has direct implications for the discoverability and characterization of solar system minor planets and passing interstellar objects. Creating an inventory of the solar system is one of the four main LSST science drivers. The LSST observing cadence is a complex optimization problem that must balance the priorities and needs of all the key LSST science areas. To design the best LSST survey strategy, a series of operation simulations using the Rubin Observatory scheduler have been generated to explore the various options for tuning observing parameters and prioritizations. We explore the impact of the various simulated LSST observing strategies on studying the solar system’s small body reservoirs. We examine which observing scenarios are best and review the important considerations for maximizing LSST solar system science. In general, most of the LSST cadence simulations produce variations of ±5% or less in our chosen key metrics, but a subset of the simulations significantly hinders science returns, with much larger losses in the discovery and light-curve metrics.
  4. Abstract
    Research in astronomy is undergoing a major paradigm shift, transformed by the advent of large, automated sky surveys into a data-rich field where multi-TB to PB-sized spatio-temporal data sets are commonplace. For example, the Legacy Survey of Space and Time (LSST) is about to begin delivering observations of >10^10 objects, including a database with >4 x 10^13 rows of time series data. This volume presents a challenge: how should a domain scientist with little experience in data management or distributed computing access data and perform analyses at PB-scale? We present a possible solution to this problem built on (adapted) industry standard tools and made accessible through web gateways. We have i) developed Astronomy eXtensions for Spark, AXS, a series of astronomy-specific modifications to Apache Spark allowing astronomers to tap into its computational scalability, ii) deployed datasets in AXS-queryable format in Amazon S3, leveraging its I/O scalability, iii) developed a deployment of Spark on Kubernetes with auto-scaling configurations requiring no end-user interaction, and iv) provided a web-accessible Jupyter notebook front end via JupyterHub, including a rich library of pre-installed common astronomical software (accessible at http://hub.dirac.institute). We use this system to enable the analysis of data from the Zwicky Transient Facility, presently the closest precursor survey to the LSST, and discuss initial results. To our knowledge, this is a first application of cloud-based scalable analytics to astronomical datasets approaching LSST-scale. The code is available at https://github.com/astronomy-commons.
  5. Abstract

    We present here the design, architecture, and first data release for the Solar System Notification Alert Processing System (SNAPS). SNAPS is a solar system broker that ingests alert data from all-sky surveys. At present, we ingest data from the Zwicky Transient Facility (ZTF) public survey, and we will ingest data from the forthcoming Legacy Survey of Space and Time (LSST) when it comes online. SNAPS is an official LSST downstream broker. In this paper we present the SNAPS design goals and requirements. We describe the details of our automatic pipeline processing in which the physical properties of asteroids are derived. We present SNAPShot1, our first data release, which contains 5,458,459 observations of 31,693 asteroids observed by ZTF from 2018 July to 2020 May. By comparing a number of derived properties for this ensemble to previously published results for overlapping objects we show that our automatic processing is highly reliable. We present a short list of science results, among many that will be enabled by our SNAPS catalog: (1) we demonstrate that there are no known asteroids with very short periods and high amplitudes, which clearly indicates that in general asteroids in the size range 0.3–20 km are strengthless; (2) we find no difference in the period distributions of Jupiter Trojan asteroids, implying that the L4 and L5 clouds have different shape distributions; and (3) we highlight several individual asteroids of interest. Finally, we describe future work for SNAPS and our ability to operate at LSST scale.

     