Title: Building An Elastic Query Engine on Disaggregated Storage
We present operational experience running Snowflake, a cloud-based data warehousing system with SQL support similar to state-of-the-art databases. Snowflake's design is motivated by three goals: (1) compute and storage elasticity, (2) support for multi-tenancy, and (3) high performance. Over the last few years, Snowflake has grown to serve thousands of customers executing millions of queries on petabytes of data every day. We discuss Snowflake's design with a particular focus on ephemeral storage system design, query scheduling, elasticity, and efficient support for multi-tenancy. Using statistics collected during execution of 70 million queries over a 14-day period, our study highlights how recent changes in cloud infrastructure have altered many of the assumptions that guided the design and optimization of Snowflake, and outlines several interesting avenues of future research.
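
A recurring theme in the paper is that compute nodes treat their locally attached storage as an ephemeral cache over persistent object storage, with files assigned to nodes by consistent hashing so that most cache contents remain valid as the cluster resizes. Below is a minimal Python sketch of that assignment idea; the ring construction, worker names, and file names are illustrative assumptions, not Snowflake's implementation.

    # Minimal consistent-hashing sketch: map persistent-storage file names to
    # compute nodes so cached files mostly stay put when the cluster resizes.
    # Hypothetical names throughout; not Snowflake's code.
    import bisect
    import hashlib

    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    class ConsistentHashRing:
        def __init__(self, nodes, vnodes=64):
            # Each node gets `vnodes` points on the ring to smooth the load.
            self._ring = sorted(
                (_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
            )
            self._keys = [h for h, _ in self._ring]

        def node_for(self, file_name: str) -> str:
            # The first ring point clockwise of the file's hash owns the file.
            idx = bisect.bisect(self._keys, _hash(file_name)) % len(self._keys)
            return self._ring[idx][1]

    ring = ConsistentHashRing([f"worker-{i}" for i in range(4)])
    print(ring.node_for("table_17/part-000042.parquet"))

Because adding or removing a node reassigns only the keys adjacent to its ring points, most files keep hashing to the same worker across resizes, which is what keeps an ephemeral cache useful under elasticity.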
Award ID(s):
1704742
PAR ID:
10189403
Journal Name:
USENIX Symposium on Networked Systems Design and Implementation
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Serverless computing has gained attention due to its fine-grained provisioning, large-scale multi-tenancy, and on-demand scaling. However, it also forces applications to externalize state in remote storage, adding substantial overheads. To fix this "data shipping problem," we built Shredder, a low-latency multi-tenant cloud store that allows small units of computation to be performed directly within storage nodes. Storage tenants provide Shredder with JavaScript functions (or WebAssembly programs), which can interact directly with data without moving it over the network. The key challenge in Shredder is safely isolating thousands of tenant storage functions while minimizing data interaction costs. Shredder uses a unique approach where its data store and networking paths are implemented in native code to ensure performance, while isolated tenant functions interact with data using a V8-specific intermediate representation that avoids expensive cross-protection-domain calls and data copying. As a result, Shredder can execute 4 million remotely-invoked tenant functions per second spread over thousands of tenants with median and 99th-percentile response latencies of less than 50 μs and 500 μs, respectively. Our evaluation shows that Shredder achieves a 14% to 78% speedup over conventional remote storage when fetching items with just one to three data dependencies between them. We also demonstrate Shredder's effectiveness in accelerating data-intensive applications, including a k-hop query on social graphs that shows orders of magnitude gain.
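     The contrast Shredder targets is easiest to see in a sketch: resolving a k-hop graph query client-side costs one storage round trip per dependent key, while a storage-resident function walks the graph locally and ships back only the answer. A hedged Python illustration follows (Shredder's real tenant functions are JavaScript or WebAssembly, and this store interface is hypothetical):

        # One traversal, two placements: `get` models a remote round trip when
        # run client-side, and a cheap local lookup when run inside storage.
        GRAPH = {"alice": ["bob", "carol"], "bob": ["dave"], "carol": [], "dave": []}

        def khop(get, start, k):
            frontier, seen = {start}, {start}
            for _ in range(k):
                # Client-side, each get() is a network round trip; in-storage,
                # it is a local read.
                frontier = {n for v in frontier for n in get(v) if n not in seen}
                seen |= frontier
            return seen

        def storage_side_khop(start, k):
            # Runs inside the storage node: only the final set crosses the network.
            return khop(GRAPH.__getitem__, start, k)

        print(storage_side_khop("alice", 2))  # {'alice', 'bob', 'carol', 'dave'}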
  2. Cloud deployments now increasingly exploit Field-Programmable Gate Array (FPGA) accelerators as part of virtual instances. While cloud FPGAs are still essentially single-tenant, the growing demand for efficient hardware acceleration paves the way to FPGA multi-tenancy. It then becomes necessary to explore architectures, design flows, and resource management features that aim at exposing multi-tenant FPGAs to cloud users. In this article, we discuss a hardware/software architecture that supports provisioning space-shared FPGAs in Kernel-based Virtual Machine (KVM) clouds. The proposed hardware/software architecture introduces an FPGA organization that improves hardware consolidation and supports hardware elasticity with minimal data movement overhead. It also relies on VirtIO to decrease communication latency between hardware and software domains. Prototyping the proposed architecture with a Virtex UltraScale+ FPGA demonstrated near-specification maximum frequency for on-chip data movement and high throughput in virtual instance access to hardware accelerators. We demonstrate performance similar to single-tenant deployment while increasing FPGA utilization, which is one of the goals of virtualization. Overall, our FPGA design achieved about 2× higher maximum frequency than the state of the art and a bandwidth reaching up to 28 Gbps on a 32-bit data width.
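     As a quick sanity check on the headline bandwidth, sustaining 28 Gbps over a 32-bit datapath implies roughly 875 million transfers per second, consistent with the near-specification clock rates the article reports; the arithmetic below is ours, not the article's:

        # 28 Gbps over a 32-bit datapath -> implied transfer rate.
        width_bits = 32
        bandwidth_bps = 28e9
        print(f"{bandwidth_bps / width_bits / 1e6:.0f} M transfers/s")  # -> 875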
  3. In order to effectively provide INaaS (Inference-as-a-Service) for AI applications in resource-limited cloud environments, two major challenges must be overcome: achieving low latency and providing multi-tenancy. This paper presents EIF (Efficient INaaS Framework), which uses a heterogeneous CPU-FPGA architecture to provide three methods to address these challenges: (1) spatial multiplexing via software-hardware co-design virtualization techniques, (2) temporal multiplexing that exploits the sparsity of neural-net models, and (3) streaming-mode inference that overlaps data transfer and computation. The EIF prototype is implemented on an Intel PAC (shared-memory CPU-FPGA) platform. For evaluation, 12 types of DNN models with different sizes and sparsity were used as benchmarks. Based on these experiments, we show that in EIF, the temporal multiplexing technique can improve the user density of an AI Accelerator Unit from 2× to 6× with marginal performance degradation. In the prototype system, the spatial multiplexing technique supports eight AI Accelerator Units on one FPGA. By using a streaming mode based on a Mediated Pass-Through architecture, EIF can overcome the FPGA on-chip memory limitation to improve multi-tenancy and optimize the latency of INaaS. To further enhance INaaS, EIF utilizes the MapReduce function to provide more flexible QoS. Together with the temporal/spatial multiplexing techniques, EIF can support 48 users simultaneously on a single FPGA board in our prototype system. In all tested benchmarks, cold-start latency accounts for only approximately 5% of the total response time.
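     The 48-user figure follows from composing the two multiplexing techniques at their reported bounds; a one-line arithmetic sketch:

        # Capacity implied by the abstract's numbers (illustrative arithmetic).
        spatial_units = 8    # AI Accelerator Units per FPGA (spatial multiplexing)
        users_per_unit = 6   # upper bound of the 2x-6x temporal multiplexing gain
        print(spatial_units * users_per_unit)  # -> 48 concurrent users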
  4. Cloud storage services such as Amazon S3, Azure Blob Storage, and Google Cloud Storage are widely used to store raw data for machine learning applications. When the data is later processed, the analysis predominantly focuses on regions of interest (such as a small bounding box in a larger image) and discards the uninteresting regions. Machine learning applications can significantly accelerate their I/O if they push this data-filtering step to the cloud. Prior work has proposed different methods to partially read array (tensor) objects, such as chunking, reading a contiguous byte range, and evaluating a lambda function. No single method is optimal; estimating the total time and cost of a data retrieval requires an understanding of the data serialization order, the chunk size, and platform-specific properties. This paper introduces ArrayMorph, a cloud-based array data storage system that automatically determines the best method for retrieving regions of interest from data on the cloud. ArrayMorph formulates data accesses as hyperslab queries and optimizes them using a multi-phase, cost-based approach. ArrayMorph seamlessly integrates with Python/PyTorch-based ML applications and is experimentally shown to transfer up to 9.8X less data than existing systems. This makes ML applications run up to 1.7X faster and 9X cheaper than prior solutions.
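     The observation that no single method dominates is at heart a cost-model argument: each retrieval method trades request count against bytes moved. A hedged sketch of such a model follows; every price, latency, and per-method figure below is an illustrative assumption, not a number from ArrayMorph.

        # Toy cost model: pick the cheapest way to fetch a region of interest.
        # (requests, bytes moved) per method for a hypothetical 1 MB region
        # inside a 64 MB object; all constants are assumed, S3-like values.
        METHODS = {
            "full_object": (1, 64 * 2**20),   # one read of the whole object
            "byte_range":  (4, 4 * 2**20),    # a few contiguous range reads
            "chunked":     (16, 2 * 2**20),   # many small aligned chunks
        }
        LATENCY_S = 0.02        # per-request latency (assumed)
        BANDWIDTH_BPS = 100e6   # sustained transfer rate (assumed)
        PRICE_PER_REQ = 4e-7    # dollars per request (assumed)
        PRICE_PER_GB = 0.09     # dollars per GB transferred (assumed)

        def estimate(requests, nbytes):
            # Serial-request model for simplicity; a real optimizer would also
            # account for parallel requests and serialization order.
            time_s = requests * LATENCY_S + nbytes / BANDWIDTH_BPS
            cost = requests * PRICE_PER_REQ + nbytes / 2**30 * PRICE_PER_GB
            return time_s, cost

        for name, (reqs, nbytes) in METHODS.items():
            t, c = estimate(reqs, nbytes)
            print(f"{name:12s} {t:6.2f} s  ${c:.6f}")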
  5. Recent advancements in deep learning techniques facilitate intelligent-query support in diverse applications, such as content-based image retrieval and audio texturing. Unlike conventional key-based queries, these intelligent queries lack efficient indexing and require complex compute operations for feature matching. To achieve high-performance intelligent querying against massive datasets, modern computing systems employ GPUs in conjunction with solid-state drives (SSDs) for fast data access and parallel data processing. However, our characterization of various intelligent-query workloads built with deep neural networks (DNNs) shows that storage I/O bandwidth is still the major bottleneck, contributing 56%–90% of the query execution time. To this end, we present DeepStore, an in-storage accelerator architecture for intelligent queries. It consists of (1) energy-efficient in-storage accelerators designed specifically for supporting DNN-based intelligent queries under the resource constraints of modern SSD controllers; (2) a similarity-based in-storage query cache that exploits the temporal locality of user queries for further performance improvement; and (3) a lightweight in-storage runtime system working as the query engine, which provides a simple software abstraction to support different types of intelligent queries. DeepStore exploits SSD parallelism with design-space exploration to achieve maximal energy efficiency for the in-storage accelerators. We validate the DeepStore design with an SSD simulator and evaluate it with a variety of vision, text, and audio intelligent queries. Compared with the state-of-the-art GPU+SSD approach, DeepStore improves query performance by up to 17.7× and energy efficiency by up to 78.6×.
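     Of DeepStore's three components, the similarity-based query cache is the most self-contained to illustrate: a new query reuses a cached result when its embedding is close enough to one already answered. A minimal sketch, assuming cosine similarity and a fixed threshold (neither specified in the abstract):

        # Similarity-cache sketch: near-duplicate queries skip the full
        # DNN feature-matching pass. Vectors and threshold are illustrative.
        import math

        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb)

        class SimilarityCache:
            def __init__(self, threshold=0.95):
                self.threshold = threshold
                self.entries = []  # (query embedding, cached result)

            def lookup(self, emb):
                for cached_emb, result in self.entries:
                    if cosine(emb, cached_emb) >= self.threshold:
                        return result  # hit: reuse without recomputation
                return None

            def insert(self, emb, result):
                self.entries.append((emb, result))

        cache = SimilarityCache()
        cache.insert([0.10, 0.90, 0.20], "top-k results for query A")
        print(cache.lookup([0.11, 0.89, 0.21]))  # similar query -> cached result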