Hyperspecialized Compilation for Serverless Data Analytics

Spiegelberg, Leonhard; Kraska, Tim; Schwarzkopf, Malte

Serverless functions can be spun up in milliseconds and scaled out quickly, forming an ideal platform for quick, interactive parallel queries over large data sets. Modern databases use code generation to produce efficient physical plans, but compiling such a plan on each serverless function is costly: every millisecond spent executing on serverless functions multiplies in cost by the number of functions running. Existing serverless data science frameworks therefore generate and compile code on the client, which precludes specializing this code to patterns that may exist in the input data of individual serverless functions. This paper argues for exploring a trade-off space between one-off code generation on the client, and hyperspecialized compilation that generates bespoke code on each serverless function. Our preliminary experiments show that hyperspecialization outperforms client-based compilation on typical heterogeneous datasets in both cost and performance by 2–4×.

Award ID(s): 2039354

PAR ID: 10486258

Author(s) / Creator(s): Spiegelberg, Leonhard; Kraska, Tim; Schwarzkopf, Malte

Publisher / Repository: CEUR-WS

Date Published: 2023-08-28

Journal Name: Joint Proceedings of Workshops at the 49th International Conference on Very Large Data Bases (VLDB 2023)

Format(s): Medium: X

Location: Vancouver, Canada

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
