NSF PAR Search | NSF Public Access Repository

ORION and the Three Rights: Sizing, Bundling, and Prewarming for Serverless DAGs

Mahgoub, Ashraf; Yi, Edgardo Barsallo; Shankar, Karthick; Elnikety, Sameh; Chaterji, Somali; Bagchi, Saurabh (July 2022, 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22))

Serverless applications represented as DAGs have been growing in popularity. For many of these applications, it would be useful to estimate the end-to-end (E2E) latency and to allocate resources to individual functions so as to meet probabilistic guarantees for the E2E latency. This goal has not been met till now due to three fundamental challenges. The first is the high variability and correlation in the execution time of individual functions, the second is the skew in execution times of the parallel invocations, and the third is the incidence of cold starts. In this paper, we introduce ORION to achieve this goal. We first analyze traces from a production FaaS infrastructure to identify three characteristics of serverless DAGs. We use these to motivate and design three features. The first is a performance model that accounts for runtime variabilities and dependencies among functions in a DAG. The second is a method for co-locating multiple parallel invocations within a single VM, thus mitigating content-based skew among these invocations. The third is a method for pre-warming VMs for subsequent functions in a DAG with the right look-ahead time. We integrate these three innovations and evaluate ORION on AWS Lambda with three serverless DAG applications. Our evaluation shows that compared to three competing approaches, ORION achieves up to 90% lower P95 latency without increasing $ cost, or up to 53% lower $ cost without increasing tail latency.
Full Text Available
ORION and the Three Rights: Sizing, Bundling, and Prewarming for Serverless DAGs

Mahgoub, Ashraf; Barsallo, Edgardo; Shankar, Karthick; Minocha, Eshaan; Elnikety, Sameh; Bagchi, Saurabh; Chaterji, Somali (July 2022, USENIX OSDI proceedings)

Serverless applications represented as DAGs have been growing in popularity. For many of these applications, it would be useful to estimate the end-to-end (E2E) latency and to allocate resources to individual functions so as to meet probabilistic guarantees for the E2E latency. This goal has not been met till now due to three fundamental challenges. The first is the high variability and correlation in the execution time of individual functions, the second is the skew in execution times of the parallel invocations, and the third is the incidence of cold starts. In this paper, we introduce ORION to achieve this goal. We first analyze traces from a production FaaS infrastructure to identify three characteristics of serverless DAGs. We use these to motivate and design three features. The first is a performance model that accounts for runtime variabilities and dependencies among functions in a DAG. The second is a method for co-locating multiple parallel invocations within a single VM, thus mitigating content-based skew among these invocations. The third is a method for pre-warming VMs for subsequent functions in a DAG with the right look-ahead time. We integrate these three innovations and evaluate ORION on AWS Lambda with three serverless DAG applications. Our evaluation shows that compared to three competing approaches, ORION achieves up to 90% lower P95 latency without increasing $ cost, or up to 53% lower $ cost without increasing P95 latency.
Full Text Available
WISEFUSE: Workload Characterization and DAG Transformation for Serverless Workflows

https://doi.org/10.1145/3489048.3530959

Mahgoub, Ashraf; Yi, Edgardo Barsallo; Shankar, Karthick; Minocha, Eshaan; Elnikety, Sameh; Bagchi, Saurabh; Chaterji, Somali (June 2022, ACM SIGMETRICS)

Full Text Available
SONIC: Application-aware data passing for chained serverless applications

Mahgoub, Ashraf; Shankar, Karthick; Mitra, Subrata; Klimovic, Ana; Chaterji, Somali; Bagchi, Saurabh (January 2021, USENIX Annual Technical Conference (USENIX ATC))
Full Text Available
JANUS: Benchmarking Commercial and Open-Source Cloud and Edge Platforms for Object and Anomaly Detection Workloads

https://doi.org/10.1109/CLOUD49709.2020.00088

Shankar, Karthick; Wang, Pengcheng; Xu, Ran; Mahgoub, Ashraf; Chaterji, Somali (October 2020, 2020 IEEE 13th International Conference on Cloud Computing (CLOUD))
Full Text Available
WISEFUSE: Workload Characterization and DAG Transformation for Serverless Workflows

https://doi.org/10.1145/3530892

Mahgoub, Ashraf; Yi, Edgardo Barsallo; Shankar, Karthick; Minocha, Eshaan; Elnikety, Sameh; Bagchi, Saurabh; Chaterji, Somali (June 2022, Proceedings of the ACM on Measurement and Analysis of Computing Systems)

We characterize production workloads of serverless DAGs at a major cloud provider. Our analysis highlights two major factors that limit performance: (a) lack of efficient communication methods between the serverless functions in the DAG, and (b) stragglers when a DAG stage invokes a set of parallel functions that must complete before starting the next DAG stage. To address these limitations, we propose WISEFUSE, an automated approach to generate an optimized execution plan for serverless DAGs for a user-specified latency objective or budget. We introduce three optimizations: (1) Fusion combines in-series functions together in a single VM to reduce the communication overhead between cascaded functions. (2) Bundling executes a group of parallel invocations of a function in one VM to improve resource sharing among the parallel workers to reduce skew. (3) Resource Allocation assigns the right VM size to each function or function bundle in the DAG to reduce the E2E latency and cost. We implement WISEFUSE to evaluate it experimentally using three popular serverless applications with different DAG structures, memory footprints, and intermediate data sizes. Compared to competing approaches and other alternatives, WISEFUSE shows significant improvements in E2E latency and cost. Specifically, for a machine learning pipeline, WISEFUSE achieves P95 latency that is 67% lower than Photons, 39% lower than Faastlane, and 90% lower than SONIC without increasing the cost.