NSF PAR Search | NSF Public Access Repository

Concurrency-Informed Orchestration for Serverless Functions

Liu, Qichang; Cheng, Yue; Shen, Haiying; Wang, Ao; Balaji, Bharathan (March 2025, The ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2025))

Cold start delays are a major pain point for today’s FaaS (Function-as-a-Service) platforms. A widely used mitigation strategy is keeping recently invoked function containers alive in memory to enable warm starts with minimal overhead. This paper identifies new challenges that state-of-the-art FaaS keep-alive policies neglect. These challenges are caused by concurrent function invocations, a common FaaS workload behavior. First, concurrent requests present a trade-off between reusing busy containers (delayed warm starts) and cold-starting containers. Second, concurrent requests cause imbalanced evictions of containers that will be reused shortly thereafter. To tackle these challenges, we propose a novel serverless function container orchestration algorithm called CIDRE. CIDRE makes informed decisions to speculatively choose between a delayed warm start and a cold start under concurrency-driven function scaling. CIDRE uses both fine-grained container-level and coarse-grained concurrency information to make balanced eviction decisions. We evaluate CIDRE extensively using two production FaaS workloads. Results show that CIDRE reduces the cold start ratio and the average invocation overhead by up to 75.1% and 39.3%, respectively, compared to state-of-the-art function keep-alive policies.
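The trade-off named in this abstract, waiting for a busy container versus paying for a cold start, can be illustrated with a minimal sketch. This is not the CIDRE algorithm; it is a simple threshold rule written for illustration only. The `Container` type, the `est_remaining_ms` estimate, the `cold_start_ms` parameter, and the `choose_start` function are all hypothetical names that do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Container:
    """Hypothetical view of one kept-alive container for a function."""
    container_id: str
    busy: bool
    est_remaining_ms: float  # assumed estimate of time until the container is free


def choose_start(containers: List[Container],
                 cold_start_ms: float) -> Tuple[str, Optional[str]]:
    """Pick a start mode for one incoming request.

    Returns (decision, container_id), where decision is "warm",
    "delayed_warm", or "cold". Illustrative rule only, not CIDRE itself.
    """
    # An idle kept-alive container gives an immediate warm start.
    for c in containers:
        if not c.busy:
            return ("warm", c.container_id)

    # All containers are busy: compare the shortest expected wait
    # against the assumed cold-start latency.
    if containers:
        soonest = min(containers, key=lambda c: c.est_remaining_ms)
        if soonest.est_remaining_ms < cold_start_ms:
            return ("delayed_warm", soonest.container_id)

    # No container is expected to free up sooner than a cold start would finish.
    return ("cold", None)
```

With one busy container expected to free up in 50 ms and an assumed 200 ms cold start, this rule queues the request behind that container; if the expected wait were 500 ms, it would cold-start instead.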
Free, publicly-accessible full text available March 31, 2026
SION: Elastic Serverless Cloud Storage

Zhang, Jingyuan; Wang, Ao; Ma, Xiaolong; Carver, Benjamin; Newman, Nicholas; Anwar, Ali; Rupprecht, Lukas; Skourtis, Dimitrios; Tarasov, Vasily; Yan, Feng; et al (August 2023, International Conference on Very Large Data Bases (VLDB 2023))

Full Text Available
InfiniStore: Elastic Serverless Cloud Storage

https://doi.org/10.14778/3587136.3587139

Zhang, Jingyuan; Wang, Ao; Ma, Xiaolong; Carver, Benjamin; Newman, Nicholas John; Anwar, Ali; Rupprecht, Lukas; Tarasov, Vasily; Skourtis, Dimitrios; Yan, Feng; et al (March 2023, Proceedings of the VLDB Endowment)

Cloud object storage such as AWS S3 is cost-effective and highly elastic but relatively slow, while high-performance cloud storage such as AWS ElastiCache is expensive and provides limited elasticity. We present a new cloud storage service called ServerlessMemory, which stores data using the memory of serverless functions. ServerlessMemory employs a sliding-window-based memory management strategy, inspired by the garbage collection mechanisms used in programming languages, to effectively segregate hot/cold data, and provides fine-grained elasticity, good performance, and a pay-per-access cost model with extremely low cost. We then design and implement InfiniStore, a persistent and elastic cloud storage system, which seamlessly couples the function-based ServerlessMemory layer with a persistent, inexpensive cloud object store layer. InfiniStore enables durability despite function failures using a fast parallel recovery scheme built on the auto-scaling functionality of a FaaS (Function-as-a-Service) platform. We evaluate InfiniStore extensively using both microbenchmarking and two real-world applications. Results show that InfiniStore provides greater performance benefits than AWS ElastiCache and Anna for objects larger than 10 MB, and that InfiniStore achieves 26.25% and 97.24% tenant-side cost reduction compared to InfiniCache and ElastiCache, respectively.
Full Text Available
FaaSNet: Scalable and Fast Provisioning of Custom Serverless Container Runtimes at Alibaba Cloud Function Compute

Wang, Ao; Chang, Shuai; Tian, Huangshi; Wang, Hongqi; Yang, Haoran; Li, Huiba; Du, Rui; Cheng, Yue (July 2021, 2021 USENIX Annual Technical Conference (USENIX ATC 21))

Serverless computing, or Function-as-a-Service (FaaS), enables a new way of building and scaling applications by allowing users to deploy fine-grained functions while providing fully-managed resource provisioning and auto-scaling. Custom FaaS container support is gaining traction as it enables better control over OSes, versioning, and tooling for modernizing FaaS applications. However, providing rapid container provisioning introduces non-trivial challenges for FaaS providers, since container provisioning is costly, and real-world FaaS workloads exhibit highly dynamic patterns. In this paper, we design FaaSNet, a highly-scalable middleware system for accelerating FaaS container provisioning. FaaSNet is driven by the workload and infrastructure requirements of the FaaS platform at one of the world's largest cloud providers, Alibaba Cloud Function Compute. FaaSNet enables scalable container provisioning via a lightweight, adaptive function tree (FT) structure. FaaSNet uses an I/O efficient, on-demand fetching mechanism to further reduce provisioning costs at scale. We implement and integrate FaaSNet in Alibaba Cloud Function Compute. Evaluation results show that FaaSNet: (1) finishes provisioning 2,500 function containers on 1,000 virtual machines in 8.3 seconds, (2) scales 13.4× and 16.3× faster than Alibaba Cloud's current FaaS platform and a state-of-the-art P2P container registry (Kraken), respectively, and (3) sustains a bursty workload using 75.2% less time than an optimized baseline.
Full Text Available
Wukong: A Scalable and Locality-Enhanced Framework for Serverless Parallel Computing

https://doi.org/10.1145/3419111.3421286

Carver, Benjamin; Zhang, Jingyuan; Wang, Ao; Anwar, Ali; Wu, Panruo; Cheng, Yue (October 2020, ACM Symposium on Cloud Computing 2020 (SoCC '20))
Executing complex, burst-parallel, directed acyclic graph (DAG) jobs poses a major challenge for serverless execution frameworks, which must rapidly scale and schedule tasks at high throughput while minimizing data movement across tasks. We demonstrate that, for serverless parallel computations, decentralized scheduling enables scheduling to be distributed across Lambda executors that can schedule tasks in parallel, and brings multiple benefits, including enhanced data locality, reduced network I/Os, automatic resource elasticity, and improved cost effectiveness. We describe the implementation and deployment of our new serverless parallel framework, called Wukong, on AWS Lambda. We show that Wukong achieves near-ideal scalability, executes parallel computation jobs up to 68.17× faster, reduces network I/O by multiple orders of magnitude, and achieves 92.96% tenant-side cost savings compared to numpywren.
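As a rough illustration of executor-driven (decentralized) DAG scheduling, the sketch below has each finished task trigger its own downstream tasks rather than reporting back to a central scheduler. It is a single-process simulation under stated assumptions, not Wukong's implementation: the `run_dag` function, the `children` adjacency map, and the dependency counters are hypothetical, and no AWS Lambda calls are made.

```python
from typing import Dict, List


def run_dag(children: Dict[str, List[str]]) -> List[str]:
    """Run every task in a DAG and return the execution order.

    `children` maps each task name to the tasks that depend on it.
    Every task must appear as a key (use an empty list for leaf tasks).
    """
    # Count unmet dependencies for each task.
    pending = {task: 0 for task in children}
    for kids in children.values():
        for kid in kids:
            pending[kid] += 1

    # Roots are fixed before execution starts, because counters change later.
    roots = [task for task, count in pending.items() if count == 0]
    order: List[str] = []

    def execute(task: str) -> None:
        # Stand-in for the task's real work on an executor.
        order.append(task)
        # The finishing executor schedules successors itself:
        # a child runs as soon as its last dependency completes.
        for kid in children[task]:
            pending[kid] -= 1
            if pending[kid] == 0:
                execute(kid)

    for root in roots:
        execute(root)
    return order
```

For a diamond-shaped DAG where `a` feeds `b` and `c`, and both feed `d`, task `d` is launched by whichever of `b` or `c` finishes last, which is the property that removes the need for a central scheduling loop.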
Full Text Available
FaaSNet: Scalable and Fast Provisioning of Custom Serverless Container Runtimes at Alibaba Cloud Function Compute

Wang, Ao; Chang, Shuai; Tian, Huangshi; Wang, Hongqi; Yang, Haoran; Li, Huiba; Du, Rui; Cheng, Yue (January 2021, USENIX Annual Technical Conference)
Full Text Available
In Search of a Fast and Efficient Serverless DAG Engine

https://doi.org/10.1109/PDSW49588.2019.00005

Carver, Benjamin; Zhang, Jingyuan; Wang, Ao; Cheng, Yue (November 2019, 2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW))

Full Text Available
InfiniCache: Exploiting Ephemeral Serverless Functions to Build a Cost-Effective Memory Cache

Wang, Ao; Zhang, Jingyuan; Ma, Xiaolong; Anwar, Ali; Rupprecht, Lukas; Skourtis, Dimitrios; Tarasov, Vasily; Yan, Feng; Cheng, Yue (February 2020, 18th USENIX Conference on File and Storage Technologies)

Internet-scale web applications are becoming increasingly storage-intensive and rely heavily on in-memory object caching to attain required I/O performance. We argue that the emerging serverless computing paradigm provides a well-suited, cost-effective platform for object caching. We present InfiniCache, a first-of-its-kind in-memory object caching system that is completely built and deployed atop ephemeral serverless functions. InfiniCache exploits and orchestrates serverless functions' memory resources to enable elastic pay-per-use caching. InfiniCache's design combines erasure coding, intelligent billed duration control, and an efficient data backup mechanism to maximize data availability and cost-effectiveness while balancing the risk of losing cached state and performance. We implement InfiniCache on AWS Lambda and show that it: (1) achieves 31–96× tenant-side cost savings compared to AWS ElastiCache for a large-object-only production workload, (2) can effectively provide 95.4% data availability for each one-hour window, and (3) delivers performance comparable to that of a typical in-memory cache.
Full Text Available
InfiniCache: Exploiting Ephemeral Serverless Functions to Build a Cost-Effective Memory Cache

Wang, Ao; Zhang, Jingyuan; Ma, Xiaolong; Anwar, Ali; Rupprecht, Lukas; Skourtis, Dimitrios; Tarasov, Vasily; Yan, Feng; Cheng, Yue (February 2020, The 18th USENIX Conference on File and Storage Technologies (FAST 20))

Full Text Available
INFINICACHE: Exploiting Ephemeral Serverless Functions to Build a Cost-Effective Memory Cache

Wang, Ao; Zhang, Jingyuan; Ma, Xiaolong; Anwar, Ali; Rupprecht, Lukas; Skourtis, Dimitrios; Tarasov, Vasily; Yan, Feng; Cheng, Yue (February 2020, Proceedings of the 18th USENIX Conference on File and Storage Technologies (FAST ’20))

Full Text Available
