Exploring Function Granularity for Serverless Machine Learning Application with GPU Sharing

Hui, Xinning; Xu, Yuanchao; Shen, Xipeng

doi:10.1145/3711699

Citation Details

Recent years have witnessed increasing interest in machine learning (ML) inference on serverless computing due to its auto-scaling and cost-effective properties. However, one critical aspect, function granularity, has been largely overlooked, limiting the potential of serverless ML. This paper explores the impact of function granularity on serverless ML, revealing its important effects on the SLO hit rates and resource costs of serverless applications. It further proposes adaptive granularity as an approach to addressing the phenomenon that no single granularity fits all applications and situations. It explores three predictive models and presents programming tools and runtime extensions to facilitate the integration of adaptive granularity into existing serverless platforms. Experiments show that adaptive granularity produces up to a 29.2% improvement in SLO hit rates and up to a 24.6% reduction in resource costs over state-of-the-art serverless ML, which uses fixed granularity.

Award ID(s): 2312207

PAR ID: 10616557

Author(s) / Creator(s): Hui, Xinning; Xu, Yuanchao; Shen, Xipeng

Publisher / Repository: ACM

Date Published: 2025-03-06

Journal Name: Proceedings of the ACM on Measurement and Analysis of Computing Systems

Volume: 9

Issue: 1

ISSN: 2476-1249

Page Range / eLocation ID: 1 to 28

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Journal Article:
https://doi.org/10.1145/3711699
