Lightweight Function Monitors for Fine-Grained Management in Large Scale Python Applications

Shaffer, Tim; Li, Zhuozhao; Tovar, Ben; Babuji, Yadu; Dasso, TJ; Surma, Zoe; Chard, Kyle; Foster, Ian; Thain, Douglas

doi:10.1109/IPDPS49936.2021.00088

Citation Details

Python has become a widely used programming language for research, not only for small one-off analyses, but also for complex application pipelines running at supercomputer scale. Modern parallel programming frameworks for Python present users with a more granular unit of management than traditional Unix processes and batch submissions: the Python function. We review the challenges involved in running native Python functions at scale, and present techniques for dynamically determining a minimal set of dependencies and for assembling a lightweight function monitor (LFM) that captures the software environment and manages resources at the granularity of single functions. We evaluate these techniques in a range of environments, from campus cluster to supercomputer, and show that our advanced dependency management planning and dynamic resource management methods provide superior performance and utilization relative to coarser-grained management approaches, achieving a several-fold decrease in execution time for several large Python applications.
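
The idea of finding a minimal set of dependencies for a single function can be illustrated with a small sketch. This is not the authors' implementation: it assumes a purely static scan of a function's source using the standard `ast` module, whereas the abstract describes dynamic techniques. The names `top_level_imports` and `EXAMPLE_SOURCE` are hypothetical and exist only for this illustration.

```python
import ast
import textwrap
from typing import Set


def top_level_imports(source: str) -> Set[str]:
    """Return the top-level module names imported anywhere in `source`.

    Illustrative assumption: a static scan of import statements stands in
    for the dependency analysis described in the abstract.
    """
    tree = ast.parse(textwrap.dedent(source))
    modules: Set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # `import os.path` depends on the top-level package `os`.
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0); they refer to the local package.
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


# Hypothetical function source; it is only parsed, never executed,
# so numpy and scipy do not need to be installed.
EXAMPLE_SOURCE = '''
def analyze(path):
    import numpy as np
    from scipy.stats import norm
    import os.path
    return np.mean(norm.rvs(size=10))
'''

if __name__ == "__main__":
    print(sorted(top_level_imports(EXAMPLE_SOURCE)))
```

A scan like this misses modules loaded indirectly (for example through `importlib` or inside a dependency), which is one reason a static pass alone would likely be insufficient for real applications.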

Award ID(s): 1931348, 2004932, 2004894

NSF-PAR ID: 10295246

Author(s) / Creator(s): Shaffer, Tim; Li, Zhuozhao; Tovar, Ben; Babuji, Yadu; Dasso, TJ; Surma, Zoe; Chard, Kyle; Foster, Ian; Thain, Douglas

Date Published: 2021-05-01

Journal Name: IEEE International Parallel and Distributed Processing Symposium

Page Range / eLocation ID: 786 to 796

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1109/IPDPS49936.2021.00088
