null
(Ed.)
Python has become a widely used programming language for research, not only for small one-off analyses, but also for complex application pipelines running at supercomputer- scale. Modern parallel programming frameworks for Python present users with a more granular unit of management than traditional Unix processes and batch submissions: the Python function. We review the challenges involved in running native Python functions at scale, and present techniques for dynamically determining a minimal set of dependencies and for assembling a lightweight function monitor (LFM) that captures the software environment and manages resources at the granularity of single functions. We evaluate these techniques in a range of environ- ments, from campus cluster to supercomputer, and show that our advanced dependency management planning and dynamic re- source management methods provide superior performance and utilization relative to coarser-grained management approaches, achieving several-fold decrease in execution time for several large Python applications.
more »
« less
An official website of the United States government

