Combining Static and Dynamic Storage Management for Data Intensive Scientific Workflows

Hazekamp, Nicholas; Kremer-Herman, Nathaniel; Tovar, Benjamin; Meng, Haiyan; Choudhury, Olivia; Emrich, Scott; Thain, Douglas

doi:10.1109/TPDS.2017.2764897

Citation Details

Workflow management systems are widely used to express and execute highly parallel applications. For data-intensive workflows, storage can be the constraining resource: the number of tasks running at once must be artificially limited so that they do not overflow the space available in the filesystem. It is all too easy for a user to dispatch a workflow that consumes all available storage and disrupts every other user of the system. To address these issues, we present a three-tiered approach to workflow storage management: (1) a static analysis algorithm that analyzes the storage needs of a workflow before execution, giving a realistic prediction of success or failure; (2) an online storage management algorithm that accounts for the storage needed by future tasks to avoid deadlock at runtime; and (3) a task containment system that limits the storage consumption of individual tasks, enabling the strong guarantees of the static analysis and dynamic management algorithms. We demonstrate the application of these techniques on three complex workflows.
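
The first tier, static analysis, can be illustrated with a small sketch. The Python below is not the algorithm from the paper; it is a minimal illustration that assumes tasks run one at a time in a fixed order, that every file size is declared in advance, and that a file is deleted as soon as its last reader finishes. The names `Task` and `peak_storage` and the example workflow are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task record: files read, and files written with declared sizes."""
    name: str
    inputs: list[str]
    outputs: dict[str, int]  # file name -> declared size in bytes


def peak_storage(tasks: list[Task], initial_files: dict[str, int]) -> int:
    """Return the peak bytes on disk when tasks run sequentially in the given order.

    Assumption: a file is removed right after the last task that reads it,
    and files that no task reads (final results) are kept.
    """
    # Index of the last task that reads each file.
    last_reader = {}
    for i, task in enumerate(tasks):
        for f in task.inputs:
            last_reader[f] = i

    present = dict(initial_files)
    used = sum(present.values())
    peak = used

    for i, task in enumerate(tasks):
        missing = [f for f in task.inputs if f not in present]
        if missing:
            raise ValueError(f"{task.name} is missing inputs: {missing}")

        # Outputs are written while the inputs are still on disk.
        for f, size in task.outputs.items():
            present[f] = size
            used += size
        peak = max(peak, used)

        # Reclaim inputs that no later task needs.
        for f in task.inputs:
            if last_reader[f] == i and f in present:
                used -= present.pop(f)

    return peak


# Hypothetical four-task workflow with made-up sizes.
workflow = [
    Task("split", ["ref"], {"a": 40, "b": 40}),
    Task("align_a", ["a"], {"a.out": 10}),
    Task("align_b", ["b"], {"b.out": 10}),
    Task("merge", ["a.out", "b.out"], {"final": 15}),
]
peak = peak_storage(workflow, {"ref": 100})
fits_in_150 = peak <= 150
```

In this toy workflow the peak occurs while `split` runs, when the 100-byte input and both 40-byte outputs are on disk together, so a 150-byte filesystem would be rejected before any task is dispatched.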

Award ID(s): 1642409

PAR ID: 10047183

Author(s) / Creator(s): Hazekamp, Nicholas; Kremer-Herman, Nathaniel; Tovar, Benjamin; Meng, Haiyan; Choudhury, Olivia; Emrich, Scott; Thain, Douglas

Date Published: 2017-10-23

Journal Name: IEEE Transactions on Parallel and Distributed Systems

ISSN: 1045-9219

Page Range / eLocation ID: 1 to 1

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1109/TPDS.2017.2764897
