NSF PAR Search | NSF Public Access Repository

Object Proxy Patterns for Accelerating Distributed Applications

https://doi.org/10.1109/TPDS.2024.3511347

Pauloski, J Gregory; Hayot-Sasson, Valerie; Ward, Logan; Brace, Alexander; Bauer, André; Chard, Kyle; Foster, Ian (February 2025, IEEE Transactions on Parallel and Distributed Systems)

Free, publicly-accessible full text available February 1, 2026
Accelerating Function-Centric Applications by Discovering, Distributing, and Retaining Reusable Context in Workflow Systems

https://doi.org/10.1145/3625549.3658663

Phung, Thanh Son; Thomas, Colin; Ward, Logan; Chard, Kyle; Thain, Douglas (June 2024, ACM)

Workflow systems provide a convenient way for users to write large-scale applications by composing independent tasks into large graphs that can be executed concurrently on high-performance clusters. In many newer workflow systems, tasks are often expressed as a combination of function invocations in a high-level language. Because necessary code and data are not statically known prior to execution, they must be moved into the cluster at runtime. An obvious way of doing this is to translate function invocations into self-contained executable programs and run them as usual, but this brings a hefty performance penalty: a function invocation now needs to piggyback its context with extra code and data to a remote node, and the remote node needs to take extra time to reconstruct the invocation’s context before executing it, both detrimental to lightweight short-running functions. A better solution for workflow systems is to treat functions and invocations as first-class abstractions: subsequent invocations of the same function on a worker node should only pay for the cost of context setup once and reuse the context between different invocations. The remaining problems lie in discovering, distributing, and retaining the reusable context among workers. In this paper, we discuss the rationale and design requirement of these mechanisms to support context reuse, and implement them in TaskVine, a data-intensive distributed framework and execution engine. Our results from executing a large-scale neural network inference application and a molecular design application show that treating functions and invocations as first-class abstractions reduces the execution time of the applications by 94.5% and 26.9%, respectively.
Full Text Available
Accelerating multiscale electronic stopping power predictions with time-dependent density functional theory and machine learning

https://doi.org/10.1038/s41524-024-01374-8

Ward, Logan; Blaiszik, Ben; Lee, Cheng-Wei; Martin, Troy; Foster, Ian; Schleife, André (September 2024, npj Computational Materials)
Fine-grained accelerator partitioning for Machine Learning and Scientific Computing in Function as a Service Platform

https://doi.org/10.1145/3624062.3624238

Dhakal, Aditya; Raith, Philipp; Ward, Logan; Hong Enriquez, Rolando P.; Rattihalli, Gourav; Chard, Kyle; Foster, Ian; Milojicic, Dejan (November 2023, ACM)

Full Text Available
In-situ TEM investigation of void swelling in nickel under irradiation with analysis aided by computer vision

https://doi.org/10.1016/j.actamat.2023.119013

Chen, Wei-Ying; Mei, Zhi-Gang; Ward, Logan; Monsen, Brandon; Wen, Jianguo; Zaluzec, Nestor J.; Yacout, Abdellatif M.; Li, Meimei (August 2023, Acta Materialia)

Full Text Available
Foundry-ML - Software and Services to Simplify Access to Machine Learning Datasets in Materials Science

https://doi.org/10.21105/joss.05467

Schmidt, KJ; Scourtas, Aristana; Ward, Logan; Wangen, Steve; Schwarting, Marcus; Darling, Isaac; Truelove, Ethan; Ambadkar, Aadit; Bose, Ribhav; Katok, Zoa; et al. (January 2024, Journal of Open Source Software)

Full Text Available
Cloud Services Enable Efficient AI-Guided Simulation Workflows across Heterogeneous Resources

https://doi.org/10.1109/IPDPSW59300.2023.00018

Ward, Logan; Pauloski, J. Gregory; Hayot-Sasson, Valerie; Chard, Ryan; Babuji, Yadu; Sivaraman, Ganesh; Choudhury, Sutanay; Chard, Kyle; Thakur, Rajeev; Foster, Ian (May 2023, IEEE)
RADICAL-Pilot and Parsl: Executing Heterogeneous Workflows on HPC Platforms

https://doi.org/10.1109/WORKS56498.2022.00009

Alsaadi, Aymen; Ward, Logan; Merzky, Andre; Chard, Kyle; Foster, Ian; Jha, Shantenu; Turilli, Matteo (November 2022, 2022 IEEE/ACM Workshop on Workflows in Support of Large-Scale Science (WORKS))

Full Text Available
Not All Tasks Are Created Equal: Adaptive Resource Allocation for Heterogeneous Tasks in Dynamic Workflows

https://doi.org/10.1109/WORKS54523.2021.00008

Phung, Thanh Son; Ward, Logan; Chard, Kyle; Thain, Douglas (November 2021, WORKS Workshop on Workflows at Supercomputing)

Users running dynamic workflows in distributed systems usually have inadequate expertise to correctly size the allocation of resources (cores, memory, disk) to each task due to the difficulty in uncovering the obscure yet important correlation between tasks and their resource consumption. Thus, users typically pay little attention to this problem of allocation sizing and either simply apply an error-prone upper bound of resource allocation to all tasks, or delegate this responsibility to underlying distributed systems, resulting in substantial waste from allocated yet unused resources. In this paper, we will first show that tasks performing different work may have significantly different resource consumption. We will then show that exploiting the heterogeneity of tasks is a desirable way to reveal and predict the relationship between tasks and their resource consumption, reduce waste from resource misallocation, increase tasks' consumption efficiency, and incentivize users' cooperation. We have developed two info-aware allocation strategies capitalizing on this characteristic and will show their effectiveness through simulations on two modern applications with dynamic workflows and five synthetic datasets of resource consumption. Our results show that info-aware strategies can cut down up to 98.7% of the total waste incurred by a best-effort strategy, and increase the efficiency in resource consumption of each task on average anywhere up to 93.9%.
Full Text Available
Proxima: accelerating the integration of machine learning in atomistic simulations

https://doi.org/10.1145/3447818.3460370

Zamora, Yuliana; Ward, Logan; Sivaraman, Ganesh; Foster, Ian; Hoffmann, Henry (June 2021, International Conference on Supercomputing)
Full Text Available
