HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling

Maiterth, Matthias (ORCID:000000018698460X); Brewer, Wesley H (ORCID:0000000236393956); Kuruvella, Jaya S (ORCID:0009000016520500); Dey, Arunavo (ORCID:000000032319586X); Islam, Tanzima Z (ORCID:0000000328775871); Kabir, Rashadul (ORCID:0009000208597238); Menear, Kevin (ORCID:0009000488362387); Duplyakin, Dmitry (ORCID:0000000151320168); Patki, Tapasya (ORCID:0000000325439688); Jones, Terry (ORCID:0000000321879707); Wang, Feiyi (ORCID:0000000200991559)

doi:10.1145/3731599.3767559

Citation Details

This content will become publicly available on November 15, 2026

HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling

Schedulers are critical for optimal resource utilization in high-performance computing. Traditional methods to evaluate schedulers are limited to post-deployment analysis, or simulators, which do not model associated infrastructure. In this work, we present the first-of-its-kind integration of scheduling and digital twins in HPC. This enables what-if studies to understand the impact of parameter configurations and scheduling decisions on the physical assets, even before deployment, or regarding changes not easily realizable in production. We (1) provide the first digital twin framework extended with scheduling capabilities, (2) integrate various top-tier HPC systems given their publicly available datasets, (3) implement extensions to integrate external scheduling simulators. Finally, we show how to (4) implement and evaluate incentive structures, as-well-as (5) evaluate machine learning based scheduling, in such novel digital-twin based meta-framework to prototype scheduling. Our work enables what-if scenarios of HPC systems to evaluate sustainability, and the impact on the simulated system. more »

Award ID(s):: 2443561

PAR ID:: 10655134

Author(s) / Creator(s):: Maiterth, Matthias; Brewer, Wesley H; Kuruvella, Jaya S; Dey, Arunavo; Islam, Tanzima Z; Kabir, Rashadul; Menear, Kevin; Duplyakin, Dmitry; Patki, Tapasya; Jones, Terry; Wang, Feiyi

Publisher / Repository:: ACM

Date Published:: 2025-11-15

Page Range / eLocation ID:: 1959 to 1969

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
This content will become publicly available on November 15, 2026
Conference Paper:
https://doi.org/10.1145/3731599.3767559

More Like this