Towards Practical, Generalizable Machine-Learning Training Pipelines to build Regression Models for Predicting Application Resource Needs on HPC Systems

Vallabhajosyula, Manikya Swathi; Ramnath, Rajiv

doi:10.1145/3491418.3535172

This paper explores the potential for cost-effectively developing generalizable and scalable machine-learning-based regression models for predicting the approximate execution time of an HPC application given its input data and parameters. This work examines: (a) to what extent models can be trained on scaled-down datasets on commodity environments and adapted to production environments, (b) to what extent models built for specific applications can generalize to other applications within a family, and (c) how the most appropriate model may change based on the type of data and its mix. As part of this work, we also describe and show the use of an automatable pipeline for generating the necessary training data and building the model.

CCS Concepts: • Software and its engineering → Designing software; • Computing methodologies → Cost-sensitive learning.

Additional Key Words and Phrases: automated data generation, ML, execution time, model scalability, model transferabilit…

Award ID(s): 2018627

PAR ID: 10356456

Author(s) / Creator(s): Vallabhajosyula, Manikya Swathi; Ramnath, Rajiv

Date Published: 2022-07-08

Journal Name: Practice and Experience in Advanced Research Computing (PEARC ’22), July 10–14, 2022, Boston, MA, USA. ACM, New York, NY, USA

Page Range / eLocation ID: 1 to 5

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1145/3491418.3535172
