NSF PAR Search | NSF Public Access Repository

HPC Application Parameter Autotuning on Edge Devices: A Bandit Learning Approach

https://doi.org/10.1109/HiPC62374.2024.00011

Hossain, Abrar; Badawy, Abdel-Hameed A; Islam, Mohammad A; Patki, Tapasya; Ahmed, Kishwar (December 2024, IEEE)

The growing necessity for enhanced processing capabilities in edge devices with limited resources has led us to develop effective methods for improving high-performance computing (HPC) applications. In this paper, we introduce LASP (Lightweight Autotuning of Scientific Application Parameters), a novel strategy designed to address the parameter search space challenge in edge devices. Our strategy employs a multi-armed bandit (MAB) technique focused on online exploration and exploitation. Notably, LASP takes a dynamic approach, adapting seamlessly to changing environments. We tested LASP with four HPC applications: Lulesh, Kripke, Clomp, and Hypre. Its lightweight nature makes it particularly well-suited for resource-constrained edge devices. By employing the MAB framework to efficiently navigate the search space, we achieved significant performance improvements while adhering to the stringent computational limits of edge devices. Our experimental results demonstrate the effectiveness of LASP in optimizing parameter search on edge devices.
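The exploration/exploitation idea behind bandit-based autotuning can be illustrated with a minimal sketch. The abstract does not say which bandit algorithm LASP uses, so the UCB1 rule below is an assumption chosen for illustration, not the authors' method. The names `ucb1_autotune` and `simulated_runtime`, the `threads` parameter, and the runtime values are all hypothetical. The sketch also does not model the changing environments the abstract mentions.

```python
import math
import random


def ucb1_autotune(configs, measure_runtime, rounds, c=2.0):
    """Pick configurations with UCB1; the reward is the negative measured runtime.

    Assumption: runtimes are on a scale of roughly 0..1, so the exploration
    bonus is comparable to the reward differences. Rescale rewards otherwise.
    """
    counts = [0] * len(configs)
    mean_reward = [0.0] * len(configs)
    for t in range(1, rounds + 1):
        if t <= len(configs):
            arm = t - 1  # play every configuration once before using the bonus
        else:
            arm = max(
                range(len(configs)),
                key=lambda i: mean_reward[i] + math.sqrt(c * math.log(t) / counts[i]),
            )
        reward = -measure_runtime(configs[arm])
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        mean_reward[arm] += (reward - mean_reward[arm]) / counts[arm]
    best = max(range(len(configs)), key=lambda i: mean_reward[i])
    return configs[best], counts


def simulated_runtime(config):
    """Hypothetical stand-in for running an application and timing it."""
    base = {1: 1.0, 2: 0.7, 4: 0.4}[config["threads"]]
    return base + random.uniform(-0.05, 0.05)


random.seed(0)
configs = [{"threads": 1}, {"threads": 2}, {"threads": 4}]
best, counts = ucb1_autotune(configs, simulated_runtime, rounds=200)
```

Each round costs one application run and a constant amount of bookkeeping per configuration, which is the kind of low overhead that suits a resource-constrained device. In a real setting, `measure_runtime` would launch the application with the chosen parameters and return the observed time.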
Full Text Available
A Hands-On Approach to Teaching Parallel and Heterogeneous Computing

https://doi.org/10.1109/HiPCW63042.2024.00012

Abdurahman, Abubeker; Singh, Arihant; Hossain, Abrar; Ahmed, Kishwar (December 2024, IEEE)

Full Text Available
Scalable HPC Job Scheduling and Resource Management in SST

https://doi.org/10.1109/WSC63780.2024.10838714

Abdurahman, Abubeker; Hossain, Abrar; Brown, Kevin A; Yoshii, Kazutomo; Ahmed, Kishwar (December 2024, IEEE)

Full Text Available
Protein Corona Formation Prediction on Engineered Nanomaterials

https://doi.org/10.1109/eIT57321.2023.10187259

Ferry, Nicholas; Ahmed, Kishwar; Tasnim, Samia (May 2023, IEEE)
Market Mechanism-Based User-in-the-Loop Scalable Power Oversubscription for HPC Systems

https://doi.org/10.1109/HPCA56546.2023.10071006

Hossen, Md Rajib; Ahmed, Kishwar; Islam, Mohammad A. (February 2023, 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA))

Significant power consumption is one of the major challenges for current and future high-performance computing (HPC) systems. All the while, HPC systems generally remain power underutilized, making them a great candidate for applying power oversubscription to reclaim unused capacity. However, an oversubscribed HPC system may occasionally get overloaded. In this paper, we propose MPR (Market-based Power Reduction), a scalable market-based approach where users actively participate in reducing the HPC system’s power consumption to mitigate overloads. In MPR, HPC users bid to supply, in exchange for incentives, the resource reduction required for handling the overloads. Using several real-world trace-based simulations, we extensively evaluate MPR and show that, by participating in MPR, users always receive more rewards than the cost of performance loss. At the same time, the HPC manager enjoys orders of magnitude more resource gain than her incentive payoff to the users. We also demonstrate the real-world effectiveness of MPR on a prototype system.
Full Text Available
Practical Efficient Microservice Autoscaling with QoS Assurance

https://doi.org/10.1145/3502181.3531460

Hossen, Md Rajib; Islam, Mohammad A.; Ahmed, Kishwar (June 2022, HPDC '22: Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing)

Full Text Available
Parallel Application Power and Performance Prediction Modeling Using Simulation

https://doi.org/10.1109/WSC52266.2021.9715340

Ahmed, Kishwar; Yoshii, Kazutomo; Tasnim, Samia (December 2021, 2021 Winter Simulation Conference (WSC))

A high-performance computing (HPC) system runs compute-intensive parallel applications that require a large number of nodes. An HPC system consists of nodes with heterogeneous architectures, including CPUs, GPUs, field-programmable gate arrays (FPGAs), etc. Power capping is a method to improve parallel application performance subject to variable power constraints. In this paper, we propose a parallel application power and performance prediction simulator. We present a prediction model that predicts application power and performance for unknown power-capping values, taking the heterogeneous computing architecture into account. We develop a job scheduling simulator based on a parallel discrete-event simulation engine. The simulator includes a power and performance prediction model, as well as a resource allocation model. Based on real-life measurements and trace data, we show the applicability of our proposed prediction model and simulator.
Full Text Available
