Accurate prediction of parallel application performance in HPC systems is essential for efficient resource allocation and system design. Classical performance models estimate speedup from theoretical assumptions, but their applicability is limited by the difficulty of parameter estimation and data acquisition and by real-world system effects such as latency and network congestion. This paper describes performance prediction using classical performance models boosted by a trainable machine learning framework. Domain-informed machine-learning models estimate the overhead of an application for a given problem size and resource configuration as a coefficient applied to the speedup predicted by performance laws. We evaluate this approach on two HPC mini-applications and two full applications with varying patterns of computation and communication, and we also evaluate prediction accuracy on runs with varying processors-per-node configurations. Our results show that this method significantly improves the accuracy of performance predictions over standard analytical models and black-box regressors, while remaining robust even with limited training data.
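As a rough illustration of the approach this abstract describes, the sketch below (Python/scikit-learn, not from the paper) treats Amdahl's law as the analytical baseline and trains a regressor to predict the overhead coefficient that scales it. The choice of Amdahl's law, the regressor, and the training data are all illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def amdahl_speedup(p, serial_fraction):
    """Classical Amdahl's-law speedup on p processors."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

# Hypothetical training data: (problem_size, num_procs) -> observed overhead
# coefficient, i.e. measured_speedup / analytical_speedup.
X_train = np.array([[1e6, 2], [1e6, 4], [1e7, 8], [1e7, 16]])
y_train = np.array([0.97, 0.91, 0.84, 0.72])

overhead_model = GradientBoostingRegressor().fit(X_train, y_train)

def predict_speedup(problem_size, p, serial_fraction=0.05):
    # Learned overhead coefficient scales the analytical speedup.
    coeff = overhead_model.predict([[problem_size, p]])[0]
    return coeff * amdahl_speedup(p, serial_fraction)

print(predict_speedup(1e7, 16))
```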
Probabilistic Analysis of Solar Cell Performance Using Gaussian Processes
This article investigates the application of machine learning-based probabilistic prediction methodologies to estimate the performance of silicon-based solar cells. The concept of confidence-bound regions is introduced and its advantages are discussed in detail. The results show that the optical and electrical performance of a photovoltaic device can be accurately estimated using Gaussian processes, together with accurate estimates of the uncertainty in the predictions. It is also shown that cell design parameters can be estimated for a desired performance metric and that trained machine learning models can be deployed as a standalone application.
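A minimal sketch of Gaussian-process regression with confidence-bound regions, in the spirit of the article. The design parameter, performance values, and kernel choice are placeholders, not the paper's actual data or model.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Hypothetical data: one cell design parameter (arbitrary units)
# against a performance metric (e.g. efficiency, %).
X = np.array([[0.1], [0.3], [0.5], [0.7], [0.9]])
y = np.array([17.2, 18.9, 19.6, 19.1, 17.8])

kernel = ConstantKernel(1.0) * RBF(length_scale=0.2)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

X_test = np.linspace(0, 1, 50).reshape(-1, 1)
mean, std = gp.predict(X_test, return_std=True)

# 95% confidence-bound region around the predicted mean.
lower, upper = mean - 1.96 * std, mean + 1.96 * std
```

The predictive standard deviation comes for free with the GP posterior, which is what makes the confidence-bound regions possible without a separate uncertainty model.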
- Award ID(s): 1757207
- PAR ID: 10315779
- Date Published:
- Journal Name: IEEE Journal of Photovoltaics
- ISSN: 2156-3381
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- The number of people diagnosed with advanced stages of kidney disease has been rising every year. Early detection and constant monitoring are the only minimally invasive means to prevent severe kidney damage or kidney failure. We propose a cost-effective machine learning-based testing system that can facilitate inexpensive yet accurate kidney health checks. Our proposed framework, which was developed into an iPhone application, uses a camera-based bio-sensor and state-of-the-art classical machine learning and deep learning techniques to predict the concentration of creatinine in the sample from the colorimetric change in the test strip. The predicted creatinine concentration is then used to classify the severity of the kidney disease as healthy, intermediate, or critical. In this article, we focus on the effectiveness of machine learning models in translating the colorimetric reaction into a kidney health prediction. In this setting, we thoroughly evaluated our proposed models against state-of-the-art classical machine learning and deep learning approaches. Additionally, we executed a number of ablation studies to measure the performance of our model when trained with different meta-parameter choices. Our evaluation results indicate that our selective partitioned regression (SPR) model, using histogram-of-colors features and a histogram gradient boosted trees underlying estimator, exhibits much better overall prediction performance than state-of-the-art methods. Our initial study indicates that SPR can be an effective tool for detecting the severity of kidney disease using inexpensive lateral flow assay test strips and a smartphone-based application. Additional work is needed to verify the performance of the model in various settings.
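The abstract names histogram-of-colors features and a histogram gradient boosted trees estimator; the sketch below is a minimal, hypothetical version of that feature-plus-regressor pipeline. The image size, bin count, and placeholder data are assumptions, and the selective partitioning step of SPR is omitted.

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

def color_histogram_features(image, bins=8):
    """Concatenated, normalized per-channel color histograms for an
    RGB test-strip image (H x W x 3 uint8 array)."""
    feats = []
    for channel in range(3):
        hist, _ = np.histogram(image[..., channel], bins=bins, range=(0, 256))
        feats.append(hist / hist.sum())
    return np.concatenate(feats)

# Hypothetical dataset: strip images paired with lab-measured creatinine (mg/dL).
images = [np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8) for _ in range(20)]
creatinine = np.random.uniform(0.5, 4.0, size=20)

X = np.stack([color_histogram_features(img) for img in images])
model = HistGradientBoostingRegressor().fit(X, creatinine)
print(model.predict(X[:1]))
```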
- Early run-time prediction of co-running independent applications prior to application integration is challenging on multi-core processors. One of the most notable causes is interference at the main memory subsystem, which results in significant degradation in application performance and response time compared to standalone execution. Currently available techniques for run-time prediction, such as traditional cycle-accurate simulations, are slow, and analytical models are inaccurate and time-consuming to build. By contrast, existing machine-learning-based approaches for run-time prediction simply do not account for interference. In this paper, we use a machine-learning-based approach to train a model that correlates performance data (instructions and hardware performance counters) for a set of benchmark applications between the standalone and interference scenarios. The trained model is then used to predict the run-time of co-running applications in interference scenarios. In general, there is no straightforward one-to-one correspondence between samples obtained from the standalone and interference scenarios because of the different run-times, i.e., execution speeds. To address this, we developed a simple yet effective sample alignment algorithm, which is a key component in transforming interference prediction into a machine learning problem. In addition, we systematically identify the subset of features that have the highest positive impact on model performance. Our approach is demonstrated to be effective and shows an average run-time prediction error as low as 0.3% and 0.1% for two co-running applications.
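The abstract does not spell out its sample alignment algorithm; the sketch below shows one plausible way to align counter traces of different lengths onto a common normalized-progress axis, which is the kind of transformation described. The function name and the interpolation choice are assumptions.

```python
import numpy as np

def align_samples(progress_a, counters_a, progress_b, counters_b, n_points=100):
    """Resample two counter traces onto a shared normalized-progress grid
    (e.g. fraction of total instructions retired), so that sample i in one
    trace corresponds to the same execution progress in the other.
    Both progress arrays are assumed monotonically increasing."""
    grid = np.linspace(0.0, 1.0, n_points)
    aligned_a = np.interp(grid, progress_a / progress_a[-1], counters_a)
    aligned_b = np.interp(grid, progress_b / progress_b[-1], counters_b)
    return aligned_a, aligned_b
```

After alignment, each grid point pairs a standalone sample with its interference-scenario counterpart, turning the correlation task into an ordinary supervised-learning problem.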
- Energy-efficient scientific applications require insight into how high performance computing system features impact the applications' power and performance. This insight can result from the development of performance and power models. In this article, we use the modeling and prediction tool MuMMI (Multiple Metrics Modeling Infrastructure) and 10 machine learning methods to model and predict performance and power consumption and compare their prediction error rates. We use an algorithm-based fault-tolerant linear algebra code and a multilevel checkpointing fault-tolerant heat distribution code to conduct our modeling and prediction study on the Cray XC40 Theta and IBM BG/Q Mira at Argonne National Laboratory and the Intel Haswell cluster Shepard at Sandia National Laboratories. Our experimental results show that the prediction error rates in performance and power using MuMMI are less than 10% for most cases. By utilizing the models for runtime, node power, CPU power, and memory power, we identify the most significant performance counters for potential application optimizations, and we predict the theoretical outcomes of those optimizations. Based on two collected datasets, we analyze and compare the prediction accuracy in performance and power consumption using MuMMI and 10 machine learning methods.
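A hedged sketch of how several regressors might be compared on counter-based performance/power data, in the spirit of the 10-method comparison described above. The models, scorer, and synthetic data are illustrative, not the study's actual setup.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR

# Hypothetical features: hardware performance counters per run;
# target: measured node power (or runtime).
rng = np.random.default_rng(0)
X = rng.random((40, 6))                      # 40 runs, 6 counters
y = 50 + 30 * X[:, 0] + rng.normal(0, 1, 40)

for name, model in [("ridge", Ridge()),
                    ("random forest", RandomForestRegressor(n_estimators=100)),
                    ("svr", SVR())]:
    scores = cross_val_score(model, X, y,
                             scoring="neg_mean_absolute_percentage_error", cv=5)
    print(f"{name}: {-scores.mean():.1%} mean abs. percentage error")
```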
- The mmWave WiGig frequency band can support high-throughput and low-latency emerging applications. In this context, accurate prediction of channel gain enables seamless connectivity with user mobility via proactive handover and beamforming. Machine learning techniques have been widely adopted in the literature for mmWave channel prediction. However, existing techniques assume that the indoor mmWave channel follows a stationary stochastic process. This paper demonstrates that indoor WiGig mmWave channels are non-stationary, with the channel's cumulative distribution function (CDF) changing with the user's spatio-temporal mobility. Specifically, we show significant differences in the empirical CDF of the channel gain based on the user's mobility stage, namely, room entering, wandering, and exiting. Thus, the dynamic WiGig mmWave indoor channel suffers from concept drift that impedes the generalization ability of deep learning-based channel prediction models. Our results demonstrate that a state-of-the-art deep learning channel prediction model based on a hybrid convolutional neural network (CNN) long short-term memory (LSTM) recurrent neural network suffers a deterioration in prediction accuracy of 11–68%, depending on the user's mobility stage and the model's training. To mitigate the negative effect of concept drift and improve the generalization ability of the channel prediction model, we develop a robust deep learning model based on an ensemble strategy. Our results show that the weight-average ensemble-based model maintains stable predictions, keeping the performance deterioration below 4%.
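A minimal sketch of the weight-average ensemble idea: parameter tensors of several trained predictors (e.g. CNN-LSTM models trained under different mobility stages) are averaged coefficient-wise. The function, the placeholder tensors, and the uniform coefficients are assumptions; the abstract does not give the ensemble's actual details.

```python
import numpy as np

def weight_average(model_weights, coeffs=None):
    """Average matching parameter tensors across several trained models.
    model_weights: list (one entry per model) of lists of np.ndarray,
    with identical shapes layer-by-layer across models."""
    n = len(model_weights)
    coeffs = coeffs if coeffs is not None else [1.0 / n] * n
    # zip(*model_weights) groups the k-th tensor of every model together.
    return [sum(c * w for c, w in zip(coeffs, layer_group))
            for layer_group in zip(*model_weights)]

# Hypothetical: three models, each with two parameter tensors.
models = [[np.random.randn(4, 4), np.random.randn(4)] for _ in range(3)]
averaged = weight_average(models)
```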