- Award ID(s): 1757207
- NSF-PAR ID: 10315779
- Date Published:
- Journal Name: IEEE Journal of Photovoltaics
- ISSN: 2156-3381
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- The number of people diagnosed with advanced stages of kidney disease has been rising every year. Early detection and constant monitoring are the only minimally invasive means to prevent severe kidney damage or kidney failure. We propose a cost-effective machine learning-based testing system that can facilitate inexpensive yet accurate kidney health checks. Our proposed framework, which was developed into an iPhone application, uses a camera-based bio-sensor and state-of-the-art classical machine learning and deep learning techniques to predict the concentration of creatinine in the sample based on the colorimetric change in the test strip. The predicted creatinine concentration is then used to classify the severity of the kidney disease as healthy, intermediate, or critical. In this article, we focus on the effectiveness of machine learning models in translating the colorimetric reaction into a kidney health prediction. In this setting, we thoroughly evaluated our proposed models against state-of-the-art classical machine learning and deep learning approaches. Additionally, we executed a number of ablation studies to measure the performance of our model when trained with different meta-parameter choices. Our evaluation results indicate that our selective partitioned regression (SPR) model, using histogram-of-colors features and a histogram gradient boosted trees underlying estimator, exhibits much better overall prediction performance than state-of-the-art methods. Our initial study indicates that SPR can be an effective tool for detecting the severity of kidney disease using inexpensive lateral flow assay test strips and a smartphone-based application. Additional work is needed to verify the performance of the model in various settings.
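  A minimal sketch of the colorimetric-regression pipeline described above: color-histogram features extracted from a test-strip image feed a histogram gradient boosted trees regressor, and the predicted concentration is bucketed into a severity class. The feature layout, estimator configuration, and severity thresholds are illustrative assumptions, not the authors' exact SPR implementation.

  ```python
  import numpy as np
  from sklearn.ensemble import HistGradientBoostingRegressor

  def color_histogram(image_rgb: np.ndarray, bins: int = 16) -> np.ndarray:
      """Concatenate per-channel histograms of the test-strip image."""
      hists = [
          np.histogram(image_rgb[..., c], bins=bins, range=(0, 255), density=True)[0]
          for c in range(3)
      ]
      return np.concatenate(hists)

  def severity(creatinine_mg_dl: float) -> str:
      # Hypothetical cut-offs for illustration; the paper's clinical thresholds
      # are not given here.
      if creatinine_mg_dl < 1.2:
          return "healthy"
      return "intermediate" if creatinine_mg_dl < 4.0 else "critical"

  def train_and_predict(strips, concentrations, new_strip):
      """strips: list of RGB strip images; concentrations: measured creatinine values."""
      X = np.stack([color_histogram(img) for img in strips])
      model = HistGradientBoostingRegressor().fit(X, concentrations)
      pred = model.predict(color_histogram(new_strip)[None, :])[0]
      return pred, severity(pred)
  ```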
- Early run-time prediction of co-running independent applications prior to application integration is challenging on multi-core processors. One of the most notable causes is interference at the main memory subsystem, which results in significant degradation of application performance and response time compared to standalone execution. Currently available techniques for run-time prediction, such as traditional cycle-accurate simulations, are slow, and analytical models are inaccurate and time-consuming to build. By contrast, existing machine-learning-based approaches for run-time prediction simply do not account for interference. In this paper, we use a machine-learning-based approach to train a model that correlates performance data (instructions and hardware performance counters) for a set of benchmark applications between the standalone and interference scenarios. The trained model is then used to predict the run-time of co-running applications in interference scenarios. In general, there is no straightforward one-to-one correspondence between samples obtained from the standalone and interference scenarios due to the different run-times, i.e., execution speeds. To address this, we developed a simple yet effective sample alignment algorithm, which is a key component in transforming interference prediction into a machine learning problem. In addition, we systematically identify the subset of features that have the highest positive impact on model performance. Our approach is demonstrated to be effective, with an average run-time prediction error as low as 0.3% and 0.1% for two co-running applications.
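  A minimal sketch of one way to align performance-counter samples between standalone and interference runs: resample the interference trace onto the standalone run's progress grid, using cumulative retired instructions as the progress measure. This is an illustrative simplification under that assumption, not the paper's exact alignment algorithm.

  ```python
  import numpy as np

  def align_by_progress(standalone: np.ndarray, interference: np.ndarray) -> np.ndarray:
      """
      standalone, interference: arrays of shape (num_samples, num_counters),
      where column 0 is assumed to hold cumulative retired instructions.
      Returns the interference samples interpolated onto the standalone
      run's progress points, giving a one-to-one sample correspondence.
      """
      prog_s = standalone[:, 0] / standalone[-1, 0]      # progress in [0, 1]
      prog_i = interference[:, 0] / interference[-1, 0]
      aligned = np.empty((standalone.shape[0], interference.shape[1]))
      for col in range(interference.shape[1]):
          # Interpolate each counter of the interference run at the
          # standalone run's progress fractions.
          aligned[:, col] = np.interp(prog_s, prog_i, interference[:, col])
      return aligned
  ```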
- Energy-efficient scientific applications require insight into how high performance computing system features impact the applications' power and performance. This insight can result from the development of performance and power models. In this article, we use the modeling and prediction tool MuMMI (Multiple Metrics Modeling Infrastructure) and 10 machine learning methods to model and predict performance and power consumption and compare their prediction error rates. We use an algorithm-based fault-tolerant linear algebra code and a multilevel checkpointing fault-tolerant heat distribution code to conduct our modeling and prediction study on the Cray XC40 Theta and IBM BG/Q Mira at Argonne National Laboratory and the Intel Haswell cluster Shepard at Sandia National Laboratories. Our experimental results show that the prediction error rates in performance and power using MuMMI are less than 10% in most cases. By utilizing the models for runtime, node power, CPU power, and memory power, we identify the most significant performance counters for potential application optimizations, and we predict theoretical outcomes of those optimizations. Based on two collected datasets, we analyze and compare the prediction accuracy in performance and power consumption using MuMMI and the 10 machine learning methods.
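  A minimal sketch of comparing regression methods on runtime or power prediction error from performance-counter features, in the spirit of the comparison described above. The particular estimators and the percentage-error metric are assumptions for illustration; they are not MuMMI's models or the study's exact method set.

  ```python
  import numpy as np
  from sklearn.model_selection import cross_val_predict
  from sklearn.linear_model import Ridge
  from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
  from sklearn.svm import SVR

  def compare_models(X, y):
      """X: performance-counter features; y: measured runtime or power."""
      models = {
          "ridge": Ridge(),
          "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
          "gbrt": GradientBoostingRegressor(random_state=0),
          "svr": SVR(),
      }
      errors = {}
      for name, model in models.items():
          pred = cross_val_predict(model, X, y, cv=5)
          # Mean absolute percentage error of the cross-validated predictions.
          errors[name] = float(np.mean(np.abs(pred - y) / np.abs(y)) * 100)
      return errors
  ```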
- The mmWave WiGig frequency band can support high-throughput, low-latency emerging applications. In this context, accurate prediction of the channel gain enables seamless connectivity with user mobility via proactive handover and beamforming. Machine learning techniques have been widely adopted in the literature for mmWave channel prediction. However, the existing techniques assume that the indoor mmWave channel follows a stationary stochastic process. This paper demonstrates that indoor WiGig mmWave channels are non-stationary, with the channel's cumulative distribution function (CDF) changing with the user's spatio-temporal mobility. Specifically, we show significant differences in the empirical CDF of the channel gain based on the user's mobility stage, namely room entering, wandering, and exiting. Thus, the dynamic WiGig mmWave indoor channel suffers from concept drift that impedes the generalization ability of deep learning-based channel prediction models. Our results demonstrate that a state-of-the-art deep learning channel prediction model based on a hybrid convolutional neural network (CNN) and long short-term memory (LSTM) recurrent neural network suffers a deterioration in prediction accuracy of 11–68%, depending on the user's mobility stage and the model's training. To mitigate the negative effect of concept drift and improve the generalization ability of the channel prediction model, we develop a robust deep learning model based on an ensemble strategy. Our results show that the weight-average ensemble-based model maintains stable predictions and keeps the performance deterioration below 4%.
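  A minimal sketch of a weighted-average ensemble over several fitted channel predictors (e.g., CNN-LSTM models trained under different mobility conditions), as one way to realize the ensemble strategy described above. The inverse-validation-error weighting is an assumption for illustration, not the paper's exact formulation.

  ```python
  import numpy as np

  def inverse_error_weights(val_errors):
      """Lower validation error -> higher weight (hypothetical weighting rule)."""
      inv = 1.0 / np.asarray(val_errors, dtype=float)
      return inv / inv.sum()

  def ensemble_predict(models, weights, features):
      """
      models: list of fitted predictors, each exposing .predict(features).
      weights: per-model weights, e.g. from inverse_error_weights.
      Returns the weighted average of the individual channel-gain predictions.
      """
      w = np.asarray(weights, dtype=float)
      w = w / w.sum()                                   # normalize to sum to 1
      preds = np.stack([m.predict(features) for m in models], axis=0)
      return np.tensordot(w, preds, axes=1)             # weighted average
  ```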
- Operational networks commonly rely on machine learning models for many tasks, including detecting anomalies, inferring application performance, and forecasting demand. Yet model accuracy can degrade due to concept drift, whereby the relationship between the features and the target to be predicted changes. Mitigating concept drift is an essential part of operationalizing machine learning models in general, but it is of particular importance in networking's highly dynamic deployment environments. In this paper, we first characterize concept drift in a large cellular network for a major metropolitan area in the United States. We find that concept drift occurs across many important key performance indicators (KPIs), independently of the model, training set size, and time interval, thus necessitating practical approaches to detect, explain, and mitigate it. We then show that frequent model retraining with newly available data is not sufficient to mitigate concept drift and can even degrade model accuracy further. Finally, we develop a new methodology for concept drift mitigation, Local Error Approximation of Features (LEAF). LEAF works by detecting drift, explaining the features and time intervals that contribute the most to drift, and mitigating it using forgetting and over-sampling. We evaluate LEAF against industry-standard mitigation approaches (notably, periodic retraining) with more than four years of cellular KPI data. Our initial tests with a major cellular provider in the US show that LEAF consistently outperforms periodic and triggered retraining on complex, real-world data while reducing costly retraining operations.
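  A minimal sketch of the detect-then-mitigate pattern described above: flag drift when recent prediction errors diverge from a reference distribution, then rebuild the training set by forgetting old samples and over-sampling the drifted window before retraining. The KS-test detector and the sampling choices are illustrative assumptions, not the LEAF algorithm itself.

  ```python
  import numpy as np
  from scipy.stats import ks_2samp

  def drift_detected(ref_errors, recent_errors, alpha: float = 0.01) -> bool:
      """Flag drift when recent errors differ significantly from the reference."""
      return ks_2samp(ref_errors, recent_errors).pvalue < alpha

  def rebuild_training_set(X, y, drift_start, forget_before, oversample: int = 3):
      """
      Keep only samples after index `forget_before` (forgetting) and repeat the
      drifted window starting at `drift_start` `oversample` times (over-sampling)
      before retraining. Indices refer to chronologically ordered samples.
      """
      X_keep, y_keep = X[forget_before:], y[forget_before:]
      X_drift, y_drift = X[drift_start:], y[drift_start:]
      X_new = np.concatenate([X_keep] + [X_drift] * oversample)
      y_new = np.concatenate([y_keep] + [y_drift] * oversample)
      return X_new, y_new
  ```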