 Editors:
 Lee, Jonghyun; Darve, Eric F.; Kitanidis, Peter K.; Mahoney, Michael W.; Karpatne, Anuj; Farthing, Matthew W.; Hesser, Tyler
 Award ID(s):
 1835443
 Publication Date:
 NSFPAR ID:
 10308800
 Journal Name:
 Proceedings of the AAAI 2021 Spring Symposium on Combining Artificial Intelligence and Machine Learning with Physical Sciences
 ISSN:
 1613-0073
 Sponsoring Org:
 National Science Foundation
More Like this

Recent advances in computing algorithms and hardware have rekindled interest in developing high-accuracy, low-cost surrogate models for simulating physical systems. The idea is to replace expensive numerical integration of complex coupled partial differential equations at fine time scales performed on supercomputers, with machine-learned surrogates that efficiently and accurately forecast future system states using data sampled from the underlying system. One particularly popular technique being explored within the weather and climate modelling community is the echo state network (ESN), an attractive alternative to other well-known deep learning architectures. Using the classical Lorenz 63 system, and the three-tier multiscale Lorenz 96 system (Thornes T, Duben P, Palmer T. 2017 Q. J. R. Meteorol. Soc. 143, 897–908. (doi:10.1002/qj.2974)) as benchmarks, we realize that previously studied state-of-the-art ESNs operate in two distinct regimes, corresponding to low and high spectral radius (LSR/HSR) for the sparse, randomly generated, reservoir recurrence matrix. Using knowledge of the mathematical structure of the Lorenz systems along with systematic ablation and hyperparameter sensitivity analyses, we show that state-of-the-art LSR-ESNs reduce to a polynomial regression model which we call Domain-Driven Regularized Regression (D2R2). Interestingly, D2R2 is a generalization of the well-known SINDy algorithm (Brunton SL, Proctor JL, Kutz …

In many mechanistic medical, biological, physical, and engineered spatiotemporal dynamic models, the numerical solution of partial differential equations (PDEs), especially for diffusion, fluid flow and mechanical relaxation, can make simulations impractically slow. Biological models of tissues and organs often require the simultaneous calculation of the spatial variation of concentration of dozens of diffusing chemical species. One clinical example where rapid calculation of a diffusing field is of use is the estimation of oxygen gradients in the retina, based on imaging of the retinal vasculature, to guide surgical interventions in diabetic retinopathy. Furthermore, the ability to predict blood perfusion and oxygenation may one day guide clinical interventions in diverse settings, i.e., from stent placement in treating heart disease to BOLD fMRI interpretation in evaluating cognitive function (Xie et al., 2019; Lee et al., 2020). Since the quasi-steady-state solutions required for fast-diffusing chemical species like oxygen are particularly computationally costly, we consider the use of a neural network to provide an approximate solution to the steady-state diffusion equation. Machine learning surrogates, neural networks trained to provide approximate solutions to such complicated numerical problems, can often provide speedups of several orders of magnitude compared to direct calculation. Surrogates of PDEs could …

Obeid, I.; Selesnik, I.; Picone, J. (Ed.) The Neuronix high-performance computing cluster allows us to conduct extensive machine learning experiments on big data [1]. This heterogeneous cluster uses innovative scheduling technology, Slurm [2], that manages a network of CPUs and graphics processing units (GPUs). The GPU farm consists of a variety of processors ranging from low-end consumer-grade devices such as the Nvidia GTX 970 to higher-end devices such as the GeForce RTX 2080. These GPUs are essential to our research since they allow extremely compute-intensive deep learning tasks to be executed on massive data resources such as the TUH EEG Corpus [2]. We use TensorFlow [3] as the core machine learning library for our deep learning systems, and routinely employ multiple GPUs to accelerate the training process. Reproducible results are essential to machine learning research. Reproducibility in this context means the ability to replicate an existing experiment: performance metrics such as error rates should be identical and floating-point calculations should match closely. Three examples of ways we typically expect an experiment to be replicable are: (1) The same job run on the same processor should produce the same results each time it is run. (2) A job run on a CPU and GPU should produce …

Machine learning (ML) tools are able to learn relationships between the inputs and outputs of large complex systems directly from data. However, for time-varying systems, the predictive capabilities of ML tools degrade if the systems are no longer accurately represented by the data with which the ML models were trained. For complex systems, retraining is only possible if the changes are slow relative to the rate at which large numbers of new input-output training data can be non-invasively recorded. In this work, we present an approach to deep learning for time-varying systems that does not require retraining, but uses instead an adaptive feedback in the architecture of deep convolutional neural networks (CNN). The feedback is based only on available system output measurements and is applied in the encoded low-dimensional dense layers of the encoder-decoder CNNs. First, we develop an inverse model of a complex accelerator system to map output beam measurements to input beam distributions, while both the accelerator components and the unknown input beam distribution vary rapidly with time. We then demonstrate our method on experimental measurements of the input and output beam distributions of the HiRES ultrafast electron diffraction (UED) beam line at Lawrence Berkeley National Laboratory, …

Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of unknown chaotic dynamical systems. Short-term predictions of the state evolution and long-term predictions of the statistical patterns of the dynamics ("climate") can be produced by employing a feedback loop, whereby the model is trained to predict forward one time step, then the model output is used as input for multiple time steps. In the absence of mitigating techniques, however, this technique can result in artificially rapid error growth. In this article, we systematically examine the technique of adding noise to the ML model input during training to promote stability and improve prediction accuracy. Furthermore, we introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training. Our case study uses reservoir computing, a machine-learning method using recurrent neural networks, to predict the spatiotemporal chaotic Kuramoto-Sivashinsky equation. We find that reservoir computers trained with noise or with LMNT produce climate predictions that appear to be indefinitely stable and have a climate very similar to the true system, while reservoir computers trained without regularization are unstable. Compared with other …