Integrated hydrologic models solve coupled mathematical equations that represent natural processes, including groundwater, unsaturated, and overland flow. However, these models are computationally expensive. It has recently been shown that machine learning (ML), and deep learning (DL) in particular, can be used to emulate complex physical processes in the Earth system. In this study, we demonstrate how a DL model can emulate transient, three-dimensional integrated hydrologic model simulations at a fraction of the computational expense. This emulator is based on PredRNN, a DL model previously used for modeling video dynamics. The emulator is trained on the physical parameters used as inputs to the original model, such as hydraulic conductivity and topography, and produces spatially distributed outputs (e.g., pressure head) from which quantities such as streamflow and water table depth can be calculated. Simulation results from the emulator and ParFlow agree well, with average relative biases of 0.070, 0.092, and 0.032 for streamflow, water table depth, and total water storage, respectively. Moreover, the emulator is up to 42 times faster than ParFlow. Given this promising proof of concept, our results open the door to future applications of full hydrologic model emulation, particularly at larger scales.
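The agreement and speed figures quoted above can be illustrated with a minimal Python sketch. The abstract does not state how relative bias is defined, so the definition used here (summed difference divided by the summed reference) is an assumption, and the function names, streamflow values, and timings are hypothetical.

```python
def relative_bias(emulated, reference):
    """Relative bias of an emulated series against a reference series.

    Assumed definition (not taken from the abstract):
    sum(emulated - reference) / sum(reference).
    """
    if len(emulated) != len(reference):
        raise ValueError("series must have the same length")
    total_reference = sum(reference)
    if total_reference == 0:
        raise ValueError("reference series sums to zero")
    return sum(e - r for e, r in zip(emulated, reference)) / total_reference


def speedup(reference_seconds, emulator_seconds):
    """How many times faster the emulator is than the reference model."""
    return reference_seconds / emulator_seconds


# Hypothetical streamflow values, for illustration only.
reference_flow = [10.0, 12.0, 8.0, 10.0]   # reference model output
emulated_flow = [11.0, 12.5, 8.5, 10.0]    # emulator output

bias = relative_bias(emulated_flow, reference_flow)
factor = speedup(reference_seconds=84.0, emulator_seconds=2.0)
print(f"relative bias: {bias:.3f}, speedup: {factor:.0f}x")
```

A signed definition like this one lets over- and under-predictions cancel; an absolute-value variant would not, so the choice matters when comparing against reported figures.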
This content will become publicly available on May 5, 2023
A Framework for Deep Learning Emulation of Numerical Models With a Case Study in Satellite Remote Sensing
Numerical models based on physics represent the state of the art in Earth system modeling and comprise our best tools for generating insights and predictions. Despite rapid growth in computational power, the perceived need for higher model resolutions overwhelms the latest generation computers, reducing the ability of modelers to generate simulations for understanding parameter sensitivities and characterizing variability and uncertainty. Thus, surrogate models are often developed to capture the essential attributes of the full-blown numerical models. Recent successes of machine learning methods, especially deep learning (DL), across many disciplines offer the possibility that complex nonlinear connectionist representations may be able to capture the underlying complex structures and nonlinear processes in Earth systems. A difficult test for DL-based emulation, which refers to function approximation of numerical models, is to understand whether they can be comparable to traditional forms of surrogate models in terms of computational efficiency while simultaneously reproducing model results in a credible manner. A DL emulation that passes this test may be expected to perform even better than simple models with respect to capturing complex processes and spatiotemporal dependencies. Here, we examine, with a case study in satellite-based remote sensing, the hypothesis that DL approaches can credibly represent the …
 Award ID(s):
 1735505
 Publication Date:
 NSF-PAR ID:
 10336173
 Journal Name:
 IEEE Transactions on Neural Networks and Learning Systems
 Page Range or eLocationID:
 1 to 12
 ISSN:
 2162-237X
 Sponsoring Org:
 National Science Foundation
More Like this


Obeid, I. ; Selesnik, I. ; Picone, J. (Ed.) The Neuronix high-performance computing cluster allows us to conduct extensive machine learning experiments on big data [1]. This heterogeneous cluster uses innovative scheduling technology, Slurm [2], that manages a network of CPUs and graphics processing units (GPUs). The GPU farm consists of a variety of processors ranging from low-end consumer grade devices such as the Nvidia GTX 970 to higher-end devices such as the GeForce RTX 2080. These GPUs are essential to our research since they allow extremely compute-intensive deep learning tasks to be executed on massive data resources such as the TUH EEG Corpus [2]. We use TensorFlow [3] as the core machine learning library for our deep learning systems, and routinely employ multiple GPUs to accelerate the training process. Reproducible results are essential to machine learning research. Reproducibility in this context means the ability to replicate an existing experiment – performance metrics such as error rates should be identical and floating-point calculations should match closely. Three examples of ways we typically expect an experiment to be replicable are: (1) The same job run on the same processor should produce the same results each time it is run. (2) A job run on a CPU and GPU should produce …

Abstract. Land models are essential tools for understanding and predicting terrestrial processes and climate–carbon feedbacks in the Earth system, but uncertainties in their future projections are poorly understood. Improvements in physical process realism and the representation of human influence arguably make models more comparable to reality but also increase the degrees of freedom in model configuration, leading to increased parametric uncertainty in projections. In this work we design and implement a machine learning approach to globally calibrate a subset of the parameters of the Community Land Model, version 5 (CLM5) to observations of carbon and water fluxes. We focus on parameters controlling biophysical features such as surface energy balance, hydrology, and carbon uptake. We first use parameter sensitivity simulations and a combination of objective metrics including ranked global mean sensitivity to multiple output variables and nonoverlapping spatial pattern responses between parameters to narrow the parameter space and determine a subset of important CLM5 biophysical parameters for further analysis. Using a perturbed parameter ensemble, we then train a series of artificial feedforward neural networks to emulate CLM5 output given parameter values as input. We use annual mean globally aggregated spatial variability in carbon and water fluxes as our emulation and calibration …

Modern digital manufacturing processes, such as additive manufacturing, are cyber-physical in nature and utilize complex, process-specific simulations for both design and manufacturing. Although computational simulations can be used to optimize these complex processes, they can take hours or days, an unreasonable cost for engineering teams leveraging iterative design processes. Hence, more rapid computational methods are necessary in areas where computation time presents a limiting factor. When existing data from historical examples is plentiful and reliable, supervised machine learning can be used to create surrogate models that can be evaluated orders of magnitude more rapidly than comparable finite element approaches. However, for applications that necessitate computationally intensive simulations, even generating the training data necessary to train a supervised machine learning model can pose a significant barrier. Unsupervised methods, such as physics-informed neural networks, offer a shortcut in cases where training data is scarce or prohibitively expensive to generate. These novel neural networks are trained without the use of potentially expensive labels. Instead, physical principles are encoded directly into the loss function. This method substantially reduces the time required to develop a training dataset, while still achieving the evaluation speed that is typical of supervised machine learning surrogate models. We propose a new method for …
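The idea of encoding physical principles directly into the loss function, with no labeled training data, can be sketched on a toy problem. The Python example below is not the method proposed in the abstract above: as an assumption for illustration, it uses the equation du/dx = -u with u(0) = 1, a two-parameter trial solution u(x) = a * exp(theta * x) in place of a neural network, and a coarse grid search in place of gradient-based training. All names are hypothetical.

```python
import math


def physics_loss(a, theta, points):
    """Label-free loss for the toy problem du/dx = -u with u(0) = 1.

    The trial solution u(x) = a * exp(theta * x) stands in for a neural
    network; no reference solution values are used anywhere.
    """
    squared_residuals = []
    for x in points:
        u = a * math.exp(theta * x)
        du_dx = theta * u                        # analytic derivative of the trial solution
        squared_residuals.append((du_dx + u) ** 2)  # residual of du/dx + u = 0
    equation_term = sum(squared_residuals) / len(squared_residuals)
    boundary_term = (a * math.exp(theta * 0.0) - 1.0) ** 2  # enforces u(0) = 1
    return equation_term + boundary_term


# Collocation points where the governing equation is enforced.
points = [i / 10 for i in range(11)]             # 0.0, 0.1, ..., 1.0

# Coarse grid search over the two trial parameters (a stand-in for training).
candidates = [(i / 10, -j / 10) for i in range(5, 16) for j in range(21)]
best = min(candidates, key=lambda p: physics_loss(p[0], p[1], points))
print("best (a, theta):", best)
```

The search recovers a = 1 and theta = -1, that is, u(x) = exp(-x), using only the governing equation and the boundary condition. A physics-informed neural network applies the same kind of loss to a network's output and its derivatives.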

Due to the increasing volume of measurements in smart grids, surrogate-based learning approaches for modeling power grids are becoming popular. This paper uses regression-based models to find the unknown state variables of power systems. Generally, to determine these states, nonlinear systems of power flow equations are solved iteratively. This study treats the power flow problem as a data-driven model. The state variables, i.e., voltage magnitudes and phase angles, are then obtained using machine-learning-based approaches, namely, Extreme Learning Machine (ELM), Gaussian Process Regression (GPR), and Support Vector Regression (SVR). Several simulations are performed on the IEEE 14-bus and 30-bus test systems to validate the surrogate-based learning models. Moreover, the input data was modified with noise to simulate measurement errors. Numerical results showed that all three models can find state variables reasonably well even with measurement noise.