Abstract Neural networks (NNs) are increasingly used for data‐driven subgrid‐scale parameterizations in weather and climate models. While NNs are powerful tools for learning complex non‐linear relationships from data, there are several challenges in using them for parameterizations. Three of these challenges are (a) data imbalance related to learning rare, often large‐amplitude, samples; (b) uncertainty quantification (UQ) of the predictions to provide an accuracy indicator; and (c) generalization to other climates, for example, those with different radiative forcings. Here, we examine the performance of methods for addressing these challenges using NN‐based emulators of the Whole Atmosphere Community Climate Model (WACCM) physics‐based gravity wave (GW) parameterizations as a test case. WACCM has complex, state‐of‐the‐art parameterizations for orography‐, convection‐, and front‐driven GWs. Convection‐ and orography‐driven GWs have significant data imbalance due to the absence of convection or orography in most grid points. We address data imbalance using resampling and/or weighted loss functions, enabling the successful emulation of parameterizations for all three sources. We demonstrate that three UQ methods (Bayesian NNs, variational auto‐encoders, and dropouts) provide ensemble spreads that correspond to accuracy during testing, offering criteria for identifying when an NN gives inaccurate predictions. Finally, we show that the accuracy of these NNs decreases for a warmer climate (4 × CO2). However, their performance is significantly improved by applying transfer learning, for example, re‐training only one layer using ∼1% new data from the warmer climate. The findings of this study offer insights for developing reliable and generalizable data‐driven parameterizations for various processes, including (but not limited to) GWs.
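As a rough illustration of the weighted-loss idea mentioned in this abstract, the sketch below weights each sample by the inverse frequency of its target-amplitude bin, so that rare large-amplitude samples count more in a mean-squared error. The binning scheme, the function names `inverse_frequency_weights` and `weighted_mse`, and the synthetic data are all assumptions made for illustration; the abstract does not specify the weighting actually used.

```python
import numpy as np

def inverse_frequency_weights(targets, n_bins=10):
    """Weight each sample by the inverse of how populated its amplitude bin is.

    Hypothetical helper: assumes at least one non-zero target.
    """
    amplitude = np.abs(targets)
    edges = np.linspace(0.0, amplitude.max(), n_bins + 1)
    # Using only the interior edges gives bin indices in 0..n_bins-1.
    bins = np.digitize(amplitude, edges[1:-1])
    counts = np.bincount(bins, minlength=n_bins)
    weights = 1.0 / counts[bins]
    # Normalise so that the mean weight is 1.
    return weights / weights.mean()

def weighted_mse(pred, target, weights):
    """Mean-squared error with per-sample weights."""
    return float(np.mean(weights * (pred - target) ** 2))

# Synthetic imbalanced targets: 95 zeros and 5 large-amplitude samples.
targets = np.concatenate([np.zeros(95), np.full(5, 10.0)])
weights = inverse_frequency_weights(targets)

# A trivial predictor that always outputs zero misses every rare sample.
always_zero = np.zeros_like(targets)
plain = weighted_mse(always_zero, targets, np.ones_like(targets))
weighted = weighted_mse(always_zero, targets, weights)
```

With this weighting, the always-zero predictor is penalised more heavily than under a plain mean-squared error, which is the behaviour a weighted loss is meant to provide for imbalanced data.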
Explainable Offline‐Online Training of Neural Networks for Parameterizations: A 1D Gravity Wave‐QBO Testbed in the Small‐Data Regime
Abstract There are different strategies for training neural networks (NNs) as subgrid‐scale parameterizations. Here, we use a 1D model of the quasi‐biennial oscillation (QBO) and gravity wave (GW) parameterizations as testbeds. A 12‐layer convolutional NN that predicts GW forcings for given wind profiles, when trained offline in a big‐data regime (100‐year), produces realistic QBOs once coupled to the 1D model. In contrast, offline training of this NN in a small‐data regime (18‐month) yields unrealistic QBOs. However, online re‐training of just two layers of this NN using ensemble Kalman inversion and only time‐averaged QBO statistics leads to parameterizations that yield realistic QBOs. Fourier analysis of these three NNs' kernels suggests why/how re‐training works and reveals that these NNs primarily learn low‐pass, high‐pass, and a combination of band‐pass filters, potentially related to the local and non‐local dynamics in GW propagation and dissipation. These findings/strategies generally apply to data‐driven parameterizations of other climate processes.
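The Fourier analysis of kernels described in this abstract can be sketched as follows: take the magnitude spectrum of a 1D convolution kernel and label it by where the spectrum peaks. This is a minimal sketch under stated assumptions, not the authors' procedure; the function name `classify_kernel`, the peak-location rule, and the three hand-made kernels are hypothetical and stand in for trained NN kernels.

```python
import numpy as np

def classify_kernel(kernel, n_fft=64):
    """Label a 1D kernel by the location of its spectral magnitude peak."""
    # Zero-pad the kernel to n_fft points and take the real-FFT magnitude.
    spectrum = np.abs(np.fft.rfft(kernel, n=n_fft))
    peak = int(np.argmax(spectrum))
    last = spectrum.size - 1  # index of the highest resolved frequency
    if peak == 0:
        return "low-pass"
    if peak == last:
        return "high-pass"
    return "band-pass"

# Hand-made example kernels (assumptions, not kernels from a trained NN).
averaging = np.array([1.0, 2.0, 1.0]) / 4.0           # smoothing stencil
differencing = np.array([-1.0, 2.0, -1.0]) / 4.0      # second-difference stencil
band = np.array([1.0, 0.0, -2.0, 0.0, 1.0]) / 4.0     # peaks at mid frequencies

labels = [classify_kernel(k) for k in (averaging, differencing, band)]
```

Applied to the kernels of a trained convolutional layer, the same spectrum could be compared before and after re-training to see which frequency bands change, although a real analysis would likely inspect the full spectrum rather than only its peak.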
- PAR ID: 10558227
- Publisher / Repository: AGU
- Date Published:
- Journal Name: Geophysical Research Letters
- Volume: 51
- Issue: 2
- ISSN: 0094-8276
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
-
Abstract Subgrid‐scale processes, such as atmospheric gravity waves (GWs), play a pivotal role in shaping the Earth's climate but cannot be explicitly resolved in climate models due to limitations on resolution. Instead, subgrid‐scale parameterizations are used to capture their effects. Recently, machine learning (ML) has emerged as a promising approach to learn parameterizations. In this study, we explore uncertainties associated with an ML parameterization for atmospheric GWs. Focusing on the uncertainties in the training process (parametric uncertainty), we use an ensemble of neural networks to emulate an existing GW parameterization. We estimate both offline uncertainties in raw NN output and online uncertainties in climate model output, after the neural networks are coupled. We find that online parametric uncertainty contributes a significant source of uncertainty in climate model output that must be considered when introducing NN parameterizations. This uncertainty quantification provides valuable insights into the reliability and robustness of ML‐based GW parameterizations, thus advancing our understanding of their potential applications in climate modeling.
-
Yortsos, Yannis (Ed.) Abstract Transfer learning (TL), which enables neural networks (NNs) to generalize out-of-distribution via targeted re-training, is becoming a powerful tool in scientific machine learning (ML) applications such as weather/climate prediction and turbulence modeling. Effective TL requires knowing (1) how to re-train NNs? and (2) what physics are learned during TL? Here, we present novel analyses and a framework addressing (1)–(2) for a broad range of multi-scale, nonlinear, dynamical systems. Our approach combines spectral (e.g. Fourier) analyses of such systems with spectral analyses of convolutional NNs, revealing physical connections between the systems and what the NN learns (a combination of low-, high-, band-pass filters and Gabor filters). Integrating these analyses, we introduce a general framework that identifies the best re-training procedure for a given problem based on physics and NN theory. As a test case, we explain the physics of TL in subgrid-scale modeling of several setups of 2D turbulence. Furthermore, these analyses show that in these cases, the shallowest convolution layers are the best to re-train, which is consistent with our physics-guided framework but is against the common wisdom guiding TL in the ML literature. Our work provides a new avenue for optimal and explainable TL, and a step toward fully explainable NNs, for wide-ranging applications in science and engineering, such as climate change modeling.
-
Abstract We present single‐column gravity wave parameterizations (GWPs) that use machine learning to emulate non‐orographic gravity wave (GW) drag and demonstrate their ability to generalize out‐of‐sample. A set of artificial neural networks (ANNs) are trained to emulate the momentum forcing from a conventional GWP in an idealized climate model, given only one view of the annual cycle and one phase of the Quasi‐Biennial Oscillation (QBO). We investigate the sensitivity of offline and online performance to the choice of input variables and complexity of the ANN. When coupled with the model, moderately complex ANNs accurately generate full cycles of the QBO. When the model is forced with enhanced CO2, its climate response with the ANN matches that generated with the physics‐based GWP. That ANNs can accurately emulate an existing scheme and generalize to new regimes given limited data suggests the potential for developing GWPs from observational estimates of GW momentum transport.
-
Abstract Subgrid processes in global climate models are represented by parameterizations which are a major source of uncertainties in simulations of climate. In recent years, it has been suggested that machine‐learning (ML) parameterizations based on high‐resolution model output data could be superior to traditional parameterizations. Currently, both traditional and ML parameterizations of subgrid processes in the atmosphere are based on a single‐column approach, which only use information from single atmospheric columns. However, single‐column parameterizations might not be ideal since certain atmospheric phenomena, such as organized convective systems, can cross multiple grid boxes and involve slantwise circulations that are not purely vertical. Here we train neural networks (NNs) using non‐local inputs spanning over 3 × 3 columns of inputs. We find that including the non‐local inputs improves the offline prediction of a range of subgrid processes. The improvement is especially notable for subgrid momentum transport and for atmospheric conditions associated with mid‐latitude fronts and convective instability. Using an interpretability method, we find that the NN improvements partly rely on using the horizontal wind divergence, and we further show that including the divergence or vertical velocity as a separate input substantially improves offline performance. However, non‐local winds continue to be useful inputs for parameterizing subgrid momentum transport even when the vertical velocity is included as an input. Overall, our results imply that the use of non‐local variables and the vertical velocity as inputs could improve the performance of ML parameterizations, and the use of these inputs should be tested in online simulations in future work.

