skip to main content


This content will become publicly available on February 7, 2025

Title: Climate-invariant machine learning

Projecting climate change is a generalization problem: We extrapolate the recent past using physical models across past, present, and future climates. Current climate models require representations of processes that occur at scales smaller than model grid size, which have been the main source of model projection uncertainty. Recent machine learning (ML) algorithms hold promise to improve such process representations but tend to extrapolate poorly to climate regimes that they were not trained on. To get the best of the physical and statistical worlds, we propose a framework, termed “climate-invariant” ML, incorporating knowledge of climate processes into ML algorithms, and show that it can maintain high offline accuracy across a wide range of climate conditions and configurations in three distinct atmospheric models. Our results suggest that explicitly incorporating physical knowledge into data-driven models of Earth system processes can improve their consistency, data efficiency, and generalizability across climate regimes.

 
more » « less
Award ID(s):
1936810
NSF-PAR ID:
10493600
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ; ; ;
Publisher / Repository:
American Association for the Advancement of Science
Date Published:
Journal Name:
Science Advances
Volume:
10
Issue:
6
ISSN:
2375-2548
Page Range / eLocation ID:
eadj7250
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract. Agricultural nitrous oxide (N2O) emission accounts for a non-trivialfraction of global greenhouse gas (GHG) budget. To date, estimatingN2O fluxes from cropland remains a challenging task because the relatedmicrobial processes (e.g., nitrification and denitrification) are controlledby complex interactions among climate, soil, plant and human activities.Existing approaches such as process-based (PB) models have well-knownlimitations due to insufficient representations of the processes oruncertainties of model parameters, and due to leverage recent advances inmachine learning (ML) a new method is needed to unlock the “black box” toovercome its limitations such as low interpretability, out-of-sample failureand massive data demand. In this study, we developed a first-of-its-kindknowledge-guided machine learning model for agroecosystems (KGML-ag) byincorporating biogeophysical and chemical domain knowledge from an advanced PBmodel, ecosys, and tested it by comparing simulating daily N2O fluxes withreal observed data from mesocosm experiments. The gated recurrent unit (GRU)was used as the basis to build the model structure. To optimize the modelperformance, we have investigated a range of ideas, including (1) usinginitial values of intermediate variables (IMVs) instead of time series asmodel input to reduce data demand; (2) building hierarchical structures toexplicitly estimate IMVs for further N2O prediction; (3) using multi-tasklearning to balance the simultaneous training on multiple variables; and (4)pre-training with millions of synthetic data generated from ecosys and fine-tuningwith mesocosm observations. Six other pure ML models were developed usingthe same mesocosm data to serve as the benchmark for the KGML-ag model.Results show that KGML-ag did an excellent job in reproducing the mesocosmN2O fluxes (overall r2=0.81, and RMSE=3.6 mgNm-2d-1from cross validation). Importantly, KGML-ag always outperformsthe PB model and ML models in predicting N2O fluxes, especially forcomplex temporal dynamics and emission peaks. Besides, KGML-ag goes beyondthe pure ML models by providing more interpretable predictions as well aspinpointing desired new knowledge and data to further empower the currentKGML-ag. We believe the KGML-ag development in this study will stimulate anew body of research on interpretable ML for biogeochemistry and otherrelated geoscience processes. 
    more » « less
  2. Summary

    Earth system models must predict forest responses to global change in order to simulate future global climate, hydrology, and ecosystem dynamics. These models are increasingly adopting vegetation demographic approaches that explicitly represent tree growth, mortality, and recruitment, enabling advances in the projection of forest vulnerability and resilience, as well as evaluation with field data. To date, simulation of regeneration processes has received far less attention than simulation of processes that affect growth and mortality, in spite of their critical role maintaining forest structure, facilitating turnover in forest composition over space and time, enabling recovery from disturbance, and regulating climate‐driven range shifts. Our critical review of regeneration process representations within current Earth system vegetation demographic models reveals the need to improve parameter values and algorithms for reproductive allocation, dispersal, seed survival and germination, environmental filtering in the seedling layer, and tree regeneration strategies adapted to wind, fire, and anthropogenic disturbance regimes. These improvements require synthesis of existing data, specific field data‐collection protocols, and novel model algorithms compatible with global‐scale simulations. Vegetation demographic models offer the opportunity to more fully integrate ecological understanding into Earth system prediction; regeneration processes need to be a critical part of the effort.

     
    more » « less
  3. null (Ed.)
    Physics-based models are often used to study engineering and environmental systems. The ability to model these systems is the key to achieving our future environmental sustainability and improving the quality of human life. This article focuses on simulating lake water temperature, which is critical for understanding the impact of changing climate on aquatic ecosystems and assisting in aquatic resource management decisions. General Lake Model (GLM) is a state-of-the-art physics-based model used for addressing such problems. However, like other physics-based models used for studying scientific and engineering systems, it has several well-known limitations due to simplified representations of the physical processes being modeled or challenges in selecting appropriate parameters. While state-of-the-art machine learning models can sometimes outperform physics-based models given ample amount of training data, they can produce results that are physically inconsistent. This article proposes a physics-guided recurrent neural network model (PGRNN) that combines RNNs and physics-based models to leverage their complementary strengths and improves the modeling of physical processes. Specifically, we show that a PGRNN can improve prediction accuracy over that of physics-based models (by over 20% even with very little training data), while generating outputs consistent with physical laws. An important aspect of our PGRNN approach lies in its ability to incorporate the knowledge encoded in physics-based models. This allows training the PGRNN model using very few true observed data while also ensuring high prediction accuracy. Although we present and evaluate this methodology in the context of modeling the dynamics of temperature in lakes, it is applicable more widely to a range of scientific and engineering disciplines where physics-based (also known as mechanistic) models are used. 
    more » « less
  4. INTRODUCTION Solving quantum many-body problems, such as finding ground states of quantum systems, has far-reaching consequences for physics, materials science, and chemistry. Classical computers have facilitated many profound advances in science and technology, but they often struggle to solve such problems. Scalable, fault-tolerant quantum computers will be able to solve a broad array of quantum problems but are unlikely to be available for years to come. Meanwhile, how can we best exploit our powerful classical computers to advance our understanding of complex quantum systems? Recently, classical machine learning (ML) techniques have been adapted to investigate problems in quantum many-body physics. So far, these approaches are mostly heuristic, reflecting the general paucity of rigorous theory in ML. Although they have been shown to be effective in some intermediate-size experiments, these methods are generally not backed by convincing theoretical arguments to ensure good performance. RATIONALE A central question is whether classical ML algorithms can provably outperform non-ML algorithms in challenging quantum many-body problems. We provide a concrete answer by devising and analyzing classical ML algorithms for predicting the properties of ground states of quantum systems. We prove that these ML algorithms can efficiently and accurately predict ground-state properties of gapped local Hamiltonians, after learning from data obtained by measuring other ground states in the same quantum phase of matter. Furthermore, under a widely accepted complexity-theoretic conjecture, we prove that no efficient classical algorithm that does not learn from data can achieve the same prediction guarantee. By generalizing from experimental data, ML algorithms can solve quantum many-body problems that could not be solved efficiently without access to experimental data. RESULTS We consider a family of gapped local quantum Hamiltonians, where the Hamiltonian H ( x ) depends smoothly on m parameters (denoted by x ). The ML algorithm learns from a set of training data consisting of sampled values of x , each accompanied by a classical representation of the ground state of H ( x ). These training data could be obtained from either classical simulations or quantum experiments. During the prediction phase, the ML algorithm predicts a classical representation of ground states for Hamiltonians different from those in the training data; ground-state properties can then be estimated using the predicted classical representation. Specifically, our classical ML algorithm predicts expectation values of products of local observables in the ground state, with a small error when averaged over the value of x . The run time of the algorithm and the amount of training data required both scale polynomially in m and linearly in the size of the quantum system. Our proof of this result builds on recent developments in quantum information theory, computational learning theory, and condensed matter theory. Furthermore, under the widely accepted conjecture that nondeterministic polynomial-time (NP)–complete problems cannot be solved in randomized polynomial time, we prove that no polynomial-time classical algorithm that does not learn from data can match the prediction performance achieved by the ML algorithm. In a related contribution using similar proof techniques, we show that classical ML algorithms can efficiently learn how to classify quantum phases of matter. In this scenario, the training data consist of classical representations of quantum states, where each state carries a label indicating whether it belongs to phase A or phase B . The ML algorithm then predicts the phase label for quantum states that were not encountered during training. The classical ML algorithm not only classifies phases accurately, but also constructs an explicit classifying function. Numerical experiments verify that our proposed ML algorithms work well in a variety of scenarios, including Rydberg atom systems, two-dimensional random Heisenberg models, symmetry-protected topological phases, and topologically ordered phases. CONCLUSION We have rigorously established that classical ML algorithms, informed by data collected in physical experiments, can effectively address some quantum many-body problems. These rigorous results boost our hopes that classical ML trained on experimental data can solve practical problems in chemistry and materials science that would be too hard to solve using classical processing alone. Our arguments build on the concept of a succinct classical representation of quantum states derived from randomized Pauli measurements. Although some quantum devices lack the local control needed to perform such measurements, we expect that other classical representations could be exploited by classical ML with similarly powerful results. How can we make use of accessible measurement data to predict properties reliably? Answering such questions will expand the reach of near-term quantum platforms. Classical algorithms for quantum many-body problems. Classical ML algorithms learn from training data, obtained from either classical simulations or quantum experiments. Then, the ML algorithm produces a classical representation for the ground state of a physical system that was not encountered during training. Classical algorithms that do not learn from data may require substantially longer computation time to achieve the same task. 
    more » « less
  5. null (Ed.)
    Recent advances in computing algorithms and hardware have rekindled interest in developing high-accuracy, low-cost surrogate models for simulating physical systems. The idea is to replace expensive numerical integration of complex coupled partial differential equations at fine time scales performed on supercomputers, with machine-learned surrogates that efficiently and accurately forecast future system states using data sampled from the underlying system. One particularly popular technique being explored within the weather and climate modelling community is the echo state network (ESN), an attractive alternative to other well-known deep learning architectures. Using the classical Lorenz 63 system, and the three tier multi-scale Lorenz 96 system (Thornes T, Duben P, Palmer T. 2017 Q. J. R. Meteorol. Soc. 143 , 897–908. ( doi:10.1002/qj.2974 )) as benchmarks, we realize that previously studied state-of-the-art ESNs operate in two distinct regimes, corresponding to low and high spectral radius (LSR/HSR) for the sparse, randomly generated, reservoir recurrence matrix. Using knowledge of the mathematical structure of the Lorenz systems along with systematic ablation and hyperparameter sensitivity analyses, we show that state-of-the-art LSR-ESNs reduce to a polynomial regression model which we call Domain-Driven Regularized Regression (D2R2). Interestingly, D2R2 is a generalization of the well-known SINDy algorithm (Brunton SL, Proctor JL, Kutz JN. 2016 Proc. Natl Acad. Sci. USA 113 , 3932–3937. ( doi:10.1073/pnas.1517384113 )). We also show experimentally that LSR-ESNs (Chattopadhyay A, Hassanzadeh P, Subramanian D. 2019 ( http://arxiv.org/abs/1906.08829 )) outperform HSR ESNs (Pathak J, Hunt B, Girvan M, Lu Z, Ott E. 2018 Phys. Rev. Lett. 120 , 024102. ( doi:10.1103/PhysRevLett.120.024102 )) while D2R2 dominates both approaches. A significant goal in constructing surrogates is to cope with barriers to scaling in weather prediction and simulation of dynamical systems that are imposed by time and energy consumption in supercomputers. Inexact computing has emerged as a novel approach to helping with scaling. In this paper, we evaluate the performance of three models (LSR-ESN, HSR-ESN and D2R2) by varying the precision or word size of the computation as our inexactness-controlling parameter. For precisions of 64, 32 and 16 bits, we show that, surprisingly, the least expensive D2R2 method yields the most robust results and the greatest savings compared to ESNs. Specifically, D2R2 achieves 68 × in computational savings, with an additional 2 × if precision reductions are also employed, outperforming ESN variants by a large margin. This article is part of the theme issue ‘Machine learning for weather and climate modelling’. 
    more » « less