-
Accurate and cost-effective quantification of the carbon cycle for agroecosystems at decision-relevant scales is critical to mitigating climate change and ensuring sustainable food production. However, conventional process-based or data-driven modeling approaches alone have large prediction uncertainties, owing to the complexity of the biogeochemical processes being modeled and the lack of observations to constrain many key state and flux variables. Here we propose a Knowledge-Guided Machine Learning (KGML) framework that addresses these challenges by integrating knowledge embedded in a process-based model, high-resolution remote sensing observations, and machine learning (ML) techniques. Using the U.S. Corn Belt as a testbed, we demonstrate that KGML can outperform conventional process-based and black-box ML models in quantifying carbon cycle dynamics. Our high-resolution approach quantitatively reveals 86% more spatial detail of soil organic carbon changes than conventional coarse-resolution approaches. Moreover, we outline a protocol for improving KGML via various paths, which can be generalized to develop hybrid models that better predict complex earth system dynamics.
-
Terrestrial ecosystems constitute a major component of the global carbon sink and play a critical role in regulating the global carbon cycle. Although process-based models such as the Ecosystem Demography (ED) model are widely used in research and applications to simulate these dynamics, they remain computationally intensive and are not well suited for large-scale (e.g., global) projections at high spatial and temporal resolution, or under a wide range of future scenarios. AI-based emulators of process-based physical models have emerged as a promising way to accelerate the computation. However, developing emulators for ecosystem processes poses several challenges, including error accumulation over long sequences, single-step initial conditions, and high-dimensional environmental conditions. Existing works often rely on time-series patterns in look-back windows, which are not well suited to problems with single-step initial conditions. Moreover, they often do not consider uncertainty, making it hard to know when the approximations are highly confident and when the results may need to be updated, e.g., by the process-based models. To address these limitations, we introduce EcoDiffusion, a conditional diffusion framework tailored for ecosystem dynamics emulation. We evaluated EcoDiffusion at locations distributed worldwide under different scenarios and found significant improvements over existing models.
-
Environmental modeling faces critical challenges in predicting ecosystem dynamics across unmonitored regions due to limited and geographically imbalanced observation data. This challenge is compounded by spatial heterogeneity, which causes models to learn spurious patterns that fit only local data. Unlike conventional domain generalization, environmental modeling must preserve invariant physical relationships and temporal coherence during augmentation. In this paper, we introduce Generalizable Representation Enhancement via Auxiliary Transformations (GREAT), a framework that effectively augments available datasets to improve predictions in completely unseen regions. GREAT guides the augmentation process to ensure that the original governing processes can be recovered from the augmented data and that including the augmented data improves model generalization. Specifically, GREAT learns transformation functions at multiple layers of neural networks to augment both raw environmental features and temporal influence. These transformations are refined through a novel bi-level training process that constrains augmented data to preserve key patterns of the original source data. We demonstrate GREAT's effectiveness on stream temperature prediction across six ecologically diverse watersheds in the eastern U.S., each containing multiple stream segments. Experimental results show that GREAT significantly outperforms existing methods in zero-shot scenarios. This work provides a practical solution for environmental applications where comprehensive monitoring is infeasible.