Title: Machine Learning-Based Prediction of Ecosystem-Scale CO2 Flux Measurements
AmeriFlux is a network of hundreds of sites across the contiguous United States that provides tower-based, ecosystem-scale carbon dioxide flux measurements at 30-minute temporal resolution. Although geographically wide-ranging, the network has suffered from several problems over its existence, including towers regularly ceasing operation for extended periods and a lack of standardized measurements between sites. In this study, we use machine learning algorithms to predict CO2 flux measurements at NEON sites (a subset of AmeriFlux sites), creating a model that can gap-fill measurements when sites are down or replace measurements when they are incorrect. Machine learning algorithms can also generalize to new sites, potentially even those without a flux tower. We compared the performance of seven machine learning algorithms using 35 environmental drivers and site-specific variables as predictors. Extreme Gradient Boosting (XGBoost) consistently produced the most accurate predictions (root mean squared error of 1.81 μmol m−2 s−1, R2 of 0.86). The model showed excellent performance when tested on sites that are ecologically similar to other sites in the data (the Mid-Atlantic, New England, and the Rocky Mountains), but poorer performance at sites with fewer ecological similarities to other sites (Pacific Northwest, Florida, and Puerto Rico). The results show strong potential for machine learning-based models to make more skillful predictions than state-of-the-art process-based models: the model estimated the multi-year mean carbon balance to within ±50 g C m−2 y−1 for 29 of our 44 test sites. These results have significant implications for accurately predicting the carbon flux or gap-filling an extended outage at any AmeriFlux site, and for quantifying carbon flux in support of natural climate solutions.
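To make the reported quantities concrete, the following is a minimal sketch of how the two skill scores and the annual carbon balance could be computed. It is an illustration under stated assumptions, not the authors' code: the function names and the sample values are hypothetical, R2 is taken here as the coefficient of determination (the abstract does not say which definition was used), and the conversion assumes a 365-day year and 12.011 g of carbon per mole of CO2.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year
MOLAR_MASS_C = 12.011               # g of carbon per mol of CO2


def rmse(observed, predicted):
    """Root mean squared error, in the units of the inputs."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def annual_carbon_balance(mean_flux_umol):
    """Convert a mean CO2 flux (umol m-2 s-1) to g C m-2 y-1."""
    return mean_flux_umol * 1e-6 * MOLAR_MASS_C * SECONDS_PER_YEAR


# Hypothetical half-hourly fluxes (umol m-2 s-1); not data from the study.
observed = [-4.0, -2.0, 0.5, 1.5, 2.0]
predicted = [-3.5, -2.5, 0.0, 1.0, 2.5]

print(round(rmse(observed, predicted), 3))       # 0.5
print(round(r_squared(observed, predicted), 3))  # 0.951
print(round(annual_carbon_balance(1.0), 1))      # 378.8
```

Under these assumptions, a sustained mean flux of 1 μmol m−2 s−1 corresponds to about 379 g C m−2 y−1, so an annual error of ±50 g C m−2 y−1 is equivalent to a mean bias of roughly ±0.13 μmol m−2 s−1. This is much smaller than the half-hourly RMSE because random errors largely cancel in the annual sum, while systematic bias does not.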
Award ID(s):
2105828
PAR ID:
10659436
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Land
Volume:
14
Issue:
1
ISSN:
2073-445X
Page Range / eLocation ID:
124
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Long-term ecological data are essential for detecting impacts of climate change and other global change factors, and for making informed predictions about future change. However, long-term measurements are rarely replicated at the site level, which raises questions about their representativeness. We used a multiscale approach to evaluate the agreement of parallel observations from AmeriFlux and NEON (National Ecological Observatory Network) towers at Bartlett Experimental Forest, New Hampshire, USA. The two towers are separated by a horizontal distance of 93 m. We focused our analysis on standard meteorological variables; fluxes of CO2, sensible heat, and latent heat measured by eddy covariance; and phenology derived from PhenoCam imagery. Results suggest excellent agreement between AmeriFlux and NEON in meteorology and phenology, and good agreement in fluxes at the half-hourly scale. However, large disagreements in CO2 and latent heat fluxes occurred at the annual scale, with implications especially for the forest carbon balance. The AmeriFlux tower measurements indicate a site that is close to carbon-neutral (-8 ± 65 g C m-2 y-1, mean ± 1 SD), whereas the NEON tower measurements indicate a forest that is a carbon sink (-137 ± 10 g C m-2 y-1). Causes of this disagreement may include measurement height (26 m vs. 35 m), which resulted in different flux footprints being measured by the two towers, and differences in the flux measurement systems. Our results suggest the need for caution when attempting to merge long-term flux data from two different measurement platforms, and when using measurements from any one measurement platform to inform decision-making on issues related to carbon accounting or natural climate solutions. 
  2. Eddy covariance measurements are often subject to missing values, or gaps in the data record. Methods to fill short gaps are well established, but robustly filling gaps longer than a few weeks remains a challenge. Marginal Distribution Sampling (MDS) is a standard gap-filling method, but its effectiveness for long gaps (> 30 days) is limited. We compared the performance of a machine learning algorithm, eXtreme Gradient Boosting (XGB), against MDS, using various artificial scenarios of gap lengths and locations. We gap-filled half-hourly CO2 flux from a temperate deciduous forest, Bartlett Experimental Forest, from 2010 to 2022. Whereas the standard implementation of MDS uses a narrowly prescribed set of predictor variables, with XGB we were able to include additional variables. The Green Chromatic Coordinate (GCC), derived from PhenoCam imagery, and diffuse photosynthetic photon flux density emerged as two of the three most important predictor variables. Compared to MDS, the root mean square error (RMSE) of XGB decreased by 9.5%, and the R2 increased by 2.7%, in a randomized 10-fold cross-validation test. XGB outperformed MDS for both daytime and nighttime across different seasons. However, annual NEE integrals varied across methods: annual net carbon uptake was weaker for XGB, by -110 ± 74 g C m-2 y-1, compared to MDS (214 ± 11 g C m-2 y-1). In artificial gap experiments, when trained using the 13-year data record, XGB reliably filled gaps, showing little change in RMSE for gaps up to 240 days. In contrast, the performance of MDS steadily decreased as gap lengths increased, and MDS was unable to fill gaps longer than 2 months. In summary, XGB demonstrates excellent performance as an alternative method to MDS, providing reliable predictions for temperate deciduous forest carbon fluxes under different gap lengths and location scenarios. Implementation of XGB is facilitated by easy-to-use packages.
  3. This paper describes the formation of, and initial results for, a new FLUXNET coordination network for ecosystem-scale methane (CH4) measurements at 60 sites globally, organized by the Global Carbon Project in partnership with other initiatives and regional flux tower networks. The objectives of the effort are presented along with an overview of the coverage of eddy covariance (EC) CH4 flux measurements globally, initial results comparing CH4 fluxes across the sites, and future research directions and needs. Annual estimates of net CH4 fluxes across sites ranged from −0.2 ± 0.02 g C m–2 yr–1 for an upland forest site to 114.9 ± 13.4 g C m–2 yr–1 for an estuarine freshwater marsh, with fluxes exceeding 40 g C m–2 yr–1 at multiple sites. Average annual soil and air temperatures were found to be the strongest predictors of annual CH4 flux across wetland sites globally. Water table position was positively correlated with annual CH4 emissions, although only for wetland sites that were not consistently inundated throughout the year. The ratio of annual CH4 fluxes to ecosystem respiration increased significantly with mean site temperature. Uncertainties in annual CH4 estimates due to gap-filling and random errors were on average ±1.6 g C m–2 yr–1 at 95% confidence, with the relative error decreasing exponentially with increasing flux magnitude across sites. Through the analysis and synthesis of a growing EC CH4 flux database, the controls on ecosystem CH4 fluxes can be better understood, used to inform and validate Earth system models, and used to reconcile differences between land surface model- and atmospheric-based estimates of CH4 emissions.
  4. This is the AmeriFlux version of the carbon flux data for the site US-Fo1 Flux Observations of Carbon from an Airborne Laboratory (FOCAL) Campaign Site 1. Site Description - This tower is located south of Prudhoe Bay off the Dalton Highway along the Sagavanirktok (Sag) River. Land cover at the site is wet sedge (based on the NSSI land cover map). During the first campaign (2013-2014), the tower location was 70.085450N; -148.570160W. It was moved to the current location (70.085050N, 148.567090W) during 2022 to 2024.
  5. This is the AmeriFlux version of the carbon flux data for the site US-Fo2 Flux Observations of Carbon from an Airborne Laboratory (FOCAL) Campaign Site 2. Site Description - This tower is located south of Prudhoe Bay off the Dalton Highway along the Sagavanirktok (Sag) River.
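The artificial gap experiments described in item 2 follow a general pattern: hide a contiguous block of a complete record, fill it, and score the fill against the hidden values. The sketch below illustrates only that evaluation loop. It is an assumption-laden toy: the flux series is synthetic, and the filler is a simple mean-diurnal-cycle baseline swapped in for illustration, which is neither MDS nor XGB. All function names are hypothetical.

```python
import math
import random

STEPS_PER_DAY = 48  # half-hourly records per day


def synthetic_flux(n_days, noise_sd=0.5, seed=0):
    """Toy CO2 flux (umol m-2 s-1): daytime uptake, night-time release, noise."""
    rng = random.Random(seed)
    series = []
    for i in range(n_days * STEPS_PER_DAY):
        angle = 2 * math.pi * (i % STEPS_PER_DAY) / STEPS_PER_DAY
        daylight = max(0.0, -math.cos(angle))  # 0 at midnight, 1 at noon
        series.append(1.0 - 5.0 * daylight + rng.gauss(0.0, noise_sd))
    return series


def mean_diurnal_fill(series, gap):
    """Replace values inside `gap` with the mean observed value for that time of day."""
    sums = [0.0] * STEPS_PER_DAY
    counts = [0] * STEPS_PER_DAY
    for i, value in enumerate(series):
        if i not in gap:  # only records outside the gap count as observed
            sums[i % STEPS_PER_DAY] += value
            counts[i % STEPS_PER_DAY] += 1
    means = [s / c for s, c in zip(sums, counts)]
    return [means[i % STEPS_PER_DAY] if i in gap else v
            for i, v in enumerate(series)]


def artificial_gap_rmse(series, start_day, gap_days):
    """Hide a contiguous block, fill it, and score the fill against the hidden truth."""
    gap = range(start_day * STEPS_PER_DAY, (start_day + gap_days) * STEPS_PER_DAY)
    filled = mean_diurnal_fill(series, gap)
    squared = [(series[i] - filled[i]) ** 2 for i in gap]
    return math.sqrt(sum(squared) / len(squared))


flux = synthetic_flux(n_days=60)
for gap_days in (1, 10, 30):
    score = artificial_gap_rmse(flux, start_day=15, gap_days=gap_days)
    print(gap_days, round(score, 2))
```

Because this synthetic series has the same daily cycle every day, the baseline's RMSE stays close to the noise level regardless of gap length. Real flux records are seasonal, so a filler that cannot draw on environmental drivers would be expected to degrade as gaps lengthen, which is the behavior such experiments are designed to expose.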