Abstract. Geostatistical inverse modeling (GIM) has become a common approach to estimating greenhouse gas fluxes at the Earth's surface using atmospheric observations. GIMs are unique relative to other commonly used approaches because they do not require a single emissions inventory or a bottom–up model to serve as an initial guess of the fluxes. Instead, a modeler can incorporate a wide range of environmental, economic, and/or land use data to estimate the fluxes. Traditionally, GIMs have been paired with in situ observations that number in the thousands or tens of thousands. However, the number of available atmospheric greenhouse gas observations has been increasing enormously as the number of satellites, airborne measurement campaigns, and in situ monitoring stations continues to increase. This era of prolific greenhouse gas observations presents computational and statistical challenges for inverse modeling frameworks that have traditionally been paired with a limited number of in situ monitoring sites. In this article, we discuss the challenges of estimating greenhouse gas fluxes using large atmospheric datasets with a particular focus on GIMs. We subsequently discuss several strategies for estimating the fluxes and quantifying uncertainties, strategies that are adapted from hydrology, applied math, or other academic fields and are compatible with a wide variety of atmospheric models. We further evaluate the accuracy and computational burden of each strategy using a synthetic CO2 case study based upon NASA's Orbiting Carbon Observatory 2 (OCO-2) satellite. Specifically, we simultaneously estimate a full year of 3-hourly CO2 fluxes across North America in one case study – a total of 9.4×10⁶ unknown fluxes using 9.9×10⁴ synthetic observations. The strategies discussed here provide accurate estimates of CO2 fluxes that are comparable to fluxes calculated directly or analytically. 
We are also able to approximate posterior uncertainties in the fluxes, but these approximations typically over- or underestimate the uncertainties, depending upon the strategy employed and the degree of approximation required to make the calculations manageable.
Computationally efficient methods for large-scale atmospheric inverse modeling
Abstract. Atmospheric inverse modeling describes the process of estimating greenhouse gas fluxes or air pollution emissions at the Earth's surface using observations of these gases collected in the atmosphere. The launch of new satellites, the expansion of surface observation networks, and a desire for more detailed maps of surface fluxes have yielded numerous computational and statistical challenges for standard inverse modeling frameworks that were often originally designed with much smaller data sets in mind. In this article, we discuss computationally efficient methods for large-scale atmospheric inverse modeling and focus on addressing some of the main computational and practical challenges. We develop generalized hybrid projection methods, which are iterative methods for solving large-scale inverse problems, and specifically we focus on the case of estimating surface fluxes. These algorithms confer several advantages. They are efficient, in part because they converge quickly, they exploit efficient matrix–vector multiplications, and they do not require inversion of any matrices. These methods are also robust because they can accurately reconstruct surface fluxes, they are automatic since regularization or covariance matrix parameters and stopping criteria can be determined as part of the iterative algorithm, and they are flexible because they can be paired with many different types of atmospheric models. We demonstrate the benefits of generalized hybrid methods with a case study from NASA's Orbiting Carbon Observatory 2 (OCO-2) satellite. We then address the more challenging problem of solving the inverse model when the mean of the surface fluxes is not known a priori; we do so by reformulating the problem, thereby extending the applicability of hybrid projection methods to include hierarchical priors. 
We further show that by exploiting mathematical relations provided by the generalized hybrid method, we can efficiently calculate an approximate posterior variance, thereby providing uncertainty information.
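As a rough illustration of what "exploiting efficient matrix–vector multiplications" without inverting any matrices can look like, the sketch below estimates fluxes from a linear Bayesian inverse model using a plain conjugate gradient solver. This is not the generalized hybrid projection method the abstract describes; conjugate gradient is a simpler stand-in, and the symbols (`H` for the atmospheric transport operator, `Q` and `R_diag` for the prior and error covariances, `z` for the observations, `s_prior` for the prior mean) and the toy problem sizes are assumptions made for illustration only.

```python
import numpy as np

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=500):
    """Solve A x = b for symmetric positive-definite A, given only v -> A @ v."""
    x = np.zeros_like(b)
    b_norm = np.linalg.norm(b)
    if b_norm == 0.0:
        return x
    r = b.copy()              # residual b - A x for the starting guess x = 0
    p = r.copy()              # first search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * b_norm:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

def estimate_fluxes(H, Q, R_diag, z, s_prior):
    """Posterior mean s_prior + Q H^T (H Q H^T + R)^{-1} (z - H s_prior).

    The matrix H Q H^T + R is never formed or inverted; it is only applied
    to vectors inside the conjugate gradient iteration.
    """
    def psi_matvec(v):
        return H @ (Q @ (H.T @ v)) + R_diag * v

    weights = conjugate_gradient(psi_matvec, z - H @ s_prior)
    return s_prior + Q @ (H.T @ weights)

# Hypothetical toy problem: 40 observations, 200 unknown fluxes.
rng = np.random.default_rng(0)
n_obs, n_flux = 40, 200
H = rng.random((n_obs, n_flux)) / n_flux                   # toy "footprints"
idx = np.arange(n_flux)
Q = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 10.0)    # exponential covariance
R_diag = np.full(n_obs, 0.1)                               # diagonal error variances
s_true = np.sin(2.0 * np.pi * idx / n_flux)
z = H @ s_true + np.sqrt(0.1) * rng.standard_normal(n_obs)
s_prior = np.zeros(n_flux)

s_hat = estimate_fluxes(H, Q, R_diag, z, s_prior)

# Dense reference solution, feasible only because the toy problem is small.
s_direct = s_prior + Q @ H.T @ np.linalg.solve(
    H @ Q @ H.T + np.diag(R_diag), z - H @ s_prior
)
```

On this small problem the iterative estimate can be compared against the dense solve; at the scale described in the abstract, only the matrix–vector form would remain practical.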
- PAR ID: 10342801
- Date Published:
- Journal Name: Geoscientific Model Development
- Volume: 15
- Issue: 14
- ISSN: 1991-9603
- Page Range / eLocation ID: 5547 to 5565
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
-
Inverse models arise in various environmental applications, ranging from atmospheric modeling to geosciences. Inverse models can often incorporate predictor variables, similar to regression, to help estimate natural processes or parameters of interest from observed data. Although a large set of possible predictor variables may be included in these inverse or regression models, a core challenge is to identify a small number of predictor variables that are most informative of the model, given limited observations. This problem is typically referred to as model selection. A variety of criterion-based approaches are commonly used for model selection, but most follow a two-step process: first, select predictors using some statistical criteria, and second, solve the inverse or regression problem with these predictor variables. The first step typically requires comparing all possible combinations of candidate predictors, which quickly becomes computationally prohibitive, especially for large-scale problems. In this work, we develop a one-step approach for linear inverse modeling, where model selection and the inverse model are performed in tandem. We reformulate the problem so that the selection of a small number of relevant predictor variables is achieved via a sparsity-promoting prior. Then, we describe hybrid iterative projection methods based on flexible Krylov subspace methods for efficient optimization. These approaches are well-suited for large-scale problems with many candidate predictor variables. We evaluate our results against traditional, criteria-based approaches. We also demonstrate the applicability and potential benefits of our approach using examples from atmospheric inverse modeling based on NASA's Orbiting Carbon Observatory-2 (OCO-2) satellite.
-
Abstract. Inverse models arise in various environmental applications, ranging from atmospheric modeling to geosciences. Inverse models can often incorporate predictor variables, similar to regression, to help estimate natural processes or parameters of interest from observed data. Although a large set of possible predictor variables may be included in these inverse or regression models, a core challenge is to identify a small number of predictor variables that are most informative of the model, given limited observations. This problem is typically referred to as model selection. A variety of criterion-based approaches are commonly used for model selection, but most follow a two-step process: first, select predictors using some statistical criteria, and second, solve the inverse or regression problem with these predictor variables. The first step typically requires comparing all possible combinations of candidate predictors, which quickly becomes computationally prohibitive, especially for large-scale problems. In this work, we develop a one-step approach, where model selection and the inverse model are performed in tandem. We reformulate the problem so that the selection of a small number of relevant predictor variables is achieved via a sparsity-promoting prior. Then, we describe hybrid iterative projection methods based on flexible Krylov subspace methods for efficient optimization. These approaches are well-suited for large-scale problems with many candidate predictor variables. We evaluate our results against traditional, criteria-based approaches. We also demonstrate the applicability and potential benefits of our approach using examples from atmospheric inverse modeling based on NASA's Orbiting Carbon Observatory 2 (OCO-2) satellite.
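To make the idea of selecting predictors through a sparsity-promoting prior more concrete, here is a minimal sketch that fits an L1-penalized regression with iterative soft-thresholding (ISTA). ISTA is a deliberately simple stand-in and not the flexible Krylov hybrid method described in the abstract; the function names, the penalty weight `lam`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t, setting small entries to zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_lasso(X, y, lam, n_iter=2000):
    """Minimize 0.5 * ||X b - y||^2 + lam * ||b||_1 by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)          # gradient of the least-squares term
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Hypothetical synthetic data: 20 candidate predictors, only 3 of them relevant.
rng = np.random.default_rng(1)
n_obs, n_candidates = 100, 20
X = rng.standard_normal((n_obs, n_candidates))
beta_true = np.zeros(n_candidates)
beta_true[[2, 7, 15]] = [2.0, -3.0, 1.5]
y = X @ beta_true + 0.05 * rng.standard_normal(n_obs)

beta_hat = ista_lasso(X, y, lam=10.0)
selected = np.flatnonzero(beta_hat)          # indices of predictors kept by the penalty
print(selected)
```

The point of the sketch is that selection and estimation happen in a single optimization: coefficients of uninformative predictors are driven exactly to zero, so no enumeration over predictor subsets is needed.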
-
While linear discriminant analysis (LDA) is a widely used classification method, it is highly affected by outliers, which commonly occur in various real datasets. Therefore, several robust LDA methods have been proposed. However, they either rely on robust estimation of the sample means and covariance matrix, which may have noninvertible Hessians, or can only handle binary classes or low-dimensional cases. The proposed robust discriminant analysis is a multi-directional projection-pursuit approach that can classify multiple classes without estimating the covariance or Hessian matrix and that works for high-dimensional cases. The weight function effectively gives smaller weights to the points more deviant from the class center. The discriminant vectors and scoring vectors are solved by the proposed iterative algorithm. It inherits good properties of the weight function and multi-directional projection pursuit for reducing the influence of outliers on estimating the discriminant directions and producing robust classification that is less sensitive to outliers. We show that when a weight function is appropriately chosen, the influence function is bounded and the discriminant vectors and scoring vectors are both consistent as the percentage of outliers goes to zero. The experimental results show that the robust optimal scoring discriminant analysis is effective and efficient.
-
Abstract. Far-infrared Outgoing Radiation Understanding and Monitoring (FORUM) was selected in 2019 as the ninth Earth Explorer mission by the European Space Agency. Its primary objective is to collect interferometric measurements in the far-infrared (FIR) spectral range, which accounts for 50% of Earth’s outgoing longwave radiation emitted into space, and will be observed from space for the first time. Accurate measurements of the FIR at the top of the atmosphere are crucial for improving climate models. Current instruments are insufficient, necessitating the development of advanced computational techniques. FORUM will provide unprecedented insights into key atmospheric parameters, such as surface emissivity, water vapor, and ice cloud properties, through the use of a Fourier transform spectrometer. To ensure the quality of the mission’s data, an end-to-end simulator was developed to simulate the measurement process and evaluate the effects of instrument characteristics and environmental factors. The core challenge of the mission is solving the retrieval problem, which involves estimating atmospheric properties from the radiance spectra observed by the satellite. This problem is ill-posed and regularization techniques are necessary to stabilize the solution. In this work, we present a data-driven approach to approximate the inverse mapping in the retrieval problem, aiming to achieve a solution that is both computationally efficient and accurate. In the first phase, we generate an initial approximation of the inverse mapping using only simulated FORUM data. In the second phase, we improve this approximation by introducing climatological data as a priori information and using a neural network to estimate the optimal regularization parameters during the retrieval process. 
While our approach does not match the precision of full-physics retrieval methods, its key advantage is the ability to deliver results almost instantaneously, making it highly suitable for real-time applications. Furthermore, the proposed method can provide more accurate a priori estimates for full-physics methods, thereby improving the overall accuracy of the retrieved atmospheric profiles.

