NSF PAR Search | NSF Public Access Repository

Assessing predictability of environmental time series with statistical and machine learning models

https://doi.org/10.1002/env.2864

Bonas, Matthew; Datta, Abhirup; Wikle, Christopher K; Boone, Edward L; Alamri, Faten S; Hari, Bhava Vyasa; Kavila, Indulekha; Simmons, Susan J; Jarvis, Shannon M; Burr, Wesley S; et al. (July 2024, Environmetrics)

Abstract: The ever-increasing popularity of machine learning methods in virtually all areas of science, engineering and beyond is poised to put established statistical modeling approaches into question. Environmental statistics is no exception, as popular constructs such as neural networks and decision trees are now routinely used to provide forecasts of physical processes ranging from air pollution to meteorology. This presents both challenges and opportunities to the statistical community, which could contribute to the machine learning literature with a model‐based approach with formal uncertainty quantification. Should, however, classical statistical methodologies be discarded altogether in environmental statistics, and should our contribution be focused on formalizing machine learning constructs? This work aims at providing some answers to this thought‐provoking question with two time series case studies where selected models from both the statistical and machine learning literature are compared in terms of forecasting skill, uncertainty quantification and computational time. The relative merits of both classes of approaches are discussed, and broad open questions are formulated as a baseline for a discussion on the topic.
High-resolution global precipitation downscaling with latent Gaussian models and non-stationary stochastic partial differential equation structure

https://doi.org/10.1093/jrsssc/qlad084

Zhang, Jiachen; Bonas, Matthew; Bolster, Diogo; Fuglstad, Geir-Arne; Castruccio, Stefano (September 2023, Journal of the Royal Statistical Society Series C: Applied Statistics)

Abstract: Obtaining high-resolution maps of precipitation data can provide key insights to stakeholders to assess sustainable access to water resources at the urban scale. Mapping a non-stationary, sparse process such as precipitation at very high spatial resolution requires the interpolation of global datasets at the locations where ground stations are available, with statistical models able to capture complex non-Gaussian global space–time dependence structures. In this work, we propose a new approach based on capturing the spatially varying anisotropy of a latent Gaussian process via a locally deformed stochastic partial differential equation (SPDE) with a buffer allowing for a different spatial structure across land and sea. The finite volume approximation of the SPDE, coupled with the integrated nested Laplace approximation, ensures feasible Bayesian inference for tens of millions of observations. The simulation studies showcase the improved predictability of the proposed approach against stationary and no-buffer alternatives. The proposed approach is then used to yield high-resolution simulations of daily precipitation across the United States.
Sensitivity analysis of wind energy resources with Bayesian non-Gaussian and nonstationary functional ANOVA

https://doi.org/10.1214/23-AOAS1770

Zhang, Jiachen; Crippa, Paola; Genton, Marc G; Castruccio, Stefano (March 2024, The Annals of Applied Statistics)

A stochastic locally diffusive model with neural network‐based deformations for global sea surface temperature

https://doi.org/10.1002/sta4.431

Hu, Wenjing; Fuglstad, Geir‐Arne; Castruccio, Stefano (March 2022, Stat)

In this work, we propose a new approach to model large, irregularly distributed spatio‐temporal global data via a locally diffusive stochastic partial differential equation (SPDE). The proposed model assumes a local deformation of the SPDE with non‐linear dependence on the covariates through a neural network. The proposed model can be fit in a computationally efficient manner using a triangulation over the sphere and sparsity of the precision matrix, as shown in an application with a large data set of simulated multi‐decadal monthly sea surface temperature.
Spatial modeling of mid-infrared spectral data with thermal compensation using integrated nested Laplace approximation

https://doi.org/10.1364/AO.435918

Aquino, Bernardo; Castruccio, Stefano; Gupta, Vijay; Howard, Scott (September 2021, Applied Optics)

The problem of analyzing substances using low-cost sensors with a low signal-to-noise ratio (SNR) remains challenging. Using accurate models for the spectral data is paramount for the success of any classification task. We demonstrate that the thermal compensation of sample heating and spatial variability analysis yield lower modeling errors than non-spatial modeling. Then, we obtain the inference of the spectral data probability density functions using the integrated nested Laplace approximation (INLA) on a Bayesian hierarchical model. To achieve this goal, we use the fast and user-friendly R-INLA package in R for the computation. This approach allows affordable and real-time substance identification with fewer SNR sensor measurements, thereby potentially increasing throughput and lowering costs.