

Search for: All records

Award ID contains: 2014166


  1. The ever-increasing popularity of machine learning methods in virtually all areas of science, engineering and beyond is poised to call established statistical modeling approaches into question. Environmental statistics is no exception, as popular constructs such as neural networks and decision trees are now routinely used to provide forecasts of physical processes ranging from air pollution to meteorology. This presents both challenges and opportunities to the statistical community, which could contribute to the machine learning literature with a model-based approach with formal uncertainty quantification. Should, however, classical statistical methodologies be discarded altogether in environmental statistics, and should our contribution be focused on formalizing machine learning constructs? This work aims at providing some answers to this thought-provoking question with two time series case studies where selected models from both the statistical and machine learning literature are compared in terms of forecasting skill, uncertainty quantification and computational time. The relative merits of both classes of approaches are discussed, and broad open questions are formulated as a baseline for a discussion on the topic.
  2. Obtaining high-resolution maps of precipitation data can provide key insights to stakeholders assessing sustainable access to water resources at the urban scale. Mapping a non-stationary, sparse process such as precipitation at very high spatial resolution requires the interpolation of global datasets at the locations where ground stations are available, with statistical models able to capture complex non-Gaussian global space–time dependence structures. In this work, we propose a new approach based on capturing the spatially varying anisotropy of a latent Gaussian process via a locally deformed stochastic partial differential equation (SPDE) with a buffer allowing for a different spatial structure across land and sea. The finite volume approximation of the SPDE, coupled with the integrated nested Laplace approximation, ensures feasible Bayesian inference for tens of millions of observations. The simulation studies showcase the improved predictability of the proposed approach against stationary and no-buffer alternatives. The proposed approach is then used to yield high-resolution simulations of daily precipitation across the United States.
  3. In this work, we propose a new approach to model large, irregularly distributed spatio‐temporal global data via a locally diffusive stochastic partial differential equation (SPDE). The proposed model assumes a local deformation of the SPDE with non‐linear dependence on the covariates through a neural network. The proposed model can be fit in a computationally efficient manner using a triangulation over the sphere and sparsity of the precision matrix, as shown in an application with a large data set of simulated multi‐decadal monthly sea surface temperature. 
  4. The problem of analyzing substances using low-cost sensors with a low signal-to-noise ratio (SNR) remains challenging. Using accurate models for the spectral data is paramount for the success of any classification task. We demonstrate that the thermal compensation of sample heating and spatial variability analysis yield lower modeling errors than non-spatial modeling. We then infer the probability density functions of the spectral data using the integrated nested Laplace approximation (INLA) on a Bayesian hierarchical model. For the computation, we use the fast and user-friendly R-INLA package in R. This approach allows affordable and real-time substance identification with fewer SNR sensor measurements, thereby potentially increasing throughput and lowering costs.