Title: Flexible nonstationary spatiotemporal modeling of high‐frequency monitoring data
Abstract

Many physical datasets are generated by collections of instruments that make measurements at regular time intervals. For such regular monitoring data, we extend the framework of half‐spectral covariance functions to the case of nonstationarity in space and time and demonstrate that this method provides a natural and tractable way to incorporate complex behaviors into a covariance model. Further, we use this method with fully time‐domain computations to obtain bona fide maximum likelihood estimators—as opposed to using Whittle‐type likelihood approximations, for example—that can still be computed conveniently. We apply this method to very high‐frequency Doppler LIDAR vertical wind velocity measurements, demonstrating that the model can expressively capture the extreme nonstationarity of dynamics above and below the atmospheric boundary layer and, more importantly, the interaction of the process dynamics across it.
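As a rough illustration of the half-spectral idea in the abstract above, the following Python sketch builds a space-time covariance by numerically integrating a temporal spectral density multiplied by a frequency-dependent spatial coherence, and then evaluates the exact Gaussian log-likelihood in the time domain through a Cholesky factor. It is stationary for simplicity and does not reproduce the paper's nonstationary extension. The AR(1)-type density, the exponential coherence, the function and parameter names, and the nugget are all assumptions chosen for illustration, not the authors' model.

```python
import numpy as np

def half_spectral_cov(times, locs, n_freq=256, alpha=0.9, rho=1.0):
    """Covariance of all (location, time) pairs, ordered location-major.

    Approximates K(dx, dt) = integral over (-pi, pi) of
    S(w) * C_w(dx) * cos(w * dt) dw with a midpoint rule.
    Assumes integer-spaced times (regular monitoring data).
    """
    # Midpoint grid on (0, pi); the integrand is even in w, hence the factor 2.
    d_omega = np.pi / n_freq
    omega = (np.arange(n_freq) + 0.5) * d_omega
    # Hypothetical temporal spectral density: unit-variance AR(1).
    S = (1 - alpha**2) / (2 * np.pi * (1 - 2 * alpha * np.cos(omega) + alpha**2))
    dt = times[:, None] - times[None, :]
    dx = np.abs(locs[:, None] - locs[None, :])
    n = len(locs) * len(times)
    K = np.zeros((n, n))
    for w, s in zip(omega, S):
        # Hypothetical coherence: decays faster in space at higher frequency,
        # which makes the resulting model nonseparable.
        coherence = np.exp(-(1.0 + w) * dx / rho)
        K += 2 * s * d_omega * np.kron(coherence, np.cos(w * dt))
    return K

def gaussian_loglik(y, K, nugget=1e-6):
    """Exact zero-mean Gaussian log-likelihood (no Whittle approximation)."""
    L = np.linalg.cholesky(K + nugget * np.eye(len(y)))
    z = np.linalg.solve(L, y)
    return -0.5 * (z @ z) - np.log(np.diag(L)).sum() - 0.5 * len(y) * np.log(2 * np.pi)

# Small synthetic example: 4 instruments, 20 regular time steps.
times = np.arange(20.0)
locs = np.array([0.0, 0.5, 1.0, 2.0])
K = half_spectral_cov(times, locs)
rng = np.random.default_rng(0)
y = np.linalg.cholesky(K + 1e-6 * np.eye(len(K))) @ rng.standard_normal(len(K))
loglik = gaussian_loglik(y, K)
```

Each frequency contributes a nonnegative weight times the product of two positive semidefinite kernels, so the summed matrix stays a valid covariance. In practice the log-likelihood would be maximized over the parameters with a numerical optimizer.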

 
NSF-PAR ID: 10237690
Author(s) / Creator(s):
Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name: Environmetrics
Volume: 32
Issue: 5
ISSN: 1180-4009
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    Estimation of an unstructured covariance matrix is difficult because of the challenges posed by parameter space dimensionality and the positive‐definiteness constraint that estimates should satisfy. We consider a general nonparametric covariance estimation framework for longitudinal data using the Cholesky decomposition of a positive‐definite matrix. The covariance matrix of time‐ordered measurements is diagonalized by a lower triangular matrix with unconstrained entries that are statistically interpretable as parameters for a varying coefficient autoregressive model. Using this dual interpretation of the Cholesky decomposition and allowing for irregular sampling time points, we treat covariance estimation as bivariate smoothing and cast it in a regularization framework for desired forms of simplicity in covariance models. Viewing stationarity as a form of simplicity or parsimony in covariance, we model the varying coefficient function with components depending on time lag and its orthogonal direction separately and penalize the components that capture the nonstationarity in the fitted function. We demonstrate construction of a covariance estimator using the smoothing spline framework. Simulation studies establish the advantage of our approach over alternative estimators proposed in the longitudinal data setting. We analyze a longitudinal dataset to illustrate application of the methodology and compare our estimates to those resulting from alternative models.

    This article is categorized under:

    Data: Types and Structure > Time Series, Stochastic Processes, and Functional Data

    Statistical and Graphical Methods of Data Analysis > Nonparametric Methods

    Algorithms and Computational Methods > Maximum Likelihood Methods

     
  2. Summary

    We propose to model a spatio-temporal random field that has nonstationary covariance structure in both space and time domains by applying the concept of the dimension expansion method in Bornn et al. (2012). Simulations are conducted for both separable and nonseparable space-time covariance models, and the model is also illustrated with a streamflow dataset. Both simulation and data analyses show that modeling nonstationarity in both space and time can improve the predictive performance over stationary covariance models or models that are nonstationary in space but stationary in time.

     
  3. Abstract

    Motivation

    Cell function is regulated by gene regulatory networks (GRNs) defined by protein-mediated interaction between constituent genes. Despite advances in experimental techniques, we can still measure only a fraction of the processes that govern GRN dynamics. To infer the properties of GRNs using partial observation, unobserved sequential processes can be replaced with distributed time delays, yielding non-Markovian models. Inference methods based on the resulting model suffer from the curse of dimensionality.

    Results

    We develop a simulation-based Bayesian MCMC method employing an approximate likelihood for the efficient and accurate inference of GRN parameters when only some of their products are observed. We illustrate our approach using a two-step activation model: an activation signal leads to the accumulation of an unobserved regulatory protein, which triggers the expression of observed fluorescent proteins. With prior information about observed fluorescent protein synthesis, our method successfully infers the dynamics of the unobserved regulatory protein. We can estimate the delay and kinetic parameters characterizing target regulation including transcription, translation, and target searching of an unobserved protein from experimental measurements of the products of its target gene. Our method is scalable and can be used to analyze non-Markovian models with hidden components.

    Availability and implementation

    Our code is implemented in R and is freely available, with a simple example dataset, at https://github.com/Mathbiomed/SimMCMC.

     
  4. Abstract

    The standardized precipitation index (SPI) measures meteorological drought relative to historical climatology by normalizing accumulated precipitation. Longer record lengths improve parameter estimates, but these longer records may include signals of anthropogenic climate change and multidecadal natural climate fluctuations. Historically, climate nonstationarity has either been ignored or incorporated into the SPI using a quasi-stationary reference period, such as the WMO 30-yr period. This study introduces and evaluates a novel nonstationary SPI model based on Bayesian splines, designed to both improve parameter estimates for stationary climates and to explicitly incorporate nonstationarity. Using synthetically generated precipitation, this study directly compares the proposed Bayesian SPI model with existing SPI approaches based on maximum likelihood estimation for stationary and nonstationary climates. The proposed model not only reproduced the performance of existing SPI models but improved upon them in several key areas: reducing parameter uncertainty and noise, simultaneously modeling the likelihood of zero and positive precipitation, and capturing nonlinear trends and seasonal shifts across all parameters. Further, the fully Bayesian approach ensures all parameters have uncertainty estimates, including zero precipitation likelihood. The study notes that the zero precipitation parameter is too sensitive and could be improved in future iterations. The study concludes with an application of the proposed Bayesian nonstationary SPI model for nine gauges across a range of hydroclimate zones in the United States. Results of this experiment show that the model is stable and reproduces nonstationary patterns identified in prior studies, while also indicating new findings, particularly for the shape and zero precipitation parameters.

    Significance Statement

    We typically measure how bad a drought is by comparing it with the historical record. With long-term changes in climate or other factors, however, a typical drought today may not have been typical in the recent past. The purpose of this study is to build a model that measures drought relative to a changing climate. Our results confirm that the model is accurate and captures previously noted climate change patterns—a drier western United States, a wetter eastern United States, earlier summer weather, and more extreme wet seasons. This is significant because this model can improve drought measurement and identify recent changes in drought.

     
  5. Abstract

    In complex systems with multiple variables monitored at high frequency, variables are not only temporally autocorrelated, but they may also be nonlinearly related or exhibit nonstationarity as the inputs or operation change. One approach to handling such variables is to detrend them prior to monitoring and then apply control charts that assume independence and stationarity to the residuals. Monitoring controlled systems is even more challenging because the control strategy seeks to maintain variables at prespecified mean levels, and to compensate, correlations among variables may change, making monitoring the covariance essential. In this paper, a vector autoregressive (VAR) model is compared with a multivariate random forest (MRF) and a neural network (NN) for detrending multivariate time series prior to monitoring the covariance of the residuals using a multivariate exponentially weighted moving average (MEWMA) control chart. Machine learning models have an advantage when the data's structure is unknown or may change. We design a novel simulation study with nonlinear, nonstationary, and autocorrelated data to compare the different detrending models and subsequent covariance monitoring. The machine learning models have superior performance for nonlinear and strongly autocorrelated data and similar performance for linear data. An illustration with data from a reverse osmosis process is given.

     
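The detrend-then-monitor workflow described in item 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the chart used in that paper: it fits a VAR(1) by ordinary least squares on a reference period, whitens the residuals, and tracks an exponentially weighted covariance of the whitened residuals with the distance `trace(S) - log det(S) - p` from the identity. The function names, the smoothing weight `lam`, and the simulated variance shift are all hypothetical, and no control limit is computed.

```python
import numpy as np

def fit_var1(X):
    """Least-squares VAR(1) coefficients for X[t] ~ c + A @ X[t-1]."""
    Z = np.hstack([np.ones((len(X) - 1, 1)), X[:-1]])
    B, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
    return B

def var1_residuals(X, B):
    """One-step-ahead residuals of X under fitted coefficients B."""
    Z = np.hstack([np.ones((len(X) - 1, 1)), X[:-1]])
    return X[1:] - Z @ B

def ewm_cov_stat(resid, n_ref, lam=0.1):
    """EWMA of outer products of whitened residuals, scored against identity."""
    p = resid.shape[1]
    # Whiten with the residual covariance of the reference (in-control) period.
    L = np.linalg.cholesky(np.cov(resid[:n_ref].T))
    U = np.linalg.solve(L, resid.T).T
    S = np.eye(p)
    stats = np.empty(len(U))
    for t, u in enumerate(U):
        S = (1 - lam) * S + lam * np.outer(u, u)
        # Zero when S equals the identity, positive otherwise.
        stats[t] = np.trace(S) - np.linalg.slogdet(S)[1] - p
    return stats

# Simulated example: 3 variables, noise scale doubles at t = 300.
rng = np.random.default_rng(1)
n, p, shift = 500, 3, 300
X = np.zeros((n, p))
for t in range(1, n):
    scale = 1.0 if t < shift else 2.0
    X[t] = 0.5 * X[t - 1] + scale * rng.standard_normal(p)

B = fit_var1(X[:shift])          # detrending model fitted on in-control data
resid = var1_residuals(X, B)     # residuals for the whole series
stats = ewm_cov_stat(resid, n_ref=shift - 1)
```

A random forest or neural network could replace `fit_var1` as the detrending step while leaving the monitoring statistic unchanged, which is the kind of comparison the abstract describes.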