The log‐Gaussian Cox process is a flexible and popular stochastic process for modeling point patterns exhibiting spatial and space‐time dependence. Model fitting requires approximation of stochastic integrals, which is implemented through discretization over the domain of interest. With fine‐scale discretization, inference based on Markov chain Monte Carlo is computationally burdensome because of the cost of storing and decomposing (for example, via the Cholesky factorization) the high‐dimensional covariance matrices associated with latent Gaussian variables. This article addresses these computational bottlenecks by combining two recent developments: (i) a data augmentation strategy proposed for space‐time Gaussian Cox processes that is based on exact Bayesian inference and does not require fine grid approximations for infinite‐dimensional integrals, and (ii) a recently developed family of sparsity‐inducing Gaussian processes, called nearest‐neighbor Gaussian processes, to avoid expensive matrix computations. Our inference is delivered within the fully model‐based Bayesian paradigm and does not sacrifice the richness of traditional log‐Gaussian Cox processes. We apply our method to crime event data in San Francisco and investigate the recovery of the intensity surface.
Geostatistical modeling of positive‐definite matrices: An application to diffusion tensor imaging
Abstract Geostatistical modeling for continuous point‐referenced data has been applied extensively to neuroimaging because it produces efficient and valid statistical inference. However, diffusion tensor imaging (DTI), a neuroimaging technique characterizing the brain's anatomical structure, produces a positive‐definite (p.d.) matrix for each voxel. Currently, only a few geostatistical models for p.d. matrices have been proposed because properly introducing spatial dependence among p.d. matrices is challenging. In this paper, we use the spatial Wishart process, a spatial stochastic process (random field) in which each p.d. matrix‐variate random variable marginally follows a Wishart distribution and spatial dependence between random matrices is induced by latent Gaussian processes. This process is valid on an uncountable collection of spatial locations and is almost‐surely continuous, leading to a reasonable way of modeling spatial dependence. Motivated by a DTI data set of cocaine users, we propose a spatial matrix‐variate regression model based on the spatial Wishart process. A problematic issue is that the spatial Wishart process has no closed‐form density function. Hence, we propose an approximation method to obtain a feasible Cholesky decomposition model, which we show to be asymptotically equivalent to the spatial Wishart process model. A local likelihood approximation method is also applied to achieve fast computation. The simulation studies and real data application demonstrate that the Cholesky decomposition process model produces reliable inference and improved performance compared to other methods.
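The construction described in the abstract, Wishart marginals with spatial dependence induced by latent Gaussian processes, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `spatial_wishart_sample`, the exponential correlation function, and the parameters `range_par` and `df` are assumptions chosen for illustration only.

```python
import numpy as np

def spatial_wishart_sample(coords, scale, df, range_par, rng):
    """Draw one p x p positive-definite matrix per location.

    Illustrative construction (an assumption, not the paper's code):
    W(s) = L Z(s) Z(s)^T L^T, where scale = L L^T and the p * df entries
    of Z(s) are independent unit-variance Gaussian processes over s.
    """
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    p = scale.shape[0]
    # Exponential correlation between locations (assumed for illustration).
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    corr = np.exp(-dists / range_par)
    # Small jitter keeps the Cholesky factorization numerically stable.
    chol_corr = np.linalg.cholesky(corr + 1e-10 * np.eye(n))
    chol_scale = np.linalg.cholesky(scale)
    # p * df independent latent Gaussian processes, each observed at n sites.
    z = chol_corr @ rng.standard_normal((n, p * df))
    z = z.reshape(n, p, df)
    # Broadcasting: (p, p) @ (n, p, df) -> (n, p, df).
    lz = chol_scale @ z
    # Each slice is marginally Wishart with df degrees of freedom.
    return lz @ np.swapaxes(lz, 1, 2)

rng = np.random.default_rng(0)
coords = rng.uniform(size=(5, 2))
scale = np.array([[1.0, 0.3], [0.3, 2.0]])
w = spatial_wishart_sample(coords, scale, df=4, range_par=0.5, rng=rng)
print(w.shape)
```

Nearby locations share strongly correlated latent values, so their matrices are similar; with `df` at least as large as the matrix dimension, each draw is positive definite with probability one.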
- Award ID(s): 1916208
- PAR ID: 10485006
- Publisher / Repository: Wiley
- Date Published:
- Journal Name: Biometrics
- Volume: 78
- Issue: 2
- ISSN: 0006-341X
- Page Range / eLocation ID: 548 to 559
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Abstract Spatial statistics often involves Cholesky decomposition of covariance matrices. To ensure scalability to high dimensions, several recent approximations have assumed a sparse Cholesky factor of the precision matrix. We propose a hierarchical Vecchia approximation, whose conditional-independence assumptions imply sparsity in the Cholesky factors of both the precision and the covariance matrix. This remarkable property is crucial for applications to high-dimensional spatiotemporal filtering. We present a fast and simple algorithm to compute our hierarchical Vecchia approximation, and we provide extensions to nonlinear data assimilation with non-Gaussian data based on the Laplace approximation. In several numerical comparisons, including a filtering analysis of satellite data, our methods strongly outperformed alternative approaches.
Abstract The Gaussian process (GP) is a staple in the toolkit of a spatial statistician. Well‐documented computing roadblocks in the analysis of large geospatial datasets using GPs have now largely been mitigated via several recent statistical innovations. The nearest neighbor Gaussian process (NNGP) has emerged as one of the leading candidates for such massive‐scale geospatial analysis owing to its empirical success. This article reviews the connection of NNGP to sparse Cholesky factors of the spatial precision (inverse‐covariance) matrix. The focus of the review is on these sparse Cholesky matrices, which are versatile and have recently found many diverse applications beyond the primary usage of NNGP for fast parameter estimation and prediction in spatial (generalized) linear models. In particular, we discuss applications of sparse NNGP Cholesky matrices to address multifaceted computational issues in spatial bootstrapping, simulation of large‐scale realizations of Gaussian random fields, and extensions to nonparametric mean function estimation of a GP using random forests. We also review a sparse‐Cholesky‐based model for areal (geographically aggregated) data that addresses long‐established interpretability issues of existing areal models. Finally, we highlight some yet‐to‐be‐addressed issues of such sparse Cholesky approximations that warrant further research. This article is categorized under: Algorithms and Computational Methods > Algorithms; Algorithms and Computational Methods > Numerical Methods
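The link between NNGP and a sparse Cholesky-type factor of the precision matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the reviewed article: the function names `nngp_factors` and `nngp_precision`, the exponential covariance, and `range_par` are hypothetical, and dense arrays are used for readability where a real implementation would use sparse storage.

```python
import numpy as np

def nngp_factors(coords, m, range_par=1.0):
    """Build the NNGP factors A (strictly lower triangular) and d (diagonal).

    Each location i is regressed on at most m nearest neighbors among the
    locations ordered before it. Exponential covariance is assumed.
    """
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    cov = np.exp(-dists / range_par)
    A = np.zeros((n, n))
    d = np.empty(n)
    d[0] = cov[0, 0]
    for i in range(1, n):
        # Nearest neighbors are chosen only among previously ordered points.
        nbrs = np.argsort(dists[i, :i])[:m]
        weights = np.linalg.solve(cov[np.ix_(nbrs, nbrs)], cov[nbrs, i])
        A[i, nbrs] = weights
        # Conditional variance of location i given its neighbors.
        d[i] = cov[i, i] - cov[i, nbrs] @ weights
    return A, d, cov

def nngp_precision(A, d):
    """Approximate precision matrix Q = (I - A)^T D^{-1} (I - A)."""
    B = np.eye(A.shape[0]) - A
    return B.T @ (B / d[:, None])

rng = np.random.default_rng(1)
coords = rng.uniform(size=(8, 2))
A, d, cov = nngp_factors(coords, m=3)
print(np.count_nonzero(A, axis=1))
```

Each row of `A` has at most `m` nonzero entries, which is the source of the sparsity. When `m` is large enough to include all preceding locations, the factorization reproduces the exact inverse covariance.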
Abstract We propose a non‐stationary spatial model based on a normal‐inverse‐Wishart framework, conditioning on a set of nearest neighbors. The model, called the nearest‐neighbor Gaussian process with random covariance matrices, is developed for both univariate and multivariate spatial settings and allows for fully flexible covariance structures that impose no stationarity or isotropy restrictions. In addition, the model can handle duplicate observations and missing data. We consider an approach based on integrating out the spatial random effects that allows fast inference for the model parameters. We also consider a full hierarchical approach that leverages the sparse structures induced by the model to perform fast Monte Carlo computations. Strong computational efficiency is achieved by leveraging the adaptive localized structure of the model, which allows for a high level of parallelization. We illustrate the performance of the model with univariate and bivariate simulations, as well as with observations from two stationary satellites consisting of albedo measurements.
Abstract A separable covariance model can describe the among-row and among-column correlations of a random matrix and permits likelihood-based inference with a very small sample size. However, if the assumption of separability is not met, data analysis with a separable model may misrepresent important dependence patterns in the data. As a compromise between separable and unstructured covariance estimation, we decompose a covariance matrix into a separable component and a complementary ‘core’ covariance matrix. This defines a new covariance matrix decomposition that makes use of the parsimony and interpretability of a separable covariance model, yet fully describes covariance matrices that are non-separable. The decomposition motivates a new type of shrinkage estimator, obtained by appropriately shrinking the core of the sample covariance matrix, that adapts to the degree of separability of the population covariance matrix.