- NSF-PAR ID:
- 10290270
- Editor(s):
- Scalas, Enrico
- Date Published:
- Journal Name:
- PLOS ONE
- Volume:
- 16
- Issue:
- 8
- ISSN:
- 1932-6203
- Page Range / eLocation ID:
- e0256470
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Gaussian processes (GPs) are very widely used for modeling of unknown functions or surfaces in applications ranging from regression to classification to spatial processes. Although there is an increasingly vast literature on applications, methods, theory and algorithms related to GPs, the overwhelming majority of this literature focuses on the case in which the input domain corresponds to a Euclidean space. However, particularly in recent years with the increasing collection of complex data, it is commonly the case that the input domain does not have such a simple form. For example, it is common for the inputs to be restricted to a non-Euclidean manifold, a case which forms the motivation for this article. In particular, we propose a general extrinsic framework for GP modeling on manifolds, which relies on embedding of the manifold into a Euclidean space and then constructing extrinsic kernels for GPs on their images. These extrinsic Gaussian processes (eGPs) are used as prior distributions for unknown functions in Bayesian inferences. Our approach is simple and general, and we show that the eGPs inherit fine theoretical properties from GP models in Euclidean spaces. We consider applications of our models to regression and classification problems with predictors lying in a large class of manifolds, including spheres, planar shape spaces, a space of positive definite matrices, and Grassmannians. Our models can be readily used by practitioners in biological sciences for various regression and classification problems, such as disease diagnosis or detection. Our work is also likely to have impact in spatial statistics when spatial locations are on the sphere or other geometric spaces.more » « less
-
One of the most compelling features of Gaussian process (GP) regression is its ability to provide well-calibrated posterior distributions. Recent advances in inducing point methods have sped up GP marginal likelihood and posterior mean computations, leaving posterior covariance estimation and sampling as the remaining computational bottlenecks. In this paper we address these shortcomings by using the Lanczos algorithm to rapidly approximate the predictive covariance matrix. Our approach, which we refer to as LOVE (LanczOs Variance Estimates), substantially improves time and space complexity. In our experiments, LOVE computes covariances up to 2,000 times faster and draws samples 18,000 times faster than existing methods, all without sacrificing accuracy.more » « less
-
One of the most compelling features of Gaussian process (GP) regression is its ability to provide well-calibrated posterior distributions. Recent ad- vances in inducing point methods have sped up GP marginal likelihood and posterior mean computations, leaving posterior covariance estimation and sampling as the remaining computational bottlenecks. In this paper we address these shortcom- ings by using the Lanczos algorithm to rapidly ap- proximate the predictive covariance matrix. Our approach, which we refer to as LOVE (LanczOs Variance Estimates), substantially improves time and space complexity. In our experiments, LOVE computes covariances up to 2,000 times faster and draws samples 18,000 times faster than existing methods, all without sacrificing accuracy.more » « less
-
Summary Max-stable processes have proved to be useful for the statistical modelling of spatial extremes. Several families of max-stable random fields have been proposed in the literature. One such representation is based on a limit of normalized and rescaled pointwise maxima of stationary Gaussian processes that was first introduced by Kabluchko and co-workers. This paper deals with statistical inference for max-stable space–time processes that are defined in an analogous fashion. We describe pairwise likelihood estimation, where the pairwise density of the process is used to estimate the model parameters. For regular grid observations we prove strong consistency and asymptotic normality of the parameter estimates as the joint number of spatial locations and time points tends to ∞. Furthermore, we discuss extensions to irregularly spaced locations. A simulation study shows that the method proposed works well for these models.
-
In this thesis, I present a decentralized sparse Gaussian process regression (DSGPR) model with event-triggered, adaptive inducing points. This DSGPR model brings the advantages of sparse Gaussian process regression to a decentralized implementation. Being decentralized and sparse provides advantages that are ideal for multi-agent systems (MASs) performing environmental modeling. In this case, MASs need to model large amounts of information while having potential intermittent communication connections. Additionally, the model needs to correctly perform uncertainty propagation between autonomous agents and ensure high accuracy on the prediction. For the model to meet these requirements, a bounded and efficient real-time sparse Gaussian process regression (SGPR) model is needed. I improve real-time SGPR models in these regards by introducing an adaptation of the mean shift and fixed-width clustering algorithms called radial clustering. Radial clustering enables real-time SGPR models to have an adaptive number of inducing points through an efficient inducing point selection process. I show how this clustering approach scales better than other seminal Gaussian process regression (GPR) and SGPR models for real-time purposes while attaining similar prediction accuracy and uncertainty reduction performance. Furthermore, this thesis addresses common issues inherent in decentralized frameworks such as high computation costs, inter-agent message bandwidth restrictions, and data fusion integrity. These challenges are addressed in part through performing maximum consensus between local agent models which enables the MAS to gain the advantages of decentralization while keeping data fusion integrity. The inter-agent communication restrictions are addressed through the contribution of two message passing heuristics called the covariance reduction heuristic and the Bhattacharyya distance heuristic. These heuristics enable user to reduce message passing frequency and message size through the Bhattacharyya distance and properties of spatial kernels. The entire DSGPR framework is evaluated on multiple simulated random vector fields. The results show that this framework effectively estimates vector fields using multiple autonomous agents. This vector field is assumed to be a wind field; however, this framework may be applied to the estimation of other scalar or vector fields (e.g., fluids, magnetic fields, electricity, etc.). Keywords: Sparse Gaussian process regression, clustering, event-triggered, decentralized, sensor fusion, uncertainty propagation, inducing pointsmore » « less