skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Spatial Shrinkage Via the Product Independent Gaussian Process Prior
We study the problem of sparse signal detection on a spatial domain. We propose a novel approach to model continuous signals that are sparse and piecewise-smooth as the product of independent Gaussian (PING) processes with a smooth covariance kernel. The smoothness of the PING process is ensured by the smoothness of the covariance kernels of the Gaussian components in the product, and sparsity is controlled by the number of components. The bivariate kurtosis of the PING process implies that more components in the product results in the thicker tail and sharper peak at zero. We develop an efficient computation algorithm based on spectral methods. The simulation results demonstrate superior estimation using the PING prior over Gaussian process prior for different image regressions. We apply our method to a longitudinal magnetic resonance imaging dataset to detect the regions that are affected by multiple sclerosis computation in this domain. Supplementary materials for this article are available online.  more » « less
Award ID(s):
1916208
PAR ID:
10485009
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
Taylor and Francis
Date Published:
Journal Name:
Journal of Computational and Graphical Statistics
Volume:
30
Issue:
4
ISSN:
1061-8600
Page Range / eLocation ID:
1068 to 1080
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this thesis, I present a decentralized sparse Gaussian process regression (DSGPR) model with event-triggered, adaptive inducing points. This DSGPR model brings the advantages of sparse Gaussian process regression to a decentralized implementation. Being decentralized and sparse provides advantages that are ideal for multi-agent systems (MASs) performing environmental modeling. In this case, MASs need to model large amounts of information while having potential intermittent communication connections. Additionally, the model needs to correctly perform uncertainty propagation between autonomous agents and ensure high accuracy on the prediction. For the model to meet these requirements, a bounded and efficient real-time sparse Gaussian process regression (SGPR) model is needed. I improve real-time SGPR models in these regards by introducing an adaptation of the mean shift and fixed-width clustering algorithms called radial clustering. Radial clustering enables real-time SGPR models to have an adaptive number of inducing points through an efficient inducing point selection process. I show how this clustering approach scales better than other seminal Gaussian process regression (GPR) and SGPR models for real-time purposes while attaining similar prediction accuracy and uncertainty reduction performance. Furthermore, this thesis addresses common issues inherent in decentralized frameworks such as high computation costs, inter-agent message bandwidth restrictions, and data fusion integrity. These challenges are addressed in part through performing maximum consensus between local agent models which enables the MAS to gain the advantages of decentralization while keeping data fusion integrity. The inter-agent communication restrictions are addressed through the contribution of two message passing heuristics called the covariance reduction heuristic and the Bhattacharyya distance heuristic. These heuristics enable user to reduce message passing frequency and message size through the Bhattacharyya distance and properties of spatial kernels. The entire DSGPR framework is evaluated on multiple simulated random vector fields. The results show that this framework effectively estimates vector fields using multiple autonomous agents. This vector field is assumed to be a wind field; however, this framework may be applied to the estimation of other scalar or vector fields (e.g., fluids, magnetic fields, electricity, etc.). Keywords: Sparse Gaussian process regression, clustering, event-triggered, decentralized, sensor fusion, uncertainty propagation, inducing points 
    more » « less
  2. Here we consider the problem of denoising features associated to complex data, modeled as signals on a graph, via a smoothness prior. This is motivated in part by settings such as single-cell RNA where the data is very high-dimensional, but its structure can be captured via an affinity graph. This allows us to utilize ideas from graph signal processing. In particular, we present algorithms for the cases where the signal is perturbed by Gaussian noise, dropout, and uniformly distributed noise. The signals are assumed to follow a prior distribution defined in the frequency domain which favors signals which are smooth across the edges of the graph. By pairing this prior distribution with our three models of noise generation, we propose Maximum A Posteriori (M.A.P.) estimates of the true signal in the presence of noisy data and provide algorithms for computing the M.A.P. Finally, we demonstrate the algorithms’ ability to effectively restore signals from white noise on image data and from severe dropout in single-cell RNA sequence data. 
    more » « less
  3. Summary For multivariate spatial Gaussian process models, customary specifications of cross-covariance functions do not exploit relational inter-variable graphs to ensure process-level conditional independence between the variables. This is undesirable, especially in highly multivariate settings, where popular cross-covariance functions, such as multivariate Matérn functions, suffer from a curse of dimensionality as the numbers of parameters and floating-point operations scale up in quadratic and cubic order, respectively, with the number of variables. We propose a class of multivariate graphical Gaussian processes using a general construction called stitching that crafts cross-covariance functions from graphs and ensures process-level conditional independence between variables. For the Matérn family of functions, stitching yields a multivariate Gaussian process whose univariate components are Matérn Gaussian processes, and which conforms to process-level conditional independence as specified by the graphical model. For highly multivariate settings and decomposable graphical models, stitching offers massive computational gains and parameter dimension reduction. We demonstrate the utility of the graphical Matérn Gaussian process to jointly model highly multivariate spatial data using simulation examples and an application to air-pollution modelling. 
    more » « less
  4. We consider the problem of estimating the structure of an undirected weighted sparse graph underlying a set of signals, exploiting both smoothness of the signals as well as their statistics. We augment the objective function of Kalofolias (2016) which is motivated by a signal smoothness viewpoint and imposes a Laplacian constraint, with a penalized log-likelihood objective function with a lasso constraint, motivated from a statistical viewpoint. Both of these objective functions are designed for estimation of sparse graphs. An alternating direction method of multipliers (ADMM) algorithm is presented to optimize the augmented objective function. Numerical results based on synthetic data show that the proposed approach improves upon Kalofolias (2016) in estimating the inverse covariance, and improves upon graphical lasso in estimating the graph topology. We also implement an adaptive version of the proposed algorithm following adaptive lasso of Zou (2006), and empirically show that it leads to further improvement in performance. 
    more » « less
  5. null (Ed.)
    Abstract. Satellite remote sensing provides a global view to processes on Earth that has unique benefits compared to making measurements on the ground, such as global coverage and enormous data volume. The typical downsides are spatial and temporal gaps and potentially low data quality. Meaningful statistical inference from such data requires overcoming these problems and developing efficient and robust computational tools.We design and implement a computationally efficient multi-scale Gaussian process (GP) software package, satGP, geared towards remote sensing applications. The software is able to handle problems of enormous sizes and to compute marginals and sample from the random field conditioning on at least hundreds of millions of observations. This is achieved by optimizing the computation by, e.g., randomization and splitting the problem into parallel local subproblems which aggressively discard uninformative data. We describe the mean function of the Gaussian process by approximating marginals of a Markov random field (MRF). Variability around the mean is modeled with a multi-scale covariance kernel, which consists of Matérn, exponential, and periodic components. We also demonstrate how winds can be used to inform covariances locally.The covariance kernel parameters are learned by calculating an approximate marginal maximum likelihood estimate, and the validity of both the multi-scale approach and the method used to learn the kernel parameters is verified in synthetic experiments. We apply these techniques to a moderate size ozone data set produced by an atmospheric chemistry model and to the very large number of observations retrieved from the Orbiting Carbon Observatory 2 (OCO-2) satellite. The satGP software is released under an open-source license. 
    more » « less