
Title: Approximating Likelihoods for Large Spatial Data Sets

Likelihood methods are often difficult to use with large, irregularly sited spatial data sets, owing to the computational burden. Even for Gaussian models, exact calculations of the likelihood for n observations require O(n^3) operations. Since any joint density can be written as a product of conditional densities based on some ordering of the observations, one way to lessen the computations is to condition on only some of the ‘past’ observations when computing the conditional densities. We show how this approach can be adapted to approximate the restricted likelihood and we demonstrate how an estimating equations approach allows us to judge the efficacy of the resulting approximation. Previous work has suggested conditioning on those past observations that are closest to the observation whose conditional density we are approximating. Through theoretical, numerical and practical examples, we show that there can often be considerable benefit in conditioning on some distant observations as well.
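
The conditioning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the exponential covariance, its parameters, the simulated sites, and the function names (exp_cov, exact_loglik, conditional_loglik) are all assumptions made for illustration. The sketch conditions only on the m nearest earlier observations, which is the simpler scheme the abstract attributes to previous work, and it approximates the ordinary Gaussian likelihood rather than the restricted likelihood.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, range_=0.3):
    """Exponential covariance between two sets of 2-D sites (assumed model)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-d / range_)

def exact_loglik(z, sites):
    """Exact zero-mean Gaussian log-likelihood via Cholesky, O(n^3)."""
    n = len(z)
    chol = np.linalg.cholesky(exp_cov(sites, sites))
    alpha = np.linalg.solve(chol, z)
    logdet = 2.0 * np.log(np.diag(chol)).sum()
    return -0.5 * (n * np.log(2 * np.pi) + logdet + alpha @ alpha)

def conditional_loglik(z, sites, m):
    """Sum of log conditional densities in the given ordering, each
    conditioning on at most m nearest earlier ('past') observations."""
    total = 0.0
    for i in range(len(z)):
        var = exp_cov(sites[i:i + 1], sites[i:i + 1])[0, 0]
        mean = 0.0
        if i > 0 and m > 0:
            dist = np.linalg.norm(sites[:i] - sites[i], axis=1)
            nb = np.argsort(dist)[:m]            # indices of nearest past sites
            K = exp_cov(sites[nb], sites[nb])    # covariance among conditioning set
            k = exp_cov(sites[nb], sites[i:i + 1])[:, 0]
            w = np.linalg.solve(K, k)            # simple kriging weights
            mean = w @ z[nb]
            var = var - w @ k
        total += -0.5 * (np.log(2 * np.pi * var) + (z[i] - mean) ** 2 / var)
    return total

# Simulated data at irregular sites (purely illustrative).
rng = np.random.default_rng(0)
n = 50
sites = rng.random((n, 2))
z = np.linalg.cholesky(exp_cov(sites, sites)) @ rng.standard_normal(n)

full = exact_loglik(z, sites)
approx = conditional_loglik(z, sites, m=10)
all_past = conditional_loglik(z, sites, m=n)  # conditioning on every past point
```

When m is at least n - 1, every conditional density uses the full past, so the sum reproduces the exact log-likelihood up to rounding. Smaller m trades accuracy for cost: each term then needs only an m-by-m solve. Adding some distant past observations to each conditioning set, as the abstract recommends, would mean changing how nb is chosen.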

Publisher / Repository:
Oxford University Press
Journal Name:
Journal of the Royal Statistical Society Series B: Statistical Methodology
Page Range / eLocation ID:
p. 275-296
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Modeling distributions of covariates, or density estimation, is a core challenge in unsupervised learning. However, the majority of work only considers the joint distribution, which has limited utility in practical situations. A more general and useful problem is arbitrary conditional density estimation, which aims to model any possible conditional distribution over a set of covariates, reflecting the more realistic setting of inference based on prior knowledge. We propose a novel method, Arbitrary Conditioning with Energy (ACE), that can simultaneously estimate the distribution p(x_u | x_o) for all possible subsets of unobserved features x_u and observed features x_o. ACE is designed to avoid unnecessary bias and complexity — we specify densities with a highly expressive energy function and reduce the problem to only learning one-dimensional conditionals (from which more complex distributions can be recovered during inference). This results in an approach that is both simpler and higher-performing than prior methods. We show that ACE achieves state-of-the-art for arbitrary conditional likelihood estimation and data imputation on standard benchmarks. 
  2. This paper investigates identification in binary response models with panel data. Conditioning on sufficient statistics can sometimes lead to a conditional maximum likelihood approach that can be used to identify and estimate the parameters of interest in such models. Unfortunately it is often difficult or impossible to find such sufficient statistics, and even if it is possible, the approach sometimes leads to conditional likelihoods that do not depend on some interesting parameters. Using a range of different data generating processes, this paper calculates the identified regions for parameters in panel data logit AR(2) and logit VAR(1) models for which it is not known whether the parameters are identified or not. We find that identification might be more common than was previously thought, and that the identified regions for non-identified objects may be small enough to be empirically useful.
  3. Summary

    The inferential model (IM) framework provides valid prior-free probabilistic inference by focusing on predicting unobserved auxiliary variables. But, efficient IM-based inference can be challenging when the auxiliary variable is of higher dimension than the parameter. Here we show that features of the auxiliary variable are often fully observed and, in such cases, a simultaneous dimension reduction and information aggregation can be achieved by conditioning. This proposed conditioning strategy leads to efficient IM inference and casts new light on Fisher's notions of sufficiency, conditioning and also Bayesian inference. A differential-equation-driven selection of a conditional association is developed, and validity of the conditional IM is proved under some conditions. For problems that do not admit a conditional IM of the standard form, we propose a more flexible class of conditional IMs based on localization. Examples of local conditional IMs in a bivariate normal model and a normal variance components model are also given.

  4. Many two-level nested simulation applications involve the conditional expectation of some response variable, where the expected response is the quantity of interest, and the expectation is with respect to the inner-level random variables, conditioned on the outer-level random variables. The latter typically represent random risk factors, and risk can be quantified by estimating the probability density function (pdf) or cumulative distribution function (cdf) of the conditional expectation. Much prior work has considered a naïve estimator that uses the empirical distribution of the sample averages across the inner-level replicates. This results in a biased estimator, because the distribution of the sample averages is over-dispersed relative to the distribution of the conditional expectation when the number of inner-level replicates is finite. Whereas most prior work has focused on allocating the numbers of outer- and inner-level replicates to balance the bias/variance tradeoff, we develop a bias-corrected pdf estimator. Our approach is based on the concept of density deconvolution, which is widely used to estimate densities with noisy observations but has not previously been considered for nested simulation problems. For a fixed computational budget, the bias-corrected deconvolution estimator allows more outer-level and fewer inner-level replicates to be used, which substantially improves the efficiency of the nested simulation. 
  5. Abstract

    Cannibalism, once viewed as a rare or aberrant behavior, is now recognized to be widespread and to contribute broadly to the self‐regulation of many populations. Cannibalism can produce endogenous negative feedback on population growth because it is expressed as a conditional behavior, responding to the deteriorating ecological conditions that flow, directly or indirectly, from increasing densities of conspecifics. Thus, cannibalism emerges as a strongly density‐dependent source of mortality. In this synthesis, we review recent research that has revealed a rich diversity of pathways through which rising density elicits increased cannibalism, including both factors that (a) elevate the rate of dangerous encounters between conspecifics and (b) enhance the likelihood that such encounters will lead to successful cannibalistic attacks. These pathways include both features of the autecology of cannibal populations and features of interactions with other species, including food resources and pathogens. Using mathematical models, we explore the consequences of including density‐dependent cannibal attack rates on population dynamics. The conditional expression of cannibalism generally enhances stability and population regulation in single‐species models but also may increase opportunities for alternative states and prey population escape from control by cannibalistic predators.
