skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, May 23 until 2:00 AM ET on Friday, May 24 due to maintenance. We apologize for the inconvenience.


Title: An Invitation to Sequential Monte Carlo Samplers
Statisticians often use Monte Carlo methods to approximate probability distributions, primarily with Markov chain Monte Carlo and importance sampling. Sequential Monte Carlo samplers are a class of algorithms that combine both techniques to approximate distributions of interest and their normalizing constants. These samplers originate from particle filtering for state space models and have become general and scalable sampling techniques. This article describes sequential Monte Carlo samplers and their possible implementations, arguing that they remain under-used in statistics, despite their ability to perform sequential inference and to leverage parallel processing resources among other potential benefits. Supplementary materials for this article are available online.  more » « less
Award ID(s):
1844695 1712872
NSF-PAR ID:
10339765
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Journal of the American Statistical Association
ISSN:
0162-1459
Page Range / eLocation ID:
1 to 13
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Marc'Aurelio Ranzato, Alina Beygelzimer (Ed.)
    Implementations of the exponential mechanism in differential privacy often require sampling from intractable distributions. When approximate procedures like Markov chain Monte Carlo (MCMC) are used, the end result incurs costs to both privacy and accuracy. Existing work has examined these effects asymptotically, but implementable finite sample results are needed in practice so that users can specify privacy budgets in advance and implement samplers with exact privacy guarantees. In this paper, we use tools from ergodic theory and perfect simulation to design exact finite runtime sampling algorithms for the exponential mechanism by introducing an intermediate modified target distribution using artificial atoms. We propose an additional modification of this sampling algorithm that maintains its ǫ-DP guarantee and has improved runtime at the cost of some utility. We then compare these methods in scenarios where we can explicitly calculate a δ cost (as in (ǫ, δ)-DP) incurred when using standard MCMC techniques. Much as there is a well known trade-off between privacy and utility, we demonstrate that there is also a trade-off between privacy guarantees and runtime. 
    more » « less
  2. ABSTRACT

    We introduce a novel approach to boost the efficiency of the importance nested sampling (INS) technique for Bayesian posterior and evidence estimation using deep learning. Unlike rejection-based sampling methods such as vanilla nested sampling (NS) or Markov chain Monte Carlo (MCMC) algorithms, importance sampling techniques can use all likelihood evaluations for posterior and evidence estimation. However, for efficient importance sampling, one needs proposal distributions that closely mimic the posterior distributions. We show how to combine INS with deep learning via neural network regression to accomplish this task. We also introduce nautilus, a reference open-source python implementation of this technique for Bayesian posterior and evidence estimation. We compare nautilus against popular NS and MCMC packages, including emcee, dynesty, ultranest, and pocomc, on a variety of challenging synthetic problems and real-world applications in exoplanet detection, galaxy SED fitting and cosmology. In all applications, the sampling efficiency of nautilus is substantially higher than that of all other samplers, often by more than an order of magnitude. Simultaneously, nautilus delivers highly accurate results and needs fewer likelihood evaluations than all other samplers tested. We also show that nautilus has good scaling with the dimensionality of the likelihood and is easily parallelizable to many CPUs.

     
    more » « less
  3. null (Ed.)
    Markov Chain Monte Carlo (MCMC) methods sample from unnormalized probability distributions and offer guarantees of exact sampling. However, in the continuous case, unfavorable geometry of the target distribution can greatly limit the efficiency of MCMC methods. Augmenting samplers with neural networks can potentially improve their efficiency. Previous neural network-based samplers were trained with objectives that either did not explicitly encourage exploration, or contained a term that encouraged exploration but only for well structured distributions. Here we propose to maximize proposal entropy for adapting the proposal to distributions of any shape. To optimize proposal entropy directly, we devised a neural network MCMC sampler that has a flexible and tractable proposal distribution. Specifically, our network architecture utilizes the gradient of the target distribution for generating proposals. Our model achieved significantly higher efficiency than previous neural network MCMC techniques in a variety of sampling tasks, sometimes by more than an order magnitude. Further, the sampler was demonstrated through the training of a convergent energy-based model of natural images. The adaptive sampler achieved unbiased sampling with significantly higher proposal entropy than a Langevin dynamics sample. The trained sampler also achieved better sample quality. 
    more » « less
  4. Markov chain Monte Carlo (MCMC) methods generate samples that are asymptotically distributed from a target distribution of interest as the number of iterations goes to infinity. Various theoretical results provide upper bounds on the distance between the target and marginal distribution after a fixed number of iterations. These upper bounds are on a case by case basis and typically involve intractable quantities, which limits their use for practitioners. We introduce L-lag couplings to generate computable, non-asymptotic upper bound estimates for the total variation or the Wasserstein distance of general Markov chains. We apply L-lag couplings to the tasks of (i) determining MCMC burn-in, (ii) comparing different MCMC algorithms with the same target, and (iii) comparing exact and approximate MCMC. Lastly, we (iv) assess the bias of sequential Monte Carlo and self-normalized importance samplers. 
    more » « less
  5. We investigate approximate Bayesian inference techniques for nonlinear systems described by ordinary differential equation (ODE) models. In particular, the approximations will be based on set-valued reachability analysis approaches, yielding approximate models for the posterior distribution. Nonlinear ODEs are widely used to mathematically describe physical and biological models. However, these models are often described by parameters that are not directly measurable and have an impact on the system behaviors. Often, noisy measurement data combined with physical/biological intuition serve as the means for finding appropriate values of these parameters.Our approach operates under a Bayesian framework, given prior distribution over the parameter space and noisy observations under a known sampling distribution. We explore subsets of the space of model parameters, computing bounds on the likelihood for each subset. This is performed using nonlinear set-valued reachability analysis that is made faster by means of linearization around a reference trajectory. The tiling of the parameter space can be adaptively refined to make bounds on the likelihood tighter. We evaluate our approach on a variety of nonlinear benchmarks and compare our results with Markov Chain Monte Carlo and Sequential Monte Carlo approaches.

     
    more » « less