skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Stratification and optimal resampling for sequential Monte Carlo
Summary Sequential Monte Carlo algorithms are widely accepted as powerful computational tools for making inference with dynamical systems. A key step in sequential Monte Carlo is resampling, which plays the role of steering the algorithm towards the future dynamics. Several strategies have been used in practice, including multinomial resampling, residual resampling, optimal resampling, stratified resampling and optimal transport resampling. In one-dimensional cases, we show that optimal transport resampling is equivalent to stratified resampling on the sorted particles, and both strategies minimize the resampling variance as well as the expected squared energy distance between the original and resampled empirical distributions. For general $$d$$-dimensional cases, we show that if the particles are first sorted using the Hilbert curve, the variance of stratified resampling is $$O(m^{-(1+2/d)})$$, an improvement over the best previously known rate of $$O(m^{-(1+1/d)})$$, where $$m$$ is the number of resampled particles. We show that this improved rate is optimal for ordered stratified resampling schemes, as conjectured in Gerber et al. (2019). We also present an almost-sure bound on the Wasserstein distance between the original and Hilbert-curve-resampled empirical distributions. In light of these results, we show that for dimension $d>1$ the mean square error of sequential quasi-Monte Carlo with $$n$$ particles can be $$O(n^{-1-4/\{d(d+4)\}})$$ if Hilbert curve resampling is used and a specific low-discrepancy set is chosen. To our knowledge, this is the first known convergence rate lower than $$o(n^{-1})$$.  more » « less
Award ID(s):
1903139 2015411
PAR ID:
10346735
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Biometrika
Volume:
109
Issue:
1
ISSN:
0006-3444
Page Range / eLocation ID:
181 to 194
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In a Monte Carlo test, the observed dataset is fixed, and several resampled or permuted versions of the dataset are generated in order to test a null hypothesis that the original dataset is exchangeable with the resampled/permuted ones. Sequential Monte Carlo tests aim to save computational resources by generating these additional datasets sequentially one by one and potentially stopping early. While earlier tests yield valid inference at a particular prespecified stopping rule, our work develops a new anytime-valid Monte Carlo test that can be continuously monitored, yielding a p-value or e-value at any stopping time possibly not specified in advance. It generalizes the well-known method by Besag and Clifford, allowing it to stop at any time, but also encompasses new sequential Monte Carlo tests that tend to stop sooner under the null and alternative without compromising power. The core technical advance is the development of new test martingales for testing exchangeability against a very particular alternative based on a testing by betting technique. The proposed betting strategies are guided by the derivation of a simple log-optimal betting strategy, have closed-form expressions for the wealth process, provable guarantees on resampling risk, and display excellent power in practice. 
    more » « less
  2. We propose a fast stochastic Hamilton Monte Carlo (HMC) method, for sampling from a smooth and strongly log-concave distribution. At the core of our proposed method is a variance reduction technique inspired by the recent advance in stochastic optimization. We show that, to achieve $$\epsilon$$ accuracy in 2-Wasserstein distance, our algorithm achieves $$\tilde O\big(n+\kappa^{2}d^{1/2}/\epsilon+\kappa^{4/3}d^{1/3}n^{2/3}/\epsilon^{2/3}%\wedge\frac{\kappa^2L^{-2}d\sigma^2}{\epsilon^2} \big)$$ gradient complexity (i.e., number of component gradient evaluations), which outperforms the state-of-the-art HMC and stochastic gradient HMC methods in a wide regime. We also extend our algorithm for sampling from smooth and general log-concave distributions, and prove the corresponding gradient complexity as well. Experiments on both synthetic and real data demonstrate the superior performance of our algorithm. 
    more » « less
  3. We study the convergence rate of discretized Riemannian Hamiltonian Monte Carlo on sampling from distributions in the form of e^{−f(x)} on a convex body M ⊂ R^n. We show that for distributions in the form of e−^{a x} on a polytope with m constraints, the convergence rate of a family of commonly-used integrators is independent of ∥a∥_2 and the geometry of the polytope. In particular, the implicit midpoint method (IMM) and the generalized Leapfrog method (LM) have a mixing time of mn^3 to achieve ϵ total variation distance to the target distribution. These guarantees are based on a general bound on the convergence rate for densities of the form e^{−f(x)} in terms of parameters of the manifold and the integrator. Our theoretical guarantee complements the empirical results of our old result, which shows that RHMC with IMM can sample ill-conditioned, non-smooth and constrained distributions in very high dimension efficiently in practice. 
    more » « less
  4. Partially observable Markov decision processes (POMDPs) are a general mathematical model for sequential decision-making in stochastic environments under state uncertainty. POMDPs are often solved online, which enables the algorithm to adapt to new information in real time. Online solvers typically use bootstrap particle filters based on importance resampling for updating the belief distribution. Since directly sampling from the ideal state distribution given the latest observation and previous state is infeasible, particle filters approximate the posterior belief distribution by propagating states and adjusting weights through prediction and resampling steps. However, in practice, the importance resampling technique often leads to particle degeneracy and sample impoverishment when the state transition model poorly aligns with the posterior belief distribution, especially when the received observation is noisy. We propose an approach that constructs a sequence of bridge distributions between the state-transition and optimal distributions through iterative Monte Carlo steps, better accommodating noisy observations in online POMDP solvers. Our algorithm demonstrates significantly superior performance compared to state-of-the-art methods when evaluated across multiple challenging POMDP domains. 
    more » « less
  5. We consider the problem of finding an approximate solution to ℓ1 regression while only observing a small number of labels. Given an n×d unlabeled data matrix X, we must choose a small set of m≪n rows to observe the labels of, then output an estimate βˆ whose error on the original problem is within a 1+ε factor of optimal. We show that sampling from X according to its Lewis weights and outputting the empirical minimizer succeeds with probability 1−δ for m>O(1ε2dlogdεδ). This is analogous to the performance of sampling according to leverage scores for ℓ2 regression, but with exponentially better dependence on δ. We also give a corresponding lower bound of Ω(dε2+(d+1ε2)log1δ). 
    more » « less