Title: Geometry-informed irreversible perturbations for accelerated convergence of Langevin dynamics
Abstract: We introduce a novel geometry-informed irreversible perturbation that accelerates convergence of the Langevin algorithm for Bayesian computation. It is well documented that there exist perturbations to the Langevin dynamics that preserve its invariant measure while accelerating its convergence. Irreversible perturbations and reversible perturbations (such as Riemannian manifold Langevin dynamics (RMLD)) have separately been shown to improve the performance of Langevin samplers. We consider these two perturbations simultaneously by presenting a novel form of irreversible perturbation for RMLD that is informed by the underlying geometry. Through numerical examples, we show that this new irreversible perturbation can improve estimation performance over irreversible perturbations that do not take the geometry into account. Moreover, we demonstrate that irreversible perturbations generally can be implemented in conjunction with the stochastic gradient version of the Langevin algorithm. Lastly, while continuous-time irreversible perturbations cannot impair the performance of a Langevin estimator, the situation can sometimes be more complicated when discretization is considered. To this end, we describe a discrete-time example in which irreversibility increases both the bias and variance of the resulting estimator.
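To make the notion of an irreversible perturbation concrete, here is a minimal numerical sketch, assuming the standard construction in which the drift is augmented by delta * J @ grad_U(x) for an antisymmetric matrix J. This illustrates the general idea only; it is not the geometry-informed perturbation introduced in the paper, and all parameter values are illustrative.

    import numpy as np

    # Minimal sketch (illustrative, not the paper's geometry-informed
    # construction): overdamped Langevin dynamics for pi(x) ~ exp(-U(x)),
    # with irreversible drift delta * J @ grad_U(x). For antisymmetric J
    # (J^T = -J), the extra drift leaves exp(-U) invariant, since
    # div(J grad_U e^{-U}) = 0, but breaks reversibility.

    rng = np.random.default_rng(0)

    # Target: zero-mean Gaussian with covariance Sigma, U(x) = x^T Sigma^{-1} x / 2.
    Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
    Sigma_inv = np.linalg.inv(Sigma)
    grad_U = lambda x: Sigma_inv @ x

    J = np.array([[0.0, 1.0], [-1.0, 0.0]])  # antisymmetric perturbation matrix
    delta = 1.0                              # delta = 0 recovers reversible Langevin

    def euler_maruyama(n_steps=50_000, h=1e-2):
        x = np.zeros(2)
        samples = np.empty((n_steps, 2))
        for k in range(n_steps):
            g = grad_U(x)
            x = x + h * (-g + delta * (J @ g)) + np.sqrt(2.0 * h) * rng.standard_normal(2)
            samples[k] = x
        return samples

    print("sample covariance:\n", np.cov(euler_maruyama().T))  # ~ Sigma

In continuous time, such a perturbation can only reduce the asymptotic variance of ergodic averages; as the abstract notes, the discretized picture is subtler.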
Chatterji, Niladri; Diakonikolas, Jelena; Jordan, Michael I.; Bartlett, Peter L. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics (Chiappa, Silvia; Calandra, Roberto, Eds.).
Langevin Monte Carlo (LMC) is an iterative algorithm used to generate samples from a distribution that is known only up to a normalizing constant. The nonasymptotic dependence of its mixing time on the dimension and target accuracy is understood mainly in the setting of smooth (gradient-Lipschitz) log-densities, a serious limitation for applications in machine learning. In this paper, we remove this limitation, providing polynomial-time convergence guarantees for a variant of LMC in the setting of nonsmooth log-concave distributions. At a high level, our results follow by leveraging the implicit smoothing of the log-density that comes from a small Gaussian perturbation that we add to the iterates of the algorithm and controlling the bias and variance that are induced by this perturbation.
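As a rough illustration of the smoothing idea (an assumption about the mechanism, not the authors' exact algorithm): querying a subgradient at a Gaussian-perturbed point is, in expectation, a gradient of the smoothed potential $U_\sigma(x) = \mathbb{E}[U(x + \sigma\xi)]$, which is gradient-Lipschitz even when $U$ is not.

    import numpy as np

    # Hedged sketch: LMC for the nonsmooth potential U(x) = |x| (a Laplace
    # target), with the subgradient evaluated at a Gaussian-perturbed point.
    # In expectation this queries the gradient of the smoothed potential
    # U_sigma, which is smooth even though U is not differentiable at 0.

    rng = np.random.default_rng(1)
    sub_grad_U = lambda x: np.sign(x)  # subgradient of |x|

    def smoothed_lmc(n_steps=200_000, h=1e-2, sigma=0.1):
        x, out = 0.0, np.empty(n_steps)
        for k in range(n_steps):
            g = sub_grad_U(x + sigma * rng.standard_normal())  # perturbed query
            x = x - h * g + np.sqrt(2.0 * h) * rng.standard_normal()
            out[k] = x
        return out

    print("E|X| estimate:", np.abs(smoothed_lmc()).mean())  # exact value is 1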
Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is a variant of stochastic gradient descent with momentum in which controlled, properly scaled Gaussian noise is added to the stochastic gradients to steer the iterates toward a global minimum. Many works report its empirical success for solving stochastic nonconvex optimization problems; in particular, it has been observed to outperform overdamped Langevin Monte Carlo–based methods, such as stochastic gradient Langevin dynamics (SGLD), in many applications. Although the asymptotic global convergence properties of SGHMC are well known, its finite-time performance is not well understood. In this work, we study two variants of SGHMC based on two alternative discretizations of the underdamped Langevin diffusion. We provide finite-time performance bounds for the global convergence of both SGHMC variants for solving stochastic nonconvex optimization problems with explicit constants. Our results lead to nonasymptotic guarantees for both population and empirical risk minimization problems. For a fixed target accuracy level on a class of nonconvex problems, we obtain complexity bounds for SGHMC that can be tighter than those available for SGLD.
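For orientation, here is a generic Euler-type discretization of the underdamped Langevin diffusion with a noisy gradient; the noise model and step sizes are illustrative assumptions, and the two specific discretizations analyzed in the paper are not reproduced here.

    import numpy as np

    # Generic SGHMC-style sketch (illustrative assumptions, not the paper's
    # specific discretizations) of the underdamped Langevin diffusion
    #   dX_t = V_t dt,  dV_t = (-gamma V_t - grad_U(X_t)) dt + sqrt(2 gamma) dW_t,
    # with a noisy gradient standing in for a minibatch estimate.

    rng = np.random.default_rng(2)
    grad_U = lambda x: x  # U(x) = x^2 / 2, standard Gaussian target
    noisy_grad = lambda x: grad_U(x) + 0.5 * rng.standard_normal()  # minibatch proxy

    def sghmc(n_steps=100_000, h=1e-2, gamma=2.0):
        x, v, out = 0.0, 0.0, np.empty(n_steps)
        for k in range(n_steps):
            v += -h * (gamma * v + noisy_grad(x)) + np.sqrt(2.0 * gamma * h) * rng.standard_normal()
            x += h * v
            out[k] = x
        return out

    print("variance estimate:", sghmc().var())  # target variance is 1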
Stochastic Gradient Langevin Dynamics (SGLD) has been widely used for Bayesian sampling from probability distributions, incorporating derivatives of the log-posterior. Given evaluations of the log-posterior's derivatives, SGLD methods generate samples by acting as a thermostat-like dynamics that traverses the gradient flow of the log-posterior under a controllable perturbation. Even when the density is not known, existing methods can first learn kernel density models from the given datasets and then produce new samples by running SGLD over the kernel density derivatives. In this work, instead of drawing new samples from kernel spaces, a novel SGLD sampler, namely Randomized Measurement Langevin Dynamics (RMLD), is proposed to sample high-dimensional sparse representations from the spectral domain of a given dataset. Specifically, given a random measurement matrix for sparse coding, RMLD first derives a likelihood evaluator of the probability distribution from the LASSO loss function, then samples from the high-dimensional distribution using stochastic Langevin dynamics with derivatives of the log-likelihood and Metropolis–Hastings sampling. In addition, new samples in low-dimensional measurement spaces can be regenerated using the sampled high-dimensional vectors and the measurement matrix. Our analysis shows that RMLD in effect projects a given dataset onto a high-dimensional Gaussian distribution with a Laplacian prior, then draws new sparse representations from the dataset by performing SGLD over that distribution. Extensive experiments on real-world datasets evaluate the proposed algorithm; performance comparisons on three real-world applications demonstrate its superior performance over baseline methods.
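A minimal sketch of the kind of sampler the abstract describes, under stated assumptions: the LASSO loss induces the log-density $\log \pi(z) \propto -\|y - \Phi z\|^2/(2\sigma^2) - \lambda \|z\|_1$, and unadjusted Langevin steps (the Metropolis–Hastings correction is omitted here) use its subgradient. All names and parameter values below are hypothetical.

    import numpy as np

    # Hedged sketch of a Langevin sampler on a LASSO-induced log-density:
    # pi(z) ~ exp(-||y - Phi z||^2 / (2 s2) - lam * ||z||_1), i.e. a Gaussian
    # likelihood under a random measurement matrix Phi with a Laplacian prior.

    rng = np.random.default_rng(3)
    n, d = 20, 50                                    # measurements, code dimension
    Phi = rng.standard_normal((n, d)) / np.sqrt(n)   # random measurement matrix
    z_true = np.zeros(d); z_true[:3] = [2.0, -1.5, 1.0]
    y = Phi @ z_true + 0.05 * rng.standard_normal(n)

    s2, lam = 0.05**2, 5.0

    def neg_log_post_grad(z):
        # Subgradient of the negative log-posterior (l1 term is nonsmooth at 0).
        return Phi.T @ (Phi @ z - y) / s2 + lam * np.sign(z)

    def langevin_lasso(n_steps=20_000, h=1e-5):
        z = np.zeros(d)
        for _ in range(n_steps):
            z = z - h * neg_log_post_grad(z) + np.sqrt(2.0 * h) * rng.standard_normal(d)
        return z

    z_draw = langevin_lasso()
    print("largest |z| coordinates:", np.argsort(-np.abs(z_draw))[:5])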
The technique of modifying the geometry of a problem from the Euclidean to a Hessian metric has proved quite effective in optimization and has been the subject of study for sampling. The Mirror Langevin Diffusion (MLD) is a sampling analogue of mirror flow in continuous time, and it has nice convergence properties under log-Sobolev or Poincaré inequalities relative to the Hessian metric. In discrete time, a simple discretization of MLD is the Mirror Langevin Algorithm (MLA), which was previously shown to have a biased convergence guarantee with a bias term that does not vanish as the step size goes to zero. This raised the question of whether a better analysis or a better discretization is needed to achieve a vanishing bias. Here we study the Mirror Langevin Algorithm and show that it does in fact have a vanishing bias. We apply mean-square analysis to establish a mixing-time bound for MLA under the modified self-concordance condition.
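For reference, one common form of the MLA update in the literature (the paper's exact discretization may differ) is, given a strictly convex mirror map $\phi$ with convex conjugate $\phi^*$ and target density proportional to $e^{-f}$:

\[
x_{k+1} = \nabla\phi^*\!\Big(\nabla\phi(x_k) - h\,\nabla f(x_k) + \sqrt{2h}\,\big[\nabla^2\phi(x_k)\big]^{1/2}\xi_k\Big),
\qquad \xi_k \sim \mathcal{N}(0, I),
\]

which reduces to the unadjusted Langevin algorithm when $\phi(x) = \|x\|^2/2$.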
Cui, Tiangang; Dong, Jing; Jasra, Ajay; Tong, Xin T. Advances in Applied Probability.
Abstract: When implementing Markov chain Monte Carlo (MCMC) algorithms, perturbation caused by numerical errors is sometimes inevitable. This paper studies how perturbation of MCMC affects convergence speed and approximation accuracy. Our results show that when the original Markov chain converges to stationarity fast enough and the perturbed transition kernel is a good approximation to the original transition kernel, the corresponding perturbed sampler also has fast convergence speed and high approximation accuracy. Our convergence analysis is conducted under either the Wasserstein metric or the $\chi^2$ metric, both of which are widely used in the literature. The results can be extended to obtain non-asymptotic error bounds for MCMC estimators. We demonstrate how to apply our convergence and approximation results to the analysis of specific sampling algorithms, including random-walk Metropolis, the Metropolis-adjusted Langevin algorithm with perturbed target densities, and parallel tempering Monte Carlo with perturbed densities. Finally, we present some simple numerical examples to verify our theoretical claims.
Zhang, Benjamin J., Marzouk, Youssef M., and Spiliopoulos, Konstantinos. "Geometry-informed irreversible perturbations for accelerated convergence of Langevin dynamics." Statistics and Computing 32(5), 2022. Springer Science + Business Media. https://doi.org/10.1007/s11222-022-10147-6. https://par.nsf.gov/biblio/10372163.
@article{osti_10372163,
  title = {Geometry-informed irreversible perturbations for accelerated convergence of Langevin dynamics},
  author = {Zhang, Benjamin J. and Marzouk, Youssef M. and Spiliopoulos, Konstantinos},
  journal = {Statistics and Computing},
  volume = {32},
  number = {5},
  year = {2022},
  publisher = {Springer Science + Business Media},
  DOI = {10.1007/s11222-022-10147-6},
  url = {https://par.nsf.gov/biblio/10372163},
}