We propose a fast stochastic Hamilton Monte Carlo (HMC) method for sampling from a smooth and strongly log-concave distribution. At the core of our proposed method is a variance reduction technique inspired by recent advances in stochastic optimization. We show that, to achieve $\epsilon$ accuracy in 2-Wasserstein distance, our algorithm achieves $\tilde O\big(n+\kappa^{2}d^{1/2}/\epsilon+\kappa^{4/3}d^{1/3}n^{2/3}/\epsilon^{2/3}\wedge\frac{\kappa^2L^{2}d\sigma^2}{\epsilon^2}\big)$ gradient complexity (i.e., number of component gradient evaluations), which outperforms the state-of-the-art HMC and stochastic gradient HMC methods in a wide regime. We also extend our algorithm to sampling from smooth and general log-concave distributions, and prove the corresponding gradient complexity as well. Experiments on more »
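The variance reduction technique referenced here is in the spirit of SVRG-style estimators from stochastic optimization: correct a mini-batch gradient with a full gradient precomputed at a periodic snapshot. A minimal sketch of such an estimator follows; the function name, arguments, and the quadratic example are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def svrg_gradient(theta, snapshot, grad_bar, grad_i, n, batch_size, rng):
    """SVRG-style variance-reduced gradient estimate:
    average of [grad_i(theta) - grad_i(snapshot)] over a random mini-batch,
    plus the full gradient grad_bar precomputed at the snapshot point."""
    idx = rng.choice(n, size=batch_size, replace=False)
    correction = np.mean([grad_i(theta, i) - grad_i(snapshot, i) for i in idx],
                         axis=0)
    return correction + grad_bar
```

The estimator is unbiased, and when `theta` equals the snapshot it returns the full gradient exactly; near the snapshot its variance shrinks, which is what drives the improved gradient complexity of variance-reduced samplers.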
This content will become publicly available on July 7, 2022
Algorithm and Hardware Co-Design for FPGA Acceleration of Hamiltonian Monte Carlo Based No-U-Turn Sampler
Monte Carlo (MC) methods are widely used in many research areas such as physical simulation, statistical analysis, and machine learning. Application of MC methods requires drawing fast-mixing samples from a given probability distribution. Among existing sampling methods, Hamiltonian Monte Carlo (HMC) utilizes gradient information during Hamiltonian simulation and can produce fast-mixing samples at the highest efficiency. However, without carefully chosen simulation parameters for a specific problem, HMC generally suffers from simulation locality and wasted computation. As a result, the No-U-Turn Sampler (NUTS) has been proposed to automatically tune these parameters during simulation and is the current state-of-the-art sampling algorithm. However, applying NUTS requires frequent gradient calculation of a given distribution and high-volume vector processing, especially for large-scale problems, making it expensive to draw a large number of samples and motivating hardware acceleration. While some hardware acceleration works have been proposed for traditional Markov Chain Monte Carlo (MCMC) and HMC methods, no existing work targets hardware acceleration of the NUTS algorithm. In this paper, we present the first NUTS accelerator on FPGA while addressing the high complexity of this state-of-the-art algorithm. Our hardware and algorithm co-optimizations include an incremental resampling technique which leads to more »
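The Hamiltonian simulation that both HMC and NUTS repeatedly perform, and that dominates the accelerator's workload, is typically the leapfrog integrator. A minimal sketch (illustrative only, not the accelerator's implementation) is:

```python
import numpy as np

def leapfrog(theta, r, grad_log_p, step_size, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator.
    theta: position (sample), r: momentum, grad_log_p: gradient of the
    log target density. Returns the trajectory endpoint."""
    theta, r = theta.copy(), r.copy()
    r += 0.5 * step_size * grad_log_p(theta)      # initial half step for momentum
    for _ in range(n_steps - 1):
        theta += step_size * r                    # full step for position
        r += step_size * grad_log_p(theta)        # full step for momentum
    theta += step_size * r                        # final full position step
    r += 0.5 * step_size * grad_log_p(theta)      # final half step for momentum
    return theta, r
```

Each leapfrog step costs one gradient evaluation plus vector updates, which is exactly the frequent-gradient, high-volume vector workload the abstract identifies as the acceleration target; NUTS adds a tree-building rule on top to choose `n_steps` adaptively.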
 Award ID(s):
 1741173
 Publication Date:
 NSFPAR ID:
 10250144
 Journal Name:
 The 32nd IEEE International Conference on Application-specific Systems, Architectures and Processors
 Sponsoring Org:
 National Science Foundation
More Like this


Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) methods have been widely used to sample from certain probability distributions, incorporating (kernel) density derivatives and/or given datasets. Instead of exploring new samples from kernel spaces, this work proposes a novel SGHMC sampler, namely Spectral Hamiltonian Monte Carlo (SpHMC), that produces high-dimensional sparse representations of given datasets through sparse sensing and SGHMC. Inspired by compressed sensing, we assume all given samples are low-dimensional measurements of certain high-dimensional sparse vectors, while a continuous probability distribution exists in such a high-dimensional space. Specifically, given a dictionary for sparse coding, SpHMC first derives a more »
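As background for the SGHMC machinery this sampler builds on, one discretized SGHMC update (in the style of Chen et al., with a friction term balancing the injected noise) can be sketched as follows; this is a generic illustration, not the paper's SpHMC, and the function and parameter names are assumptions.

```python
import numpy as np

def sghmc_step(theta, r, grad_U, eps, friction, rng):
    """One SGHMC update: position step, then a momentum step combining the
    (possibly stochastic) gradient of the potential U, friction, and
    Gaussian noise scaled so the target is approximately preserved."""
    theta = theta + eps * r
    noise = rng.normal(0.0, np.sqrt(2.0 * friction * eps), size=np.shape(theta))
    r = r - eps * grad_U(theta) - eps * friction * r + noise
    return theta, r
```

With the exact gradient of `U(theta) = theta**2 / 2` this chain approximately samples a standard Gaussian; replacing `grad_U` with a mini-batch estimate gives the stochastic-gradient variant the abstract refers to.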

Stochastic gradient Langevin dynamics (SGLD) and stochastic gradient Hamiltonian Monte Carlo (SGHMC) are two popular Markov Chain Monte Carlo (MCMC) algorithms for Bayesian inference that can scale to large datasets, allowing one to sample from the posterior distribution of the parameters of a statistical model given the input data and the prior distribution over the model parameters. However, these algorithms do not apply to the decentralized learning setting, in which a network of agents works collaboratively to learn the parameters of a statistical model without sharing their individual data, due to privacy reasons or communication constraints. We study two algorithms: Decentralized more »
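A decentralized variant of SGLD can be sketched as a gossip-averaging step followed by a local Langevin step; each agent touches only its own data and its neighbors' iterates. This is a generic illustration under assumed names (`de_sgld_step`, a doubly stochastic mixing matrix `W`), not the specific algorithms studied in the paper.

```python
import numpy as np

def de_sgld_step(thetas, grad_U_list, W, step, rng):
    """One round of decentralized SGLD (illustrative): each agent first mixes
    its iterate with its neighbors' via the doubly stochastic matrix W, then
    takes a Langevin step using only its own local potential's gradient."""
    mixed = W @ thetas                                   # gossip averaging
    noise = rng.normal(0.0, np.sqrt(2.0 * step), size=thetas.shape)
    grads = np.array([g(x) for g, x in zip(grad_U_list, mixed)])
    return mixed - step * grads + noise                  # local Langevin step
```

When the agents' local potentials sum to the global negative log-posterior, the network average of the iterates approximately tracks a sample from the global posterior, without any agent revealing its raw data.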

We give an algorithm for exact sampling from the Bingham distribution $p(x)\propto\exp(x^{\top}Ax)$ on the sphere $\mathcal{S}^{d-1}$ with expected runtime of $\operatorname{poly}(d,\lambda_{\max}(A)-\lambda_{\min}(A))$. The algorithm is based on rejection sampling, where the proposal distribution is a polynomial approximation of the pdf and can be sampled from by explicitly evaluating integrals of polynomials over the sphere. Our algorithm gives exact samples, assuming exact computation of an inverse function of a polynomial. This is in contrast with Markov Chain Monte Carlo algorithms, which are not known to enjoy rapid mixing on this problem and only give approximate samples. As a direct application, we use more »
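The exactness guarantee comes from the rejection sampling principle itself: if the target density is dominated by $M$ times the proposal density, accepted draws are exact samples from the target. A minimal generic sketch (with an assumed Beta(2,2)-vs-uniform example, not the paper's polynomial proposal on the sphere) is:

```python
import numpy as np

def rejection_sample(target_pdf, proposal_draw, proposal_pdf, M, rng):
    """Exact sampling by rejection: requires target_pdf(x) <= M * proposal_pdf(x)
    for all x. Draw x from the proposal and accept it with probability
    target_pdf(x) / (M * proposal_pdf(x)); accepted draws are exact samples."""
    while True:
        x = proposal_draw(rng)
        if rng.uniform() * M * proposal_pdf(x) <= target_pdf(x):
            return x
```

The expected number of proposals per accepted sample is $M$ (for normalized densities), which is why the paper's polynomial proposal, tight enough to bound the acceptance cost by $\operatorname{poly}(d,\lambda_{\max}(A)-\lambda_{\min}(A))$, is the heart of the result.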

Differential privacy has been widely adopted to release continuous and scalar-valued information on a database without compromising the privacy of the individual data records in it. The problem of querying binary- and matrix-valued information on a database in a differentially private manner has rarely been studied. However, binary and matrix-valued data are ubiquitous in real-world applications, and their privacy concerns may arise under a variety of circumstances. In this paper, we devise an exclusive-or (XOR) mechanism that perturbs binary- and matrix-valued query results by conducting an XOR operation on the query result with calibrated noise attributed to a matrix-valued Bernoulli distribution. more »
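The core idea, XOR-ing the binary query result with Bernoulli noise, can be illustrated with a simplified randomized-response-style variant in which each bit is flipped independently; the paper's mechanism instead calibrates the noise via a matrix-valued Bernoulli distribution, which this sketch does not implement.

```python
import numpy as np

def xor_perturb(bits, flip_prob, rng):
    """Simplified XOR perturbation of a binary query result: each bit is
    XOR-ed with an independent Bernoulli(flip_prob) noise bit before release.
    (Illustrative i.i.d. variant; the XOR mechanism in the paper draws the
    noise matrix from a matrix-valued Bernoulli distribution instead.)"""
    noise = (rng.random(bits.shape) < flip_prob).astype(bits.dtype)
    return np.bitwise_xor(bits, noise)
```

Because XOR is its own inverse and maps bits to bits, the released result stays in the binary/matrix domain of the original query, which is precisely what additive scalar noise mechanisms fail to provide.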