On Approximate Thompson Sampling with Langevin Algorithms
Thompson sampling for multi-armed bandit problems is known to enjoy favorable performance in both theory and practice. However, its wider deployment is limited by a significant computational bottleneck: the need for samples from posterior distributions at every iteration. In practice, this limitation is alleviated with approximate sampling methods, yet provably incorporating approximate samples into Thompson sampling algorithms remains an open problem. In this work, we address this problem by proposing two efficient Langevin MCMC algorithms tailored to Thompson sampling. The resulting approximate Thompson sampling algorithms are efficiently implementable and provably achieve optimal instance-dependent regret for the multi-armed bandit (MAB) problem. To prove these results, we derive novel posterior concentration bounds and MCMC convergence rates for log-concave distributions, which may be of independent interest.
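As a concrete illustration of the recipe described in the abstract, the sketch below runs unadjusted Langevin dynamics to draw an approximate posterior sample for each arm of a Gaussian bandit and then plays the arm with the largest sample. This is a minimal sketch of the general idea, not the paper's exact algorithms: the Gaussian reward model, prior variance, step size, and iteration count are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def ula_sample(grad_log_post, theta0, step=0.01, n_steps=200):
    """Unadjusted Langevin algorithm (ULA): approximately sample from a
    log-concave posterior using only gradients of its log-density."""
    theta = theta0
    for _ in range(n_steps):
        noise = rng.normal()
        theta = theta + 0.5 * step * grad_log_post(theta) + np.sqrt(step) * noise
    return theta

def approx_thompson_step(sums, counts, sigma2=1.0, prior_var=100.0):
    """One round of approximate Thompson sampling for a Gaussian bandit:
    draw a Langevin sample of each arm's mean, play the argmax."""
    samples = []
    for s, n in zip(sums, counts):
        # Gaussian likelihood + Gaussian prior => log-concave posterior with
        # grad log p(theta | data) = (s - n*theta)/sigma2 - theta/prior_var
        grad = lambda th, s=s, n=n: (s - n * th) / sigma2 - th / prior_var
        samples.append(ula_sample(grad, theta0=s / max(n, 1)))
    return int(np.argmax(samples))

# Tiny usage example: two arms with (hypothetical) true means 0.2 and 0.8.
sums, counts = [0.0, 0.0], [0, 0]
for t in range(100):
    arm = approx_thompson_step(sums, counts)
    reward = rng.normal([0.2, 0.8][arm], 1.0)
    sums[arm] += reward
    counts[arm] += 1
```

Because both the likelihood and the prior here are Gaussian, the posterior is log-concave, which is the setting in which the paper's concentration and convergence guarantees apply.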
- Award ID(s):
- 1909365
- PAR ID:
- 10250955
- Editor(s):
- Daumé III, Hal; Singh, Aarti
- Date Published:
- 2020
- Journal Name:
- Proceedings of the 37th International Conference on Machine Learning
- Volume:
- 119
- Page Range / eLocation ID:
- 6797-6807
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Thompson sampling has become a ubiquitous approach to online decision problems with bandit feedback. The key algorithmic task for Thompson sampling is drawing a sample from the posterior of the optimal action. We propose an alternative arm selection rule we dub TS-UCB, which requires negligible additional computational effort but provides significant performance improvements relative to Thompson sampling. At each step, TS-UCB computes a score for each arm using two ingredients: posterior sample(s) and upper confidence bounds. TS-UCB can be used in any setting where these two quantities are available, and it is flexible in the number of posterior samples it takes as input. TS-UCB achieves materially lower regret on a comprehensive suite of synthetic and real-world datasets, including a personalized article recommendation dataset from Yahoo! and benchmark datasets from the deep bandit suite proposed in Riquelme et al. (2018). Finally, from a theoretical perspective, we establish optimal regret guarantees for TS-UCB for both the K-armed and linear bandit models.
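To make the scoring idea concrete, here is a minimal sketch of one plausible reading of such a rule for a Gaussian bandit: posterior samples yield an estimate of the optimal value, and each arm is scored by the gap between that estimate and its posterior mean, divided by a confidence width, with the smallest score winning. The Gaussian posteriors, the use of the posterior standard deviation as the confidence width, and the averaging over samples are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

rng = np.random.default_rng(1)

def ts_ucb_pick(post_means, post_stds, n_samples=10):
    """Illustrative TS-UCB-style rule: estimate the optimal value from
    posterior samples, then pick the arm minimizing
    (estimated optimum - posterior mean) / confidence width."""
    post_means = np.asarray(post_means)
    post_stds = np.asarray(post_stds)
    # n_samples posterior draws per arm (each row is one joint sample).
    draws = rng.normal(post_means, post_stds, size=(n_samples, len(post_means)))
    theta_star = draws.max(axis=1).mean()  # sampled estimate of the best value
    scores = (theta_star - post_means) / np.maximum(post_stds, 1e-12)
    return int(np.argmin(scores))
```

With n_samples=1 the rule consumes a single posterior draw, consistent with the abstract's note that TS-UCB is flexible in the number of posterior samples it takes as input.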
-
We consider the problem of transmitting at the optimal rate over a rapidly-varying wireless channel with unknown statistics when the feedback about channel quality is very limited. One motivation for this problem is that, in emerging wireless networks, the use of mmWave bands means that the channel quality can fluctuate rapidly and thus one cannot rely on full channel-state feedback to make transmission rate decisions. Inspired by related problems in the context of multi-armed bandits, we consider a well-known algorithm called Thompson sampling to address this problem. However, unlike the traditional multi-armed bandit problem, a direct application of Thompson sampling results in computational and storage complexity that grows exponentially with time. Therefore, we propose an algorithm called Modified Thompson Sampling (MTS), whose computational and storage complexity is simply linear in the number of channel states and which achieves at most logarithmic regret as a function of time when compared to an optimal algorithm that knows the probability distribution of the channel states.
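For intuition on the rate-selection setting, the snippet below shows a generic Beta-Bernoulli Thompson sampling rule for picking a transmission rate, keeping one success/failure counter pair per rate so that storage is linear in the number of rates. It is a generic illustration of the problem setting, not the paper's MTS algorithm; in particular, modeling each rate's success probability independently is an assumption made here for simplicity.

```python
import numpy as np

rng = np.random.default_rng(2)

def ts_rate_select(rates, succ, fail):
    """Generic Thompson sampling for rate selection: sample each rate's
    success probability from its Beta posterior and transmit at the rate
    with the highest sampled expected throughput."""
    p = rng.beta(np.asarray(succ) + 1, np.asarray(fail) + 1)  # one draw per rate
    return int(np.argmax(np.asarray(rates) * p))  # throughput = rate * P(success)
```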
-
Personalized learning stems from the idea that students benefit from instructional material tailored to their needs. Many online learning platforms purport to implement some form of personalized learning, often through on-demand tutoring or self-paced instruction, but to our knowledge none have a way to automatically explore for specific opportunities to personalize students' education, nor a transparent way to identify the effects of personalization on specific groups of students. In this work we present the Automatic Personalized Learning Service (APLS). The APLS uses multi-armed bandit algorithms to recommend the most effective support to each student that requests assistance when completing their online work, and is currently used by ASSISTments, an online learning platform. The first empirical study of the APLS found that Beta-Bernoulli Thompson Sampling, a popular and effective multi-armed bandit algorithm, was only slightly more capable of selecting helpful support than randomly selecting from the relevant support options. Therefore, we also present Decision Tree Thompson Sampling (DTTS), a novel contextual multi-armed bandit algorithm that integrates the transparency and interpretability of decision trees into Thompson sampling. In simulation, DTTS overcame the challenges of recommending support within an online learning platform and was able to increase students' learning by as much as 10% more than the current algorithm used by the APLS. We demonstrate that DTTS is able to identify qualitative interactions that not only help determine the most effective support for students, but that also generalize well to new students, problems, and support content. The APLS using DTTS is now being deployed at scale within ASSISTments and is a promising tool for all educational learning platforms.
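One plausible reading of combining decision trees with Thompson sampling, sketched below, is to partition students by context with a tree and run an independent Beta-Bernoulli bandit over support options in each leaf. This is an illustration of the general idea only, not the paper's DTTS algorithm: the fixed two-leaf tree, the "prior mastery" context feature, and the 0.5 split threshold are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

class LeafTS:
    """Beta-Bernoulli Thompson sampling over support options within one
    leaf of a (fixed, illustrative) decision tree over student context."""
    def __init__(self, n_options):
        self.succ = np.zeros(n_options)
        self.fail = np.zeros(n_options)

    def pick(self):
        # One posterior draw per option; recommend the argmax.
        return int(np.argmax(rng.beta(self.succ + 1, self.fail + 1)))

    def update(self, option, helped):
        if helped:
            self.succ[option] += 1
        else:
            self.fail[option] += 1

# Hypothetical tree: split students on one context feature, then run an
# independent bandit over 3 support options in each leaf.
leaves = {"low_mastery": LeafTS(3), "high_mastery": LeafTS(3)}

def recommend(prior_mastery):
    leaf = leaves["high_mastery" if prior_mastery > 0.5 else "low_mastery"]
    return leaf, leaf.pick()
```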