Non-reversible Parallel Tempering for Uncertainty Approximation in Deep Learning

Deng, Wei; Zhang, Qian; Feng, Qi; Liang, Faming; Lin, Guang


Parallel tempering (PT), also known as replica exchange, is the go-to workhorse for simulations of multi-modal distributions. The key to the success of PT is to adopt efficient swap schemes. The popular deterministic even-odd (DEO) scheme exploits the non-reversibility property and has successfully reduced the communication cost from O(P^2) to O(P) given sufficiently many chains P. However, such an innovation largely disappears in big data problems due to the limited number of chains and the extremely few bias-corrected swaps. To handle this issue, we generalize the DEO scheme to promote non-reversibility and obtain an appealing communication cost of O(P log P) based on the optimal window size. In addition, we analyze the bias when we adopt stochastic gradient descent (SGD) with large and constant learning rates as exploration kernels. This user-friendly nature enables us to conduct large-scale uncertainty approximation tasks without much tuning cost.
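For readers unfamiliar with the swap schedule mentioned in the abstract, the following is a minimal sketch of the standard DEO scheme only, not the generalized windowed scheme or the bias-corrected swaps proposed in the paper. It assumes chains indexed 0 to P-1 with inverse temperatures `betas` and exact energy evaluations. The function names `deo_pairs` and `deo_swap_step` are hypothetical and chosen for illustration.

```python
import math
import random


def deo_pairs(iteration, num_chains):
    """Adjacent chain pairs attempted at this iteration under standard DEO.

    Even iterations attempt (0,1), (2,3), ...; odd iterations attempt
    (1,2), (3,4), ... (illustrative helper, not from the paper).
    """
    start = iteration % 2
    return [(i, i + 1) for i in range(start, num_chains - 1, 2)]


def deo_swap_step(states, energy, betas, iteration, rng):
    """Attempt Metropolis swaps on the DEO pairs; return the number accepted.

    Assumes exact energies. The swap between chains i and j is accepted with
    probability min(1, exp((beta_i - beta_j) * (E_i - E_j))).
    """
    accepted = 0
    for i, j in deo_pairs(iteration, len(states)):
        log_ratio = (betas[i] - betas[j]) * (energy(states[i]) - energy(states[j]))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            states[i], states[j] = states[j], states[i]
            accepted += 1
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    betas = [1.0, 0.5, 0.25, 0.1]      # assumed temperature ladder
    states = [2.0, -1.0, 0.5, 3.0]     # toy one-dimensional states
    for t in range(4):
        n = deo_swap_step(states, lambda x: x * x, betas, t, rng)
        print(t, deo_pairs(t, len(states)), n)
```

Because the attempted pairs alternate deterministically between even and odd iterations, a state that has just moved up the temperature ladder is offered another upward move on the next iteration, which is the non-reversible behavior the abstract refers to.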

Award ID(s): 2134209; 2053746; 1555072

NSF-PAR ID: 10419664

Author(s) / Creator(s): Deng, Wei; Zhang, Qian; Feng, Qi; Liang, Faming; Lin, Guang

Date Published: 2023-02-07

Journal Name: Thirty-seventh AAAI Conference on Artificial Intelligence

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
The DOI is not currently available.
