Randomized multiarm bandits: An improved adaptive data collection method

Zhao, Zhigen  (ORCID:0000000299156448); Wang, Tong; Ji, Bo

doi:10.1002/sam.11681

Citation Details

Randomized multiarm bandits: An improved adaptive data collection method

Abstract In many scientific experiments, multiarmed bandits are used as an adaptive data collection method. However, this adaptive process can lead to a dependence that renders many commonly used statistical inference methods invalid. An example of this is the sample mean, which is a natural estimator of the mean parameter but can be biased. This can cause test statistics based on this estimator to have an inflated type I error rate, and the resulting confidence intervals may have significantly lower coverage probabilities than their nominal values. To address this issue, we propose an alternative approach called randomized multiarm bandits (rMAB). This combines a randomization step with a chosen MAB algorithm, and by selecting the randomization probability appropriately, optimal regret can be achieved asymptotically. Numerical evidence shows that the bias of the sample mean based on the rMAB is much smaller than that of other methods. The test statistic and confidence interval produced by this method also perform much better than its competitors. more »

Award ID(s):: 2311216

PAR ID:: 10499182

Author(s) / Creator(s):: Zhao, Zhigen ; Wang, Tong ; Ji, Bo

Publisher / Repository:: Wiley Blackwell (John Wiley & Sons)

Date Published:: 2024-04-06

Journal Name:: Statistical Analysis and Data Mining: The ASA Data Science Journal

Volume:: 17

Issue:: 2

ISSN:: 1932-1864

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Journal Article:
https://doi.org/10.1002/sam.11681

More Like this