Contextual bandit algorithms are increasingly replacing non-adaptive A/B tests in e-commerce, healthcare, and policymaking because they can both improve outcomes for study participants and increase the chance of identifying good or even best policies. To support credible inference on novel interventions at the end of the study, we still want to construct valid confidence intervals on average treatment effects, subgroup effects, or the value of new policies. The adaptive nature of the data collected by contextual bandit algorithms makes this difficult: standard estimators are no longer asymptotically normally distributed, and classic confidence intervals fail to provide correct coverage. While this has been addressed in non-contextual settings by using stabilized estimators, variance-stabilized estimators in the contextual setting pose unique challenges that we tackle for the first time in this paper. We propose the Contextual Adaptive Doubly Robust (CADR) estimator, a novel estimator for policy value that is asymptotically normal under contextual adaptive data collection. The main technical challenge in constructing CADR is designing adaptive and consistent conditional standard deviation estimators for stabilization. Extensive numerical experiments using 57 OpenML datasets demonstrate that confidence intervals based on CADR uniquely provide correct coverage.
This content will become publicly available on June 1, 2026
Nonstationary A/B Tests: Optimal Variance Reduction, Bias Correction, and Valid Inference
We develop an analytical framework to model and analyze A/B tests in the presence of nonparametric nonstationarities in the targeted business metrics. A/B tests, also known as online randomized controlled experiments, have been used at scale by data-driven enterprises to guide decisions and test innovative ideas to improve core business metrics. Meanwhile, nonstationarities, such as the time-of-day effect and the day-of-week effect, can often arise nonparametrically in key business metrics involving purchases, revenue, conversions, customer experiences, and so on. First, we develop a generic nonparametric stochastic model to capture nonstationarities in A/B test experiments, where each sample represents a visit or action associated with a time label. We build a practically relevant limiting regime to facilitate analyzing large-sample estimator performance under nonparametric nonstationarities. Second, we show that ignoring or inadequately addressing nonstationarities can cause standard A/B test estimators to have suboptimal variance and nonvanishing bias, leading to a loss of statistical efficiency and accuracy. We provide a new estimator that treats time as a continuous stratification variable and performs poststratification with a data-dependent number of stratification levels. Without making parametric assumptions, we prove a central limit theorem for the proposed estimator and show that the estimator attains the best achievable asymptotic variance and is asymptotically unbiased. Third, we propose a time-grouped randomization that is designed to balance treatment and control assignments at granular time scales. We show that when the time-grouped randomization is integrated into standard experimental designs to generate experimental data, simple A/B test estimators can achieve asymptotically optimal variance. Brief numerical experiments illustrate the analysis.
This paper was accepted by Baris Ata, stochastic models and simulation. Supplemental Material: The online appendices and data files are available at https://doi.org/10.1287/mnsc.2022.01205.
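The two ideas in the abstract above, poststratifying on time and balancing assignments within short time windows, can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's estimator or design: the sinusoidal baseline, the constant treatment effect, the cube-root rule for the number of strata, the group size of 4, and every function name here are hypothetical choices made for illustration. The time-grouped assignment also sorts visits after the fact, which is an offline simplification of a scheme that would have to run online.

```python
import numpy as np


def difference_in_means(y, w):
    """Plain A/B estimator: treated mean minus control mean."""
    return y[w].mean() - y[~w].mean()


def poststratified_estimate(t, y, w, n_strata=None):
    """Poststratify on time: weight per-stratum differences by stratum size."""
    n = len(y)
    if n_strata is None:
        # Hypothetical data-dependent rule (an assumption, not the paper's choice).
        n_strata = max(1, int(round(n ** (1.0 / 3.0))))
    edges = np.linspace(t.min(), t.max(), n_strata + 1)
    strata = np.clip(np.searchsorted(edges, t, side="right") - 1, 0, n_strata - 1)
    total, weight = 0.0, 0.0
    for k in range(n_strata):
        in_k = strata == k
        treated, control = in_k & w, in_k & ~w
        # Strata missing one arm are skipped and the weights renormalized.
        if treated.any() and control.any():
            total += in_k.sum() * (y[treated].mean() - y[control].mean())
            weight += in_k.sum()
    if weight == 0:
        raise ValueError("no stratum contains both treated and control units")
    return total / weight


def time_grouped_assignment(t, group_size, rng):
    """Assign half of each block of time-adjacent visits to treatment."""
    order = np.argsort(t)
    w = np.zeros(len(t), dtype=bool)
    for start in range(0, len(t), group_size):
        block = order[start:start + group_size]
        chosen = rng.permutation(len(block))[: len(block) // 2]
        w[block[chosen]] = True
    return w


def simulate_outcomes(t, w, effect, rng):
    """Assumed model: time-varying baseline plus a constant treatment effect."""
    baseline = 10.0 + 5.0 * np.sin(2.0 * np.pi * t)
    return baseline + effect * w + rng.normal(0.0, 1.0, size=len(t))


def compare_designs(n=2000, reps=200, effect=1.0, seed=0):
    """Return (mean, variance) of each estimator across simulated experiments."""
    rng = np.random.default_rng(seed)
    naive, post, grouped = [], [], []
    for _ in range(reps):
        t = rng.uniform(0.0, 1.0, size=n)
        w = rng.random(n) < 0.5  # independent coin-flip assignment
        y = simulate_outcomes(t, w, effect, rng)
        naive.append(difference_in_means(y, w))
        post.append(poststratified_estimate(t, y, w))
        w_g = time_grouped_assignment(t, 4, rng)
        y_g = simulate_outcomes(t, w_g, effect, rng)
        grouped.append(difference_in_means(y_g, w_g))
    results = [("naive", naive), ("poststratified", post), ("time_grouped", grouped)]
    return {name: (float(np.mean(v)), float(np.var(v))) for name, v in results}


if __name__ == "__main__":
    for name, (mean, var) in compare_designs().items():
        print(f"{name:15s} mean={mean:.3f} var={var:.5f}")
```

In this toy setting the baseline swings far more than the noise, so both the poststratified estimator and the difference in means under time-grouped assignment should show much smaller variance than the difference in means under coin-flip assignment. The sketch does not reproduce the bias analysis or the optimality results described in the abstract.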
- Award ID(s): 2220537
- PAR ID: 10614603
- Publisher / Repository: Management Science
- Date Published:
- Journal Name: Management Science
- Volume: 71
- Issue: 6
- ISSN: 0025-1909
- Page Range / eLocation ID: 4707 to 4727
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
The weighted nearest neighbors (WNN) estimator has been popularly used as a flexible and easy-to-implement nonparametric tool for mean regression estimation. The bagging technique is an elegant way to form WNN estimators with weights automatically generated to the nearest neighbors (Steele, 2009; Biau et al., 2010); we call the resulting estimator the distributional nearest neighbors (DNN) for easy reference. Yet, there is a lack of distributional results for such an estimator, limiting its application to statistical inference. Moreover, when the mean regression function has higher-order smoothness, DNN does not achieve the optimal nonparametric convergence rate, mainly because of the bias issue. In this work, we provide an in-depth technical analysis of the DNN, based on which we suggest a bias reduction approach for the DNN estimator by linearly combining two DNN estimators with different subsampling scales, resulting in the novel two-scale DNN (TDNN) estimator. The two-scale DNN estimator has an equivalent representation of WNN with weights admitting explicit forms and some being negative. We prove that, thanks to the use of negative weights, the two-scale DNN estimator enjoys the optimal nonparametric rate of convergence in estimating the regression function under the fourth-order smoothness condition. We further go beyond estimation and establish that the DNN and two-scale DNN are both asymptotically normal as the subsampling scales and sample size diverge to infinity. For the practical implementation, we also provide variance estimators and a distribution estimator using the jackknife and bootstrap techniques for the two-scale DNN. These estimators can be exploited for constructing valid confidence intervals for nonparametric inference of the regression function. The theoretical results and appealing finite-sample performance of the suggested two-scale DNN method are illustrated with several simulation examples and a real data application.
-
This paper studies inference in randomized controlled trials with covariate‐adaptive randomization when there are multiple treatments. More specifically, we study in this setting inference about the average effect of one or more treatments relative to other treatments or a control. As in Bugni, Canay, and Shaikh (2018), covariate‐adaptive randomization refers to randomization schemes that first stratify according to baseline covariates and then assign treatment status so as to achieve “balance” within each stratum. Importantly, in contrast to Bugni, Canay, and Shaikh (2018), we not only allow for multiple treatments, but further allow for the proportion of units being assigned to each of the treatments to vary across strata. We first study the properties of estimators derived from a “fully saturated” linear regression, that is, a linear regression of the outcome on all interactions between indicators for each of the treatments and indicators for each of the strata. We show that tests based on these estimators using the usual heteroskedasticity‐consistent estimator of the asymptotic variance are invalid in the sense that they may have limiting rejection probability under the null hypothesis strictly greater than the nominal level; on the other hand, tests based on these estimators and suitable estimators of the asymptotic variance that we provide are exact in the sense that they have limiting rejection probability under the null hypothesis equal to the nominal level. For the special case in which the target proportion of units being assigned to each of the treatments does not vary across strata, we additionally consider tests based on estimators derived from a linear regression with “strata fixed effects,” that is, a linear regression of the outcome on indicators for each of the treatments and indicators for each of the strata. 
We show that tests based on these estimators using the usual heteroskedasticity‐consistent estimator of the asymptotic variance are conservative in the sense that they have limiting rejection probability under the null hypothesis no greater than and typically strictly less than the nominal level, but tests based on these estimators and suitable estimators of the asymptotic variance that we provide are exact, thereby generalizing results in Bugni, Canay, and Shaikh (2018) for the case of a single treatment to multiple treatments. A simulation study and an empirical application illustrate the practical relevance of our theoretical results.
-
Nonparametric model-assisted estimators have been proposed to improve estimates of finite population parameters. Flexible nonparametric models provide more reliable estimators when a parametric model is misspecified. In this article, we propose an information criterion to select appropriate auxiliary variables to use in an additive model-assisted method. We approximate the additive nonparametric components using polynomial splines and extend the Bayesian Information Criterion (BIC) for finite populations. By removing irrelevant auxiliary variables, our method reduces model complexity and decreases estimator variance. We establish that the proposed BIC is asymptotically consistent in selecting the important explanatory variables when the true model is additive without interactions, a result supported by our numerical study. Our proposed method is easier to implement and better justified theoretically than the existing method proposed in the literature.
