

Title: Using Big Data to Sharpen Design-Based Inference in A/B Tests
Randomized A/B tests in educational software are not run in a vacuum: often, reams of historical data are available alongside the data from a randomized trial. This paper proposes a method to use this historical data, which is often high-dimensional and longitudinal, to improve causal estimates from A/B tests. The method proceeds in three steps: first, fit a machine learning model to the historical data predicting students' outcomes as a function of their covariates. Then, use that model to predict the outcomes of the randomized students in the A/B test. Finally, use design-based methods to estimate the treatment effect in the A/B test, using prediction errors in place of outcomes. This method retains all of the advantages of design-based inference while, under certain conditions, yielding more precise estimators. The paper gives a theoretical condition under which the method improves statistical precision and demonstrates the method using a deep learning algorithm to help estimate effects in a set of experiments run inside ASSISTments.
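The three steps above map onto a short computational sketch. The snippet below is a minimal illustration, assuming the historical and experimental data are already arranged as NumPy arrays (the names hist_X, hist_y, ab_X, ab_y, and treat are hypothetical placeholders) and substituting a scikit-learn gradient-boosting regressor for the paper's deep learning model; it is not the authors' implementation.

```python
# Minimal sketch of the three-step procedure described in the abstract.
# Assumptions: inputs are NumPy arrays; a gradient-boosting regressor stands in
# for the paper's deep learning model. Not the authors' implementation.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def residualized_effect(hist_X, hist_y, ab_X, ab_y, treat):
    """Estimate a treatment effect using prediction errors in place of outcomes.

    hist_X, hist_y -- covariates and outcomes for historical (non-experimental) students
    ab_X, ab_y     -- covariates and observed outcomes for the randomized students
    treat          -- 0/1 treatment assignment in the A/B test
    """
    # Step 1: fit a predictive model to the historical data only.
    model = GradientBoostingRegressor().fit(hist_X, hist_y)

    # Step 2: predict outcomes for the randomized students from their covariates.
    y_hat = model.predict(ab_X)

    # Step 3: design-based estimate -- difference in mean prediction errors.
    # Because y_hat depends only on pre-treatment covariates, randomization
    # alone makes this unbiased, however good or bad the model is.
    resid = ab_y - y_hat
    effect = resid[treat == 1].mean() - resid[treat == 0].mean()

    # Conservative Neyman-style standard error for the difference in means.
    n1, n0 = (treat == 1).sum(), (treat == 0).sum()
    se = np.sqrt(resid[treat == 1].var(ddof=1) / n1 + resid[treat == 0].var(ddof=1) / n0)
    return effect, se
```

Precision improves roughly when the prediction errors are less variable than the raw outcomes within each arm, which is the flavor of condition the paper formalizes; a poor model can leave precision unchanged or make it worse, but it never biases the estimate.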
Award ID(s):
1724889
PAR ID:
10095366
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the Eleventh International Conference on Educational Data Mining
Page Range / eLocation ID:
479-485
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Randomized controlled trials (RCTs) admit unconfounded design-based inference – randomization largely justifies the assumptions underlying statistical effect estimates – but often have limited sample sizes. However, researchers may have access to big observational data on covariates and outcomes from RCT nonparticipants. For example, data from A/B tests conducted within an educational technology platform exist alongside historical observational data drawn from student logs. We outline a design-based approach to using such observational data for variance reduction in RCTs. First, we use the observational data to train a machine learning algorithm predicting potential outcomes using covariates and then use that algorithm to generate predictions for RCT participants. Then, we use those predictions, perhaps alongside other covariates, to adjust causal effect estimates with a flexible, design-based covariate-adjustment routine. In this way, there is no danger of biases from the observational data leaking into the experimental estimates, which are guaranteed to be exactly unbiased regardless of whether the machine learning models are “correct” in any sense or whether the observational samples closely resemble RCT samples. We demonstrate the method in analyzing 33 randomized A/B tests and show that it decreases standard errors relative to other estimators, sometimes substantially.
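As one concrete, hypothetical version of the adjustment step, the sketch below uses the remnant-based prediction y_hat as a covariate in a Lin-style interacted OLS adjustment, a standard design-based routine; the paper's actual covariate-adjustment procedure may differ, and all argument names are placeholders.

```python
# A standard design-based covariate adjustment (Lin-style interacted OLS),
# with the remnant-based prediction used as a covariate. Illustrative only;
# not necessarily the exact adjustment routine used in the paper.
import numpy as np

def adjusted_effect(y, treat, y_hat, other_covs=None):
    """y: outcomes, treat: 0/1 assignment, y_hat: remnant-model predictions,
    other_covs: optional additional covariates (2-D array)."""
    y, treat = np.asarray(y), np.asarray(treat)
    covs = np.asarray(y_hat).reshape(-1, 1)
    if other_covs is not None:
        covs = np.column_stack([covs, other_covs])
    # Center covariates at the full-sample mean so each arm's intercept is
    # that arm's predicted mean outcome at the average covariate value.
    X = np.column_stack([np.ones(len(y)), covs - covs.mean(axis=0)])
    beta_t, *_ = np.linalg.lstsq(X[treat == 1], y[treat == 1], rcond=None)
    beta_c, *_ = np.linalg.lstsq(X[treat == 0], y[treat == 0], rcond=None)
    # Difference of the two adjusted arm means estimates the average effect.
    return beta_t[0] - beta_c[0]
```

Because the predictions are functions of pre-treatment covariates alone, fit entirely on non-participants, adjusting for them cannot let bias from the observational data leak into the experimental estimate; at worst the adjustment simply fails to reduce variance.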

     
  2. Randomized A/B tests within online learning platforms represent an exciting direction in learning sciences. With minimal assumptions, they allow causal effect estimation without confounding bias and exact statistical inference even in small samples. However, often experimental samples and/or treatment effects are small, A/B tests are underpowered, and effect estimates are overly imprecise. Recent methodological advances have shown that power and statistical precision can be substantially boosted by coupling design-based causal estimation to machine-learning models of rich log data from historical users who were not in the experiment. Estimates using these techniques remain unbiased and inference remains exact without any additional assumptions. This paper reviews those methods and applies them to a new dataset including over 250 randomized A/B comparisons conducted within ASSISTments, an online learning platform. We compare results across experiments using four novel deep-learning models of auxiliary data, and show that incorporating auxiliary data into causal estimates is roughly equivalent to increasing the sample size by 20% on average, or as much as 50-80% in some cases, relative to t-tests, and by about 10% on average, or as much as 30-50%, compared to cutting-edge machine learning unbiased estimates that use only data from the experiments. We show that the gains can be even larger for estimating subgroup effects, hold even when the remnant is unrepresentative of the A/B test sample, and extend to post-stratification population effects estimators.
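The "equivalent to increasing the sample size by 20%" framing can be reproduced with a back-of-the-envelope conversion, assuming the estimator's variance scales as 1/n, as it does for a simple difference in means; the paper may define its reported percentages differently, so this is only an illustrative sketch.

```python
# Back-of-the-envelope conversion from a variance reduction to an
# "equivalent sample size" gain, assuming variance scales as 1/n.
def equivalent_sample_size_gain(var_adjusted, var_unadjusted):
    """Percent increase in n needed for the unadjusted estimator
    to match the adjusted estimator's variance."""
    return 100.0 * (var_unadjusted / var_adjusted - 1.0)

# Example: if adjustment shrinks the variance to 5/6 of its original value,
# that corresponds to roughly a 20% effective sample-size gain.
print(equivalent_sample_size_gain(5.0, 6.0))  # 20.0
```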
  3. Abstract

    Assessment is central to teaching and learning, and recently there has been a substantive shift from paper-and-pencil assessments towards technology-delivered assessments such as computer-adaptive tests. Fairness is an important aspect of the assessment process, including design, administration, test-score interpretation, and data utility. The Universal Design for Learning (UDL) guidelines can inform assessment development to promote fairness; however, it is not explicitly clear how UDL and fairness may be linked through students' conceptualizations of assessment fairness. This phenomenological study explores how middle grades students conceptualize and reason about the fairness of mathematics tests, including paper-and-pencil and technology-delivered assessments. Findings indicate that (a) students conceptualize fairness through unique notions related to educational opportunities and (b) students reason about fairness non-linearly. Implications of this study have the potential to inform test developers and users about aspects of test fairness, as well as educators' use of data from fixed-form, paper-and-pencil tests and computer-adaptive, technology-delivered tests.

     