A Theory of Selective Prediction
We consider a model of selective prediction, where the prediction algorithm is given a data sequence in an online fashion and asked to predict a pre-specified statistic of the upcoming data points. The algorithm is allowed to choose when to make the prediction as well as the length of the prediction window, possibly depending on the observations so far. We prove that, even without any distributional assumption on the input data stream, a large family of statistics can be estimated to non-trivial accuracy. To give one concrete example, suppose that we are given access to an arbitrary binary sequence x1, . . . , xn of length n. Our goal is to accurately predict the average observation, and we are allowed to choose the window over which the prediction is made: for some t < n and m < n − t, after seeing t observations we predict the average of x_{t+1},..., x{t+m}. This particular problem was first studied in [Dru13] and referred to as the “density prediction game”. We show that the expected squared error of our prediction can be bounded by O(1/log n) and prove a matching lower bound, which resolves an open question raised in [Dru13]. This result holds for any sequence (that is more »
Authors:
;
Award ID(s):
Publication Date:
NSF-PAR ID:
10098974
Journal Name:
Conference on Learning Theory (COLT)