Counterfactual learning from observational data involves learning a classifier on an entire population from data that is observed conditioned on a selection policy. This work considers the problem in an active setting, where the learner additionally has access to unlabeled examples and can choose to have a subset of them labeled by an oracle. Prior work on this problem uses disagreement-based active learning with an importance-weighted loss estimator to account for counterfactuals, which leads to high label complexity. We show how to instead incorporate a more efficient counterfactual risk minimizer into the active learning algorithm. This requires modifying both the counterfactual risk, to make it amenable to active learning, and the active learning process, to make it amenable to the risk. We prove that the resulting algorithm is statistically consistent and more label-efficient than prior work.
Active Learning with Logged Data
We consider active learning with logged data, where labeled examples are drawn conditioned on a predetermined logging policy, and the goal is to learn a classifier on the entire population, not just conditioned on the logging policy. Prior work addresses this problem either when only logged data is available, or purely in a controlled random experimentation setting where the logged data is ignored. In this work, we combine both approaches to provide an algorithm that uses logged data to bootstrap and inform experimentation, thus achieving the best of both worlds. Our work is inspired by a connection between controlled random experimentation and active learning, and modifies existing disagreement-based active learning algorithms to exploit logged data.
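The logged-data setting described above can be illustrated with inverse-propensity (importance) weighting, a standard way to correct for a logging policy. The sketch below is not the paper's algorithm: the function names, the toy data distribution, the propensities, and the threshold classifier are all hypothetical choices made for illustration, and the logging policy's propensities are assumed to be known.

```python
import random


def importance_weighted_error(classifier, examples):
    """Estimate population error from partially labeled data.

    Each example is (x, y, observed, propensity), where `propensity` is the
    (assumed known) probability that the logging policy reveals the label
    of x. The label y is only consulted when `observed` is True.
    """
    total = 0.0
    for x, y, observed, propensity in examples:
        if observed and classifier(x) != y:
            # Up-weight each logged mistake by the inverse of its propensity.
            total += 1.0 / propensity
    # Normalize by all drawn examples, logged or not.
    return total / len(examples)


def naive_error(classifier, examples):
    """Plain error rate over logged examples only (ignores the logging policy)."""
    logged = [(x, y) for x, y, observed, _ in examples if observed]
    return sum(classifier(x) != y for x, y in logged) / len(logged)


def simulate(n=20000, seed=0):
    """Hypothetical toy problem: x uniform on [0, 1], true label is x > 0.5."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        # Assumed logging policy: labels are mostly revealed for small x.
        propensity = 0.9 if x < 0.5 else 0.1
        observed = rng.random() < propensity
        examples.append((x, y, observed, propensity))

    def classifier(x):
        # Errs exactly on 0.5 < x <= 0.7, so its population error is 0.2.
        return int(x > 0.7)

    return (importance_weighted_error(classifier, examples),
            naive_error(classifier, examples))
```

In this toy setup the classifier's population error is 0.2 by construction. The naive average over logged examples underestimates it, because the assumed logging policy rarely reveals labels in the region where the classifier errs, while the importance-weighted estimate is unbiased for the population error at the cost of higher variance.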
- Award ID(s): 1719133
- PAR ID: 10100915
- Date Published:
- Journal Name: Proceedings of Machine Learning Research
- Volume: 80
- Issue: 1
- ISSN: 2640-3498
- Page Range / eLocation ID: 5521-5530
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Anthropogenic disturbances are changing the structure and composition of tropical forests worldwide. Multiple disturbances often occur simultaneously in forests: for example, hunting and logging are within-forest disturbances that impact vast areas of seemingly intact rainforests. Despite recent work on the individual effects of these disturbances, our understanding of how they interact to influence tree communities is still limited. In northern Republic of Congo, we explored the effects of hunting and logging on tree communities. Over an 8-year period, we monitored 12,552 tree stems (≥ 10 cm diameter-at-breast height) spread over 30 1-ha plots along a gradient of human disturbance to compare the tree diversity between hunted and logged forest, once-logged forest, and protected forest free of both disturbances. Tree density, species richness, and community composition were affected by both hunting and logging. Forest close to human settlements was richer, more heterogeneous, and more dynamic in species composition across censuses. In hunted and logged forest, fast-growing secondary species with low shade tolerance replaced old growth species. Comparatively, the once-logged forest had the greatest stem density and intermediate species richness with an increased density of shade-bearing species over time. Both tree species spatial turnover and tree recruitment were greatly affected by proximity to human settlements. A shift towards abiotically dispersed trees and increasing seed predation by rodents near villages can partly explain the differences in tree recruitment across the forest types. The combination of hunting and logging seems to have a greater impact on tree communities than either single disturbance, especially with nearness to villages.
-
The ability to perform offline A/B-testing and off-policy learning using logged contextual bandit feedback is highly desirable in a broad range of applications, including recommender systems, search engines, ad placement, and personalized health care. Both offline A/B-testing and off-policy learning require a counterfactual estimator that evaluates how some new policy would have performed, if it had been used instead of the logging policy. In this paper, we present and analyze a family of counterfactual estimators which subsumes most estimators proposed to date. Most importantly, this analysis identifies a new estimator – called Continuous Adaptive Blending (CAB) – which enjoys many advantageous theoretical and practical properties. In particular, it can be substantially less biased than clipped Inverse Propensity Score (IPS) weighting and the Direct Method, and it can have less variance than Doubly Robust and IPS estimators. In addition, it is subdifferentiable such that it can be used for learning, unlike the SWITCH estimator. Experimental results show that CAB provides excellent evaluation accuracy and outperforms other counterfactual estimators in terms of learning performance.
-
Counterfactual estimators enable the use of existing log data to estimate how some new target recommendation policy would have performed, if it had been used instead of the policy that logged the data. We say that those estimators work "off-policy", since the policy that logged the data is different from the target policy. In this way, counterfactual estimators enable Off-policy Evaluation (OPE) akin to an unbiased offline A/B test, as well as learning new recommendation policies through Off-policy Learning (OPL). The goal of this tutorial is to summarize Foundations, Implementations, and Recent Advances of OPE/OPL. Specifically, we will introduce the fundamentals of OPE/OPL and provide theoretical and empirical comparisons of conventional methods. Then, we will cover emerging practical challenges such as how to take into account combinatorial actions, distributional shift, fairness of exposure, and two-sided market structures. We will then present Open Bandit Pipeline, an open-source package for OPE/OPL, and how it can be used for both research and practical purposes. We will conclude the tutorial by presenting real-world case studies and future directions.
-
Adversarial attacks can target supervised learning algorithms, which necessitates the application of logging while using supervised learning algorithms in software projects. Logging enables practitioners to conduct postmortem analysis, which can be helpful to diagnose any conducted attacks. We conduct an empirical study to identify and characterize log-related coding patterns, i.e., recurring coding patterns that can be leveraged to conduct adversarial attacks and need to be logged. A list of log-related coding patterns can guide practitioners on what to log while using supervised learning algorithms in software projects. We apply qualitative analysis on 3,004 Python files used to implement 103 supervised learning-based software projects. We identify a list of 54 log-related coding patterns that map to six attacks related to supervised learning algorithms. Using Log Assistant to conduct Postmortems for Supervised Learning (LOPSUL), we quantify the frequency of the identified log-related coding patterns with 278 open-source software projects that use supervised learning. We observe log-related coding patterns to appear for 22% of the analyzed files, where training data forensics is the most frequently occurring category.

