NSF PAR Search | NSF Public Access Repository

Oracle-Efficient Pessimism: Offline Policy Optimization In Contextual Bandits

Wang, Lequn; Krishnamurthy, Akshay; Slivkins, Alex (May 2024, International Conference on Artificial Intelligence and Statistics (AISTATS))

Full Text Available
Policy-Gradient Training of Language Models for Ranking

Gao, Ge; Chang, Jonathan D; Cardie, Claire; Brantley, Kianté; Joachims, Thorsten (December 2023, NeurIPS Workshop on Foundation Models for Decision Making)

Full Text Available
Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling

Saito, Yuta; Ren, Qingyang; Joachims, Thorsten (July 2023, International Conference on Machine Learning (ICML))

Full Text Available
Variance-Minimizing Augmentation Logging for Counterfactual Evaluation in Contextual Bandits

https://doi.org/10.1145/3539597.3570452

Tucker, Aaron D.; Joachims, Thorsten (February 2023, ACM Conference on Web Search and Data Mining (WSDM))

Full Text Available
Bandits with Costly Reward Observations

Tucker, Aaron; Biddulph, Caleb; Wang, Claire; Joachims, Thorsten (January 2023, Conference on Uncertainty in Artificial Intelligence (UAI))

Full Text Available
CONSEQUENCES — Causality, Counterfactuals and Sequential Decision-Making for Recommender Systems

https://doi.org/10.1145/3523227.3547409

Jeunen, Olivier; Joachims, Thorsten; Oosterhuis, Harrie; Saito, Yuta; Vasile, Flavian (September 2022, ACM Conference on Recommender Systems)

Full Text Available
Counterfactual Evaluation and Learning for Interactive Systems: Foundations, Implementations, and Recent Advances

https://doi.org/10.1145/3534678.3542601

Saito, Yuta; Joachims, Thorsten (August 2022, ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Full Text Available
Recommendations as Treatments

https://doi.org/10.1609/aimag.v42i3.18141

Joachims, Thorsten; London, Ben; Su, Yi; Swaminathan, Adith; Wang, Lequn (January 2022, AI Magazine)

In recent years, a new line of research has taken an interventional view of recommender systems, where recommendations are viewed as actions that the system takes to have a desired effect. This interventional view has led to the development of counterfactual inference techniques for evaluating and optimizing recommendation policies. This article explains how these techniques enable unbiased offline evaluation and learning despite biased data, and how they can inform considerations of fairness and equity in recommender systems.
Full Text Available
Off-Policy Evaluation for Large Action Spaces via Embeddings

Saito, Yuta; Joachims, Thorsten (January 2022, International Conference on Machine Learning (ICML))

Full Text Available
Counterfactual Learning and Evaluation for Recommender Systems: Foundations, Implementations, and Recent Advances

https://doi.org/10.1145/3460231.3473320

Saito, Yuta; Joachims, Thorsten (September 2021, ACM Conference on Recommender Systems)

Counterfactual estimators enable the use of existing log data to estimate how some new target recommendation policy would have performed, if it had been used instead of the policy that logged the data. We say that those estimators work "off-policy", since the policy that logged the data is different from the target policy. In this way, counterfactual estimators enable Off-policy Evaluation (OPE) akin to an unbiased offline A/B test, as well as learning new recommendation policies through Off-policy Learning (OPL). The goal of this tutorial is to summarize Foundations, Implementations, and Recent Advances of OPE/OPL. Specifically, we will introduce the fundamentals of OPE/OPL and provide theoretical and empirical comparisons of conventional methods. Then, we will cover emerging practical challenges such as how to take into account combinatorial actions, distributional shift, fairness of exposure, and two-sided market structures. We will then present Open Bandit Pipeline, an open-source package for OPE/OPL, and how it can be used for both research and practical purposes. We will conclude the tutorial by presenting real-world case studies and future directions.
Full Text Available
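
The off-policy idea in the abstract above can be illustrated with a minimal sketch. It assumes inverse propensity scoring (IPS), a standard OPE estimator that the abstract does not name, and a synthetic two-action setting invented for illustration. The function names `ips_estimate` and `simulate_logs` and all probabilities are hypothetical; this does not use the Open Bandit Pipeline API.

```python
import random


def ips_estimate(rewards, logging_probs, target_probs):
    """Inverse propensity scoring (IPS) estimate of a target policy's value.

    Each logged reward is reweighted by target_prob / logging_prob, where both
    probabilities refer to the action that was actually logged.
    """
    if not (len(rewards) == len(logging_probs) == len(target_probs)):
        raise ValueError("all inputs must have the same length")
    if not rewards:
        raise ValueError("at least one logged interaction is required")
    if any(p <= 0 for p in logging_probs):
        raise ValueError("logging propensities must be positive")
    total = 0.0
    for r, p_log, p_tgt in zip(rewards, logging_probs, target_probs):
        total += (p_tgt / p_log) * r
    return total / len(rewards)


def simulate_logs(n, seed=0):
    """Generate synthetic logs from a hypothetical two-action problem.

    Assumed for illustration only: action 0 pays off with probability 0.2,
    action 1 with probability 0.8; the logging policy picks action 1 with
    probability 0.3, the target policy with probability 0.9.
    """
    rng = random.Random(seed)
    reward_prob = {0: 0.2, 1: 0.8}
    logging_policy = {0: 0.7, 1: 0.3}
    target_policy = {0: 0.1, 1: 0.9}
    rewards, logging_probs, target_probs = [], [], []
    for _ in range(n):
        action = 1 if rng.random() < logging_policy[1] else 0
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        rewards.append(reward)
        logging_probs.append(logging_policy[action])
        target_probs.append(target_policy[action])
    return rewards, logging_probs, target_probs


if __name__ == "__main__":
    logs = simulate_logs(20000)
    # True value of the target policy is 0.1 * 0.2 + 0.9 * 0.8 = 0.74.
    print("IPS estimate:", ips_estimate(*logs))
```

In this setup the logged data come only from the logging policy, yet the reweighted average approximates the value the target policy would have obtained, which is the sense in which OPE resembles an offline A/B test. IPS requires that the logging policy gives nonzero probability to every action the target policy can take.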