In online recommendation, customers arrive sequentially and stochastically from an underlying distribution, and the online decision model recommends an item to each arriving individual according to some strategy. We study how to recommend an item at each step to maximize the expected reward while achieving user-side fairness for customers, i.e., customers who share similar profiles receive similar rewards regardless of their sensitive attributes and the items being recommended. By incorporating causal inference into bandits and adopting soft intervention to model the arm selection strategy, we first propose the d-separation based UCB algorithm (D-UCB), which exploits the d-separation set to reduce the amount of exploration needed to achieve low cumulative regret. Building on it, we then propose the fair causal bandit (F-UCB) for achieving counterfactual individual fairness. Both theoretical analysis and empirical evaluation demonstrate the effectiveness of our algorithms.
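The abstract does not spell out D-UCB or F-UCB, so they are not reproduced here. As a rough intuition for why a d-separation set reduces exploration, the minimal sketch below (a hypothetical class `SeparatingSetUCB`, assuming the separating set Z is known, discrete, and observed at each round) keeps UCB statistics only per (z, arm) pair instead of per full context, so the number of estimates to explore scales with the size of Z rather than with the whole context space.

```python
import numpy as np
from collections import defaultdict

class SeparatingSetUCB:
    """Illustrative UCB that conditions reward estimates only on a small,
    assumed-known d-separation set Z of the context (not the authors' D-UCB)."""

    def __init__(self, n_arms, alpha=1.0):
        self.n_arms = n_arms
        self.alpha = alpha                  # exploration strength
        self.counts = defaultdict(int)      # pulls of each (z, arm) pair
        self.means = defaultdict(float)     # running mean reward of (z, arm)
        self.t = 0

    def select(self, z):
        """Pick an arm for the observed separating-set value z (a hashable tuple)."""
        self.t += 1
        ucb = np.empty(self.n_arms)
        for a in range(self.n_arms):
            n = self.counts[(z, a)]
            if n == 0:
                return a                    # pull each (z, arm) at least once
            bonus = self.alpha * np.sqrt(2.0 * np.log(self.t) / n)
            ucb[a] = self.means[(z, a)] + bonus
        return int(np.argmax(ucb))

    def update(self, z, arm, reward):
        """Incrementally update the conditional reward estimate."""
        self.counts[(z, arm)] += 1
        n = self.counts[(z, arm)]
        self.means[(z, arm)] += (reward - self.means[(z, arm)]) / n
```

Because only |Z| x (number of arms) statistics are maintained, a small separating set directly shrinks the exploration burden; the paper's fairness constraint (F-UCB) is an additional layer not shown in this sketch.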
REVEAL 2020: Bandit and Reinforcement Learning from User Interactions
The REVEAL workshop focuses on framing the recommendation problem as one of making personalized interventions, e.g., deciding to recommend a particular item to a particular user. Moreover, these interventions sometimes depend on each other: a stream of interactions occurs between the user and the system, and each decision to recommend something has an impact on future steps and long-term rewards. This framing creates a number of challenges discussed at the workshop. How can recommender systems be evaluated offline in such a context? How can we learn recommendation policies that are aware of these delayed consequences and outcomes?
- Award ID(s): 1901168
- PAR ID: 10309946
- Date Published:
- Journal Name: ACM Conference on Recommender Systems
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Most commercial music services rely on collaborative filtering to recommend artists and songs. While this method is effective for popular artists with large fanbases, it can present difficulties for recommending novel, lesser-known artists due to a relative lack of user preference data. In this paper, we therefore seek to understand how content-based approaches can be used to more effectively recommend songs from these lesser-known artists. Specifically, we conduct a user study to answer three questions. Firstly, do most users agree which songs are most acoustically similar? Secondly, is acoustic similarity a good proxy for how an individual might construct a playlist or recommend music to a friend? Thirdly, if so, can we find acoustic features that are related to human judgments of acoustic similarity? To answer these questions, our study asked 117 test subjects to compare two unknown candidate songs relative to a third known reference song. Our findings show that 1) judgments about acoustic similarity are fairly consistent, 2) acoustic similarity is highly correlated with playlist selection and recommendation, but not necessarily personal preference, and 3) we identify a subset of acoustic features from the Spotify Web API that is particularly predictive of human similarity judgments.
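As an illustration only (the predictive feature subset is identified empirically in the paper and is not reproduced here), the sketch below computes a simple cosine similarity over audio-feature fields exposed by the Spotify Web API; the numeric feature values are made up.

```python
import numpy as np

# Audio-feature fields returned by the Spotify Web API's audio-features
# endpoint; which subset best matches human judgments is the paper's
# empirical question, not asserted here.
FEATURES = ["danceability", "energy", "speechiness", "acousticness",
            "instrumentalness", "liveness", "valence"]

def feature_vector(track):
    """track: dict mapping feature name -> value for one song."""
    return np.array([track[f] for f in FEATURES], dtype=float)

def acoustic_similarity(track_a, track_b):
    """Cosine similarity between two songs' audio-feature vectors."""
    a, b = feature_vector(track_a), feature_vector(track_b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Toy example with invented feature values.
ref  = {"danceability": 0.70, "energy": 0.80, "speechiness": 0.05,
        "acousticness": 0.10, "instrumentalness": 0.0, "liveness": 0.2, "valence": 0.6}
cand = {"danceability": 0.65, "energy": 0.75, "speechiness": 0.04,
        "acousticness": 0.15, "instrumentalness": 0.0, "liveness": 0.3, "valence": 0.5}
print(acoustic_similarity(ref, cand))
```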
Background: Shared decision making requires evidence to be conveyed to the patient in a way they can easily understand and compare. Patient decision aids facilitate this process. This article reviews the current evidence for how to present numerical probabilities within patient decision aids. Methods: Following the 2013 review method, we assembled a group of 9 international experts on risk communication across Australia, Germany, the Netherlands, the United Kingdom, and the United States. We expanded the topics covered in the first review to reflect emerging areas of research. Groups of 2 to 3 authors reviewed the relevant literature based on their expertise and wrote each section before review by the full authorship team. Results: Of 10 topics identified, we present 5 fundamental issues in this article. Although some topics resulted in clear guidance (presenting the chance an event will occur, addressing numerical skills), other topics (context/evaluative labels, conveying uncertainty, risk over time) continue to have evolving knowledge bases. We recommend presenting numbers over a set time period with a clear denominator, using consistent formats between outcomes and interventions to enable unbiased comparisons, and interpreting the numbers for the reader to meet the needs of varying numeracy. Discussion: Understanding how different numerical formats can bias risk perception will help decision aid developers communicate risks in a balanced, comprehensible manner and avoid accidental "nudging" toward a particular option. Decisions between probability formats need to consider the available evidence and user skills. The review may be useful for other areas of science communication in which unbiased presentation of probabilities is important.
Personalized recommendation based on multi-arm bandit (MAB) algorithms has been shown to lead to high utility and efficiency, as it can dynamically adapt the recommendation strategy based on feedback. However, unfairness can arise in personalized recommendation. In this paper, we study how to achieve user-side fairness in personalized recommendation. We formulate our fair personalized recommendation as a modified contextual bandit and focus on achieving fairness for the individual who is being recommended an item, as opposed to achieving fairness for the items being recommended. We introduce and define a metric that captures fairness in terms of rewards received by both the privileged and protected groups. We develop a fair contextual bandit algorithm, Fair-LinUCB, that improves upon the traditional LinUCB algorithm to achieve group-level fairness of users. Our algorithm detects and monitors unfairness while it learns to recommend personalized videos to students with high efficiency. We provide a theoretical regret analysis and show that our algorithm has a slightly higher regret bound than LinUCB. We conduct extensive experimental evaluations to compare the performance of our fair contextual bandit to that of LinUCB and show that our approach achieves group-level fairness while maintaining high utility.
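For reference, the baseline that Fair-LinUCB modifies is the standard disjoint LinUCB (Li et al., 2010). The sketch below shows only that baseline; the paper's fairness-aware adjustment is not reproduced here.

```python
import numpy as np

class LinUCB:
    """Standard disjoint LinUCB baseline: ridge-regression reward estimate
    per arm plus a confidence-width exploration bonus. Fair-LinUCB's
    fairness-aware modification is not included in this sketch."""

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]    # per-arm Gram matrix (ridge)
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # per-arm reward-weighted context sum

    def select(self, x):
        """x: context feature vector for the current user/round."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                            # ridge estimate of arm parameters
            bonus = self.alpha * np.sqrt(x @ A_inv @ x)  # upper-confidence width
            scores.append(theta @ x + bonus)
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        """Update the chosen arm's statistics with the observed reward."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```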
Explaining automatically generated recommendations allows users to make more informed and accurate decisions about which results to utilize, and therefore improves their satisfaction. In this work, we develop a multi-task learning solution for explainable recommendation. Two companion learning tasks, user preference modeling for recommendation and opinionated content modeling for explanation, are integrated via a joint tensor factorization. As a result, the algorithm predicts not only a user's preference over a list of items, i.e., recommendation, but also how the user would appreciate a particular item at the feature level, i.e., opinionated textual explanation. Extensive experiments on two large collections of Amazon and Yelp reviews confirmed the effectiveness of our solution in both the recommendation and explanation tasks compared with several existing recommendation algorithms, and our extensive user study clearly demonstrates the practical value of the explainable recommendations generated by our algorithm.
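The joint tensor factorization itself is not specified in this abstract, so the toy sketch below only illustrates the general idea: random factor matrices P, Q, W stand in for learned user, item, and feature (aspect) factors, and a CP-style product yields both an item-level recommendation score and feature-level scores that could be used to pick aspects for an explanation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_features, k = 100, 50, 20, 8   # toy sizes, hypothetical

# Stand-in latent factors; in the actual model these would be learned jointly
# from ratings and opinionated review content, not sampled at random.
P = rng.normal(size=(n_users, k))      # user factors
Q = rng.normal(size=(n_items, k))      # item factors
W = rng.normal(size=(n_features, k))   # feature (aspect) factors

def recommend_score(u, i):
    """Predicted preference of user u for item i."""
    return P[u] @ Q[i]

def explanation_scores(u, i):
    """CP-style score per feature: how much user u would appreciate item i
    on each aspect, usable to choose aspects to mention in the explanation."""
    return (P[u] * Q[i]) @ W.T          # shape: (n_features,)

u, i = 3, 7
top_aspects = np.argsort(explanation_scores(u, i))[::-1][:3]
print(recommend_score(u, i), top_aspects)
```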