This content will become publicly available on September 1, 2026

Title: De-centering the (Traditional) user: Multistakeholder evaluation of recommender systems
Multistakeholder recommender systems are those that account for the impacts and preferences of multiple groups of individuals, not just the end users receiving recommendations. Due to their complexity, these systems cannot be evaluated strictly by the overall utility of a single stakeholder, as is often the case in more mainstream recommender system applications. In this article, we focus our discussion on the challenges of multistakeholder evaluation of recommender systems. We bring attention to the different aspects at play, from the range of stakeholders involved (including but not limited to providers and consumers) to the values and specific goals of each relevant stakeholder. We discuss how to move from theoretical principles to practical implementation, providing specific use case examples. Finally, we outline open research directions for the RecSys community to explore. We aim to provide guidance to researchers and practitioners about incorporating these complex and domain-dependent issues of evaluation in the course of designing, developing, and researching applications with multistakeholder aspects.
Award ID(s):
2107577
PAR ID:
10634093
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ;
Publisher / Repository:
Elsevier
Date Published:
Journal Name:
International Journal of Human-Computer Studies
Volume:
203
Issue:
C
ISSN:
1071-5819
Page Range / eLocation ID:
103560
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Current practice for evaluating recommender systems typically focuses on point estimates of user-oriented effectiveness metrics or business metrics, sometimes combined with additional metrics for considerations such as diversity and novelty. In this paper, we argue for the need for researchers and practitioners to attend more closely to various distributions that arise from a recommender system (or other information access system) and the sources of uncertainty that lead to these distributions. One immediate implication of our argument is that both researchers and practitioners must report and examine more thoroughly the distribution of utility between and within different stakeholder groups. However, distributions of various forms arise in many more aspects of the recommender systems experimental process, and distributional thinking has substantial ramifications for how we design, evaluate, and present recommender systems evaluation and research results. Leveraging and emphasizing distributions in the evaluation of recommender systems is a necessary step to ensure that the systems provide appropriate and equitably-distributed benefit to the people they affect. 
  2. Recommender systems traditionally find the most relevant products or services for users, tailored to their needs or interests, but they ignore the interests of the other sides of the market (i.e., the other stakeholders). In this paper, we propose a Ranked Bandit approach for an online multi-stakeholder recommender system that sequentially selects the top 𝑘 items according to the relevance and priority of all the involved stakeholders. We present three different criteria for considering the priority of each stakeholder when evaluating our approach. Our extensive experimental results on a movie dataset show that contextual multi-armed bandits with a relevance function yield a higher level of satisfaction for all involved stakeholders in the long term. Keywords: Multi-stakeholder Recommender Systems; Multi-armed Bandits; Ranked Bandit
  3. Recommender systems are poised at the interface between stakeholders: for example, job applicants and employers in the case of recommendations of employment listings, or artists and listeners in the case of music recommendation. In such multisided platforms, recommender systems play a key role in enabling discovery of products and information at large scales. However, as they have become more and more pervasive in society, the equitable distribution of their benefits and harms has come under increasing scrutiny, as is the case with machine learning generally. While recommender systems can exhibit many of the biases encountered in other machine learning settings, the intersection of personalization and multisidedness makes the question of fairness in recommender systems manifest itself quite differently. In this article, we discuss recent work in the area of multisided fairness in recommendation, starting with a brief introduction to core ideas in algorithmic fairness and multistakeholder recommendation. We describe techniques for measuring fairness and algorithmic approaches for enhancing fairness in recommendation outputs. We also discuss feedback and popularity effects that can lead to unfair recommendation outcomes. Finally, we introduce several promising directions for future research in this area.
  4. Collaborative filtering (CF) methods are making an impact on our daily lives in a wide range of applications, including recommender systems and personalization. Latent factor methods, e.g., matrix factorization (MF), have been the state of the art in CF; however, they lack interpretability and do not provide a straightforward explanation for their predictions. Explainability is gaining momentum in recommender systems for accountability, and because a good explanation can swing an undecided user. Most recent explainable recommendation methods require auxiliary data such as review text or item content on top of item ratings. In this paper, we address the case where no additional data are available and propose augmenting the classical MF framework for CF with a prior that encodes each user's embedding as a sparse linear combination of item embeddings, and vice versa for each item embedding. Our XPL-CF approach automatically reveals these user-item relationships, which underpin the latent factors and explain how the resulting recommendations are formed. We showcase the effectiveness of XPL-CF on real data from various application domains. We also evaluate the explainability of the user-item relationships obtained from XPL-CF through numeric evaluation and case study examples.
  5. Traditional offline evaluations of recommender systems apply metrics from machine learning and information retrieval in settings where their underlying assumptions no longer hold. This results in significant error and bias in measures of top-N recommendation performance, such as precision, recall, and nDCG. Several of the specific causes of these errors, including popularity bias and misclassified decoy items, are well explored in the existing literature. In this paper, we survey a range of work on identifying and addressing these problems, and report on our work in progress to simulate the recommender data generation and evaluation processes in order to quantify the extent of evaluation metric errors and assess their sensitivity to various assumptions.