skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Modeling Users’ Curiosity in Recommender Systems
Today’s recommender systems are criticized for recommending items that are too obvious to arouse users’ interests. Therefore the research community has advocated some ”beyond accuracy” evaluation metrics such as novelty, diversity, and serendipity with the hope of promoting information discovery and sustaining users’ interests over a long period of time. While bringing in new perspectives, most of these evaluation metrics have not considered individual users’ differences in their capacity to experience those ”beyond accuracy” items. Open-minded users may embrace a wider range of recommendations than conservative users. In this paper, we proposed to use curiosity traits to capture such individual users’ differences. We developed a model to approximate an individual’s curiosity distribution over different stimulus levels. We used an item’s surprise level to estimate the stimulus level and whether such a level is in the range of the user’s appetite for stimulus, calledComfort Zone. We then proposed a recommender system framework that considers both user preference and theirComfort Zonewhere the curiosity is maximally aroused. Our framework differs from a typical recommender system in that it leverages human’sComfort Zonefor stimuli to promote engagement with the system. A series of evaluation experiments have been conducted to show that our framework is able to rank higher the items with not only high ratings but also high curiosity stimulation. The recommendation list generated by our algorithm has higher potential of inspiring user curiosity compared to the state-of-the-art deep learning approaches. The personalization factor for assessing the surprise stimulus levels further helps the recommender model achieve smaller (better) inter-user similarity.  more » « less
Award ID(s):
1910696
PAR ID:
10467124
Author(s) / Creator(s):
;
Publisher / Repository:
ACM
Date Published:
Journal Name:
ACM Transactions on Knowledge Discovery from Data
ISSN:
1556-4681
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Today’s recommender systems are criticized for recommending items that are too obvious to arouse users’ interest. That is why the recommender systems research community has advocated some ”beyond accuracy” evaluation metrics such as novelty, diversity, coverage, and serendipity with the hope of promoting information discovery and sustain users’ interest over a long period of time. While bringing in new perspectives, most of these evaluation metrics have not considered individual users’ difference: an open-minded user may favor highly novel or diversified recommendations whereas a conservative user’s appetite for novelty or diversity may not be that large. In this paper, we developed a model to approximate an individual’s curiosity distribution over different levels of stimuli guided by the well-known Wundt curve in Psychology. We measured an item’s surprise level to assess the stimulation level and whether it is in the range of the user’s appetite for stimulus. We then proposed a recommendation system framework that considers both user preference and appetite for stimulus where the curiosity is maximally aroused. Our framework differs from a typical recommender system in that it leverages human’s curiosity to promote intrinsic interest with the system. A series of evaluation experiments have been conducted to show that our framework is able to rank higher the items with not only high ratings but also high response likelihood. The recommendation list generated by our algorithm has higher potential of inspiring user curiosity compared to traditional approaches. The personalization factor for assessing the stimulus (surprise) strength further helps the recommender achieve smaller (better) inter-user similarity. 
    more » « less
  2. null (Ed.)
    The growing amount of online information today has increased opportunity to discover interesting and useful information. Various recommender systems have been designed to help people discover such information. No matter how accurately the recommender algorithms perform, users’ engagement with recommended results has been complained being less than ideal. In this study, we touched on two human-centered objectives for recommender systems: user satisfaction and curiosity, both of which are believed to play roles in maintaining user engagement and sustain such engagement in the long run. Specifically, we leveraged the concept of surprise and used an existing computational model of surprise to identify relevantly surprising health articles aiming at improving user satisfaction and inspiring their curiosity. We designed a user study to first test the validity of the surprise model in a health news recommender system, called LuckyFind. Then user satisfaction and curiosity were evaluated. We find that the computational surprise model helped identify surprising recommendations at little cost of user satisfaction. Users gave higher ratings on interestingness than usefulness for those surprising recommendations. Curiosity was inspired more for those individuals who have a larger capacity to experience curiosity. Over half of the users have changed their preferences after using LuckyFind, either discovering new areas, reinforcing their existing interests, or stopping following those they did not want anymore. The insights of the research will make researchers and practitioners rethink the objectives of today’s recommender systems as being more human-centered beyond algorithmic accuracy. 
    more » « less
  3. Despite the benefits of personalizing items and information tailored to users’ needs, it has been found that recommender systems tend to introduce biases that favor popular items or certain categories of items and dominant user groups. In this study, we aim to characterize the systematic errors of a recommendation system and how they manifest in various accountability issues, such as stereotypes, biases, and miscalibration. We propose a unified framework that distinguishes the sources of prediction errors into a set of key measures that quantify the various types of system-induced effects, at both the individual and collective levels. Based on our measuring framework, we examine the most widely adopted algorithms in the context of movie recommendation. Our research reveals three important findings: (1) Differences between algorithms: recommendations generated by simpler algorithms tend to be more stereotypical but less biased than those generated by more complex algorithms. (2) Disparate impact on groups and individuals: system-induced biases and stereotypes have a disproportionate effect on atypical users and minority groups (e.g., women and older users). (3) Mitigation opportunity: using structural equation modeling, we identify the interactions between user characteristics (typicality and diversity), system-induced effects, and miscalibration. We further investigate the possibility of mitigating system-induced effects by oversampling underrepresented groups and individuals, which was found to be effective in reducing stereotypes and improving recommendation quality. Our research is the first systematic examination of not only system-induced effects and miscalibration but also the stereotyping issue in recommender systems. 
    more » « less
  4. Recommender systems are usually designed by engineers, researchers, designers, and other members of development teams. These systems are then evaluated based on goals set by the aforementioned teams and other business units of the platforms operating the recommender systems. This design approach emphasizes the designers’ vision for how the system can best serve the interests of users, providers, businesses, and other stakeholders. Although designers may be well-informed about user needs through user experience and market research, they are still the arbiters of the system’s design and evaluation, with other stakeholders’ interests less emphasized in user-centered design and evaluation. When extended to recommender systems for social good, this approach results in systems that reflect the social objectives as envisioned by the designers and evaluated as the designers understand them. Instead, social goals and operationalizations should be developed through participatory and democratic processes that are accountable to their stakeholders. We argue that recommender systems aimed at improving social good should be designedbyandwith, not justfor, the people who will experience their benefits and harms. That is, they should be designed in collaboration with their users, creators, and other stakeholders as full co-designers, not only as user study participants. 
    more » « less
  5. Offline evaluation protocols for recommender systems are intended to estimate users' satisfaction with recommendations using static data from prior user interactions. These evaluations allow researchers and production developers to carry out first-pass estimates of the likely performance of a new system and weed out bad ideas before presenting them to users. However, offline evaluations cannot accurately assess novel, relevant recommendations, because the most novel recommendations items that were previously unknown to the user; such items are missing from the historical data, so they cannot be judged as relevant. A breakthrough that reliably produces novel, relevant recommendations would score poorly with current offline evaluation techniques. While the existence of this problem is noted in the literature, its extent is not well-understood. We present a simulation study to estimate the error that such missing data causes in commonly-used evaluation metrics in order to assess its prevalence and impact. We find that missing data in the rating or observation process causes the evaluation protocol to systematically mis-estimate metric values, and in some cases erroneously determine that a popularity-based recommender outperforms even a perfect personalized recommender. Substantial breakthroughs in recommendation quality, therefore, will be difficult to assess with existing offline techniques. 
    more » « less