Today’s recommender systems are criticized for recommending items that are too obvious to arouse users’ interests. Therefore, the research community has advocated “beyond accuracy” evaluation metrics such as novelty, diversity, and serendipity, in the hope of promoting information discovery and sustaining users’ interests over a long period of time. While bringing in new perspectives, most of these evaluation metrics do not consider individual users’ differences in their capacity to experience those “beyond accuracy” items. Open-minded users may embrace a wider range of recommendations than conservative users. In this paper, we proposed to use curiosity traits to capture such individual differences. We developed a model to approximate an individual’s curiosity distribution over different stimulus levels. We used an item’s surprise level to estimate the stimulus level and whether such a level is in the range of the user’s appetite for stimulus, called
- Award ID(s):
- 1910696
- PAR ID:
- 10188332
- Date Published:
- Journal Name:
- ArXivorg
- ISSN:
- 2331-8422
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
Comfort Zone. We then proposed a recommender system framework that considers both user preference and the user’s Comfort Zone, where curiosity is maximally aroused. Our framework differs from a typical recommender system in that it leverages the human Comfort Zone for stimuli to promote engagement with the system. A series of evaluation experiments was conducted to show that our framework ranks higher the items with not only high ratings but also high curiosity stimulation. The recommendation list generated by our algorithm has a higher potential to inspire user curiosity than state-of-the-art deep learning approaches. The personalization factor for assessing surprise stimulus levels further helps the recommender model achieve smaller (better) inter-user similarity.
More Like this
-
The growing amount of online information today has increased the opportunity to discover interesting and useful information. Various recommender systems have been designed to help people discover such information. No matter how accurately the recommender algorithms perform, users’ engagement with recommended results has been criticized as less than ideal. In this study, we touched on two human-centered objectives for recommender systems: user satisfaction and curiosity, both of which are believed to play roles in maintaining user engagement and sustaining it in the long run. Specifically, we leveraged the concept of surprise and used an existing computational model of surprise to identify relevantly surprising health articles, aiming to improve user satisfaction and inspire curiosity. We designed a user study to first test the validity of the surprise model in a health news recommender system called LuckyFind. Then user satisfaction and curiosity were evaluated. We find that the computational surprise model helped identify surprising recommendations at little cost to user satisfaction. Users gave higher ratings on interestingness than on usefulness for those surprising recommendations. Curiosity was inspired more in individuals who have a larger capacity to experience curiosity. Over half of the users changed their preferences after using LuckyFind, either discovering new areas, reinforcing their existing interests, or ceasing to follow topics they no longer wanted. The insights of this research should prompt researchers and practitioners to rethink the objectives of today’s recommender systems as being more human-centered, beyond algorithmic accuracy.
-
Offline evaluation protocols for recommender systems are intended to estimate users' satisfaction with recommendations using static data from prior user interactions. These evaluations allow researchers and production developers to carry out first-pass estimates of the likely performance of a new system and weed out bad ideas before presenting them to users. However, offline evaluations cannot accurately assess novel, relevant recommendations, because the most novel recommendations are items that were previously unknown to the user; such items are missing from the historical data, so they cannot be judged as relevant. A breakthrough that reliably produces novel, relevant recommendations would score poorly with current offline evaluation techniques. While the existence of this problem is noted in the literature, its extent is not well understood. We present a simulation study to estimate the error that such missing data causes in commonly used evaluation metrics in order to assess its prevalence and impact. We find that missing data in the rating or observation process causes the evaluation protocol to systematically mis-estimate metric values, and in some cases erroneously determine that a popularity-based recommender outperforms even a perfect personalized recommender. Substantial breakthroughs in recommendation quality, therefore, will be difficult to assess with existing offline techniques.
-
Recently there has been a growing interest in fairness-aware recommender systems, including fairness in providing consistent performance across different users or groups of users. A recommender system could be considered unfair if the recommendations do not fairly represent the tastes of a certain group of users while other groups receive recommendations that are consistent with their preferences. In this paper, we use a metric called miscalibration to measure how responsive a recommendation algorithm is to users’ true preferences, and we consider how various algorithms may result in different degrees of miscalibration for different users. In particular, we conjecture that popularity bias, a well-known phenomenon in recommendation, is one important factor leading to miscalibration. Our experimental results using two real-world datasets show that there is a connection between how different user groups are affected by algorithmic popularity bias and their level of interest in popular items. Moreover, we show that the more a group is affected by algorithmic popularity bias, the more their recommendations are miscalibrated.
-
Despite the benefits of personalizing items and information tailored to users’ needs, it has been found that recommender systems tend to introduce biases that favor popular items or certain categories of items and dominant user groups. In this study, we aim to characterize the systematic errors of a recommendation system and how they manifest in various accountability issues, such as stereotypes, biases, and miscalibration. We propose a unified framework that distinguishes the sources of prediction errors into a set of key measures that quantify the various types of system-induced effects, at both the individual and collective levels. Based on our measuring framework, we examine the most widely adopted algorithms in the context of movie recommendation. Our research reveals three important findings: (1) Differences between algorithms: recommendations generated by simpler algorithms tend to be more stereotypical but less biased than those generated by more complex algorithms. (2) Disparate impact on groups and individuals: system-induced biases and stereotypes have a disproportionate effect on atypical users and minority groups (e.g., women and older users). (3) Mitigation opportunity: using structural equation modeling, we identify the interactions between user characteristics (typicality and diversity), system-induced effects, and miscalibration. We further investigate the possibility of mitigating system-induced effects by oversampling underrepresented groups and individuals, which was found to be effective in reducing stereotypes and improving recommendation quality. Our research is the first systematic examination of not only system-induced effects and miscalibration but also the stereotyping issue in recommender systems.