skip to main content


The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Friday, May 17 until 8:00 AM ET on Saturday, May 18 due to maintenance. We apologize for the inconvenience.

Title: Exploring author gender in book rating and recommendation
Collaborative filtering algorithms find useful patterns in rating and consumption data and exploit these patterns to guide users to good items. Many of these patterns reflect important real-world phenomena driving interactions between the various users and items; other patterns may be irrelevant or reflect undesired discrimination, such as discrimination in publishing or purchasing against authors who are women or ethnic minorities. In this work, we examine the response of collaborative filtering recommender algorithms to the distribution of their input data with respect to one dimension of social concern, namely content creator gender. Using publicly available book ratings data, we measure the distribution of the genders of the authors of books in user rating profiles and recommendation lists produced from this data. We find that common collaborative filtering algorithms tend to propagate at least some of each user’s tendency to rate or read male or female authors into their resulting recommendations, although they differ in both the strength of this propagation and the variance in the gender balance of the recommendation lists they produce. The data, experimental design, and statistical methods are designed to be reusable for studying potentially discriminatory social dimensions of recommendations in other domains and settings as well.  more » « less
Award ID(s):
Author(s) / Creator(s):
Date Published:
Journal Name:
User Modeling and User-Adapted Interaction
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we consider the Collaborative Ranking (CR) problem for recommendation systems. Given a set of pairwise preferences between items for each user, collaborative ranking can be used to rank un-rated items for each user, and this ranking can be naturally used for recommendation. It is observed that collaborative ranking algorithms usually achieve better performance since they directly minimize the ranking loss; however, they are rarely used in practice due to the poor scalability. All the existing CR algorithms have time complexity at least O(|Ω|r) per iteration, where r is the target rank and |Ω| is number of pairs which grows quadratically with number of ratings per user. For example, the Netflix data contains totally 20 billion rating pairs, and at this scale all the current algorithms have to work with significant subsampling, resulting in poor prediction on testing data. In this paper, we propose a new collaborative ranking algorithm called Primal-CR that reduces the time complexity toO(|Ω|+d1d2r), where d1 is number of users and d2 is the averaged number of items rated by a user. Note that d1, d2 is strictly smaller and open much smaller than |Ω|. Furthermore, by exploiting the fact that most data is in the form of numerical ratings instead of pairwise comparisons, we propose Primal-CR++ with O(d1d2(r + log d2)) time complexity. Both algorithms have better theoretical time complexity than existing approaches and also outperform existing approaches in terms of NDCG and pairwise error on real data sets. To the best of our knowledge, this is the first collaborative ranking algorithm capable of working on the full Netflix dataset using all the 20 billion rating pairs, and this leads to a model with much better recommendation compared with previous models trained on subsamples. Finally, compared with classical matrix factorization algorithm which also requires O(d1 d2r) time, our algorithm has almost the same efficiency while making much better recommendations since we consider the ranking loss. 
    more » « less
  2. null (Ed.)
    Item-based models are among the most popular collaborative filtering approaches for building recommender systems. Random walks can provide a powerful tool for harvesting the rich network of interactions captured within these models. They can exploit indirect relations between the items, mitigate the effects of sparsity, ensure wider itemspace coverage, as well as increase the diversity of recommendation lists. Their potential however, can be hindered by the tendency of the walks to rapidly concentrate towards the central nodes of the graph, thereby significantly restricting the range of K -step distributions that can be exploited for personalized recommendations. In this work, we introduce RecWalk ; a novel random walk-based method that leverages the spectral properties of nearly uncoupled Markov chains to provably lift this limitation and prolong the influence of users’ past preferences on the successive steps of the walk—thereby allowing the walker to explore the underlying network more fruitfully. A comprehensive set of experiments on real-world datasets verify the theoretically predicted properties of the proposed approach and indicate that they are directly linked to significant improvements in top- n recommendation accuracy. They also highlight RecWalk’s potential in providing a framework for boosting the performance of item-based models. RecWalk achieves state-of-the-art top- n recommendation quality outperforming several competing approaches, including recently proposed methods that rely on deep neural networks. 
    more » « less
  3. Large-scale online recommendation systems must facilitate the allocation of a limited number of items among competing users while learning their preferences from user feedback. As a principled way of incorporating market constraints and user incentives in the design, we consider our objectives to be two-fold: maximal social welfare with minimal instability. To maximize social welfare, our proposed framework enhances the quality of recommendations by exploring allocations that optimistically maximize the rewards. To minimize instability, a measure of users' incentives to deviate from recommended allocations, the algorithm prices the items based on a scheme derived from the Walrasian equilibria. Though it is known that these equilibria yield stable prices for markets with known user preferences, our approach accounts for the inherent uncertainty in the preferences and further ensures that the users accept their recommendations under offered prices. To the best of our knowledge, our approach is the first to integrate techniques from combinatorial bandits, optimal resource allocation, and collaborative filtering to obtain an algorithm that achieves sub-linear social welfare regret as well as sub-linear instability. Empirical studies on synthetic and real-world data also demonstrate the efficacy of our strategy compared to approaches that do not fully incorporate all these aspects. 
    more » « less
  4. Social recommendation task aims to predict users' preferences over items with the incorporation of social connections among users, so as to alleviate the sparse issue of collaborative filtering. While many recent efforts show the effectiveness of neural network-based social recommender systems, several important challenges have not been well addressed yet: (i) The majority of models only consider users’ social connections, while ignoring the inter-dependent knowledge across items; (ii) Most of existing solutions are designed for singular type of user-item interactions, making them infeasible to capture the interaction heterogeneity; (iii) The dynamic nature of user-item interactions has been less explored in many social-aware recommendation techniques. To tackle the above challenges, this work proposes a Knowledge-aware Coupled Graph Neural Network (KCGN) that jointly injects the inter-dependent knowledge across items and users into the recommendation framework. KCGN enables the high-order user- and item-wise relation encoding by exploiting the mutual information for global graph structure awareness. Additionally, we further augment KCGN with the capability of capturing dynamic multi-typed user-item interactive patterns. Experimental studies on real-world datasets show the effectiveness of our method against many strong baselines in a variety of settings. Source codes are available at: 
    more » « less
  5. null (Ed.)
    Collaborative filtering has been widely used in recommender systems. Existing work has primarily focused on improving the prediction accuracy mainly via either building refined models or incorporating additional side information, yet has largely ignored the inherent distribution of the input rating data. In this paper, we propose a data debugging framework to identify overly personalized ratings whose existence degrades the performance of a given collaborative filtering model. The key idea of the proposed approach is to search for a small set of ratings whose editing (e.g., modification or deletion) would near-optimally improve the recommendation accuracy of a validation set. Experimental results demonstrate that the proposed approach can significantly improve the recommendation accuracy. Furthermore, we observe that the identified ratings significantly deviate from the average ratings of the corresponding items, and the proposed approach tends to modify them towards the average. This result sheds light on the design of future recommender systems in terms of balancing between the overall accuracy and personalization. 
    more » « less