skip to main content

Title: LiRa: A New Likelihood-Based Similarity Score For Collaborative Filtering
Recommender system data presents unique challenges to the data mining, machine learning, and algorithms communities. The high missing data rate, in combination with the large scale and high dimensionality that is typical of recommender systems data, requires new tools and methods for efficient data analysis. Here, we address the challenge of evaluating similarity between two users in a recommender system, where for each user only a small set of ratings is available. We present a new similarity score, that we call LiRa, based on a statistical model of user similarity, for large-scale, discrete valued data with many missing values. We show that this score, based on a ratio of likelihoods, is more effective at identifying similar users than traditional similarity scores in user-based collaborative filtering, such as the Pearson correlation coefficient. We argue that our approach has significant potential to improve both accuracy and scalability in collaborative filtering.
Authors:
; ; ; ;
Award ID(s):
1637564
Publication Date:
NSF-PAR ID:
10040900
Journal Name:
LSRS'16: RecSys Workshop on Large-Scale Recommender Systems (10th ACM Conference on Recommender Systems:)
Sponsoring Org:
National Science Foundation
More Like this
  1. Recommender systems predict users’ preferences over a large number of items by pooling similar information from other users and/or items in the presence of sparse observations. One major challenge is how to utilize user-item specific covariates and networks describing user-item interactions in a high-dimensional situation, for accurate personalized prediction. In this article, we propose a smooth neighborhood recommender in the framework of the latent factor models. A similarity kernel is utilized to borrow neighborhood information from continuous covariates over a user-item specific network, such as a user’s social network, where the grouping information defined by discrete covariates is also integrated through the network. Consequently, user-item specific information is built into the recommender to battle the ‘cold-start” issue in the absence of observations in collaborative and content- based filtering. Moreover, we utilize a “divide-and-conquer” version of the alternating least squares algorithm to achieve scalable computation, and establish asymptotic results for the proposed method, demonstrating that it achieves superior prediction accuracy. Finally, we illustrate that the proposed method improves substantially over its competitors in simulated examples and real benchmark data–Last.fm music data.
  2. Offline evaluation protocols for recommender systems are intended to estimate users' satisfaction with recommendations using static data from prior user interactions. These evaluations allow researchers and production developers to carry out first-pass estimates of the likely performance of a new system and weed out bad ideas before presenting them to users. However, offline evaluations cannot accurately assess novel, relevant recommendations, because the most novel recommendations items that were previously unknown to the user; such items are missing from the historical data, so they cannot be judged as relevant. A breakthrough that reliably produces novel, relevant recommendations would score poorly with current offline evaluation techniques. While the existence of this problem is noted in the literature, its extent is not well-understood. We present a simulation study to estimate the error that such missing data causes in commonly-used evaluation metrics in order to assess its prevalence and impact. We find that missing data in the rating or observation process causes the evaluation protocol to systematically mis-estimate metric values, and in some cases erroneously determine that a popularity-based recommender outperforms even a perfect personalized recommender. Substantial breakthroughs in recommendation quality, therefore, will be difficult to assess with existing offline techniques.
  3. Implicit feedback is widely used in collaborative filtering methods for recommendation. It is well known that implicit feedback contains a large number of values that are missing not at random (MNAR); and the missing data is a mixture of negative and unknown feedback, making it difficult to learn users’ negative preferences. Recent studies modeled exposure, a latent missingness variable which indicates whether an item is exposed to a user, to give each missing entry a confidence of being negative feedback. However, these studies use static models and ignore the information in temporal dependencies among items, which seems to be an essential underlying factor to subsequent missingness. To model and exploit the dynamics of missingness, we propose a latent variable named “user intent” to govern the temporal changes of item missingness, and a hidden Markov model to represent such a process. The resulting framework captures the dynamic item missingness and incorporate it into matrix factorization (MF) for recommendation. We also explore two types of constraints to achieve a more compact and interpretable representation of user intents. Experiments on real-world datasets demonstrate the superiority of our method against state-of-the-art recommender systems.
  4. Cross-domain collaborative filtering recommenders exploit data from other domains (e.g., movie ratings) to predict users’ interests in a different target domain (e.g., suggest music). Most current cross-domain recommenders focus on modeling user ratings but pay limited attention to user reviews. Additionally, due to the complexity of these recommender systems, they cannot provide any information to users to support user decisions. To address these challenges, we propose Deep Hybrid Cross Domain (DHCD) model, a cross-domain neural framework, that can simultaneously predict user ratings, and provide useful information to strengthen the suggestions and support user decision across multiple domains. Specifically, DHCD enhances the predicted ratings by jointly modeling two crucial facets of users’ product assessment: ratings and reviews. To support decisions, it models and provides natural review-like sentences across domains according to user interests and item features. This model is robust in integrating user rating and review information from more than two domains. Our extensive experiments show that DHCD can significantly outperform advanced baselines in rating predictions and review generation tasks. For rating prediction tasks, it outperforms cross-domain and single-domain collaborative filtering as well as hybrid recommender systems. Furthermore, our review generation experiments suggest an improved perplexity score and transfer of reviewmore »information in DHCD.« less
  5. The prevalence of mobile phones and wearable devices enables the passive capturing and modeling of human behavior at an unprecedented resolution and scale. Past research has demonstrated the capability of mobile sensing to model aspects of physical health, mental health, education, and work performance, etc. However, most of the algorithms and models proposed in previous work follow a one-size-fits-all (i.e., population modeling) approach that looks for common behaviors amongst all users, disregarding the fact that individuals can behave very differently, resulting in reduced model performance. Further, black-box models are often used that do not allow for interpretability and human behavior understanding. We present a new method to address the problems of personalized behavior classification and interpretability, and apply it to depression detection among college students. Inspired by the idea of collaborative-filtering, our method is a type of memory-based learning algorithm. It leverages the relevance of mobile-sensed behavior features among individuals to calculate personalized relevance weights, which are used to impute missing data and select features according to a specific modeling goal (e.g., whether the student has depressive symptoms) in different time epochs, i.e., times of the day and days of the week. It then compiles features from epochs using majoritymore »voting to obtain the final prediction. We apply our algorithm on a depression detection dataset collected from first-year college students with low data-missing rates and show that our method outperforms the state-of-the-art machine learning model by 5.1% in accuracy and 5.5% in F1 score. We further verify the pipeline-level generalizability of our approach by achieving similar results on a second dataset, with an average improvement of 3.4% across performance metrics. Beyond achieving better classification performance, our novel approach is further able to generate personalized interpretations of the models for each individual. These interpretations are supported by existing depression-related literature and can potentially inspire automated and personalized depression intervention design in the future« less