skip to main content


Title: TRACE: Travel Reinforcement Recommendation Based on Location-Aware Context Extraction
As the popularity of online travel platforms increases, users tend to make ad-hoc decisions on places to visit rather than preparing the detailed tour plans in advance. Under the situation of timeliness and uncertainty of users’ demand, how to integrate real-time context into a dynamic and personalized recommendations have become a key issue in travel recommender system. In this paper, by integrating the users’ historical preferences and real-time context, a location-aware recommender system called TRACE (Travel Reinforcement Recommendations Based on Location-Aware Context Extraction) is proposed. It captures users’ features based on location-aware context learning model, and makes dynamic recommendations based on reinforcement learning. Specifically, this research: (1) designs a travel reinforcing recommender system based on an Actor-Critic framework, which can dynamically track the user preference shifts and optimize the recommender system performance; (2) proposes a location-aware context learning model, which aims at extracting user context from real-time location and then calculating the impacts of nearby attractions on users’ preferences; and (3) conducts both offline and online experiments. Our proposed model achieves the best performance in both of the two experiments, which demonstrates that tracking the users’ preference shifts based on real-time location is valuable for improving the recommendation results.  more » « less
Award ID(s):
1910696
NSF-PAR ID:
10298108
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
ACM transactions on knowledge discovery from data
ISSN:
1556-4681
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Collaborative bandit learning, i.e., bandit algorithms that utilize collaborative filtering techniques to improve sample efficiency in online interactive recommendation, has attracted much research attention as it enjoys the best of both worlds. However, all existing collaborative bandit learning solutions impose a stationary assumption about the environment, i.e., both user preferences and the dependency among users are assumed static over time. Unfortunately, this assumption hardly holds in practice due to users' ever-changing interests and dependency relations, which inevitably costs a recommender system sub-optimal performance in practice. In this work, we develop a collaborative dynamic bandit solution to handle a changing environment for recommendation. We explicitly model the underlying changes in both user preferences and their dependency relation as a stochastic process. Individual user's preference is modeled by a mixture of globally shared contextual bandit models with a Dirichlet process prior. Collaboration among users is thus achieved via Bayesian inference over the global bandit models. To balance exploitation and exploration during the interactions, Thompson sampling is used for both model selection and arm selection. Our solution is proved to maintain a standard $\tilde O(\sqrt{T})$ Bayesian regret in this challenging environment. Extensive empirical evaluations on both synthetic and real-world datasets further confirmed the necessity of modeling a changing environment and our algorithm's practical advantages against several state-of-the-art online learning solutions. 
    more » « less
  2. As widely used in data-driven decision-making, recommender systems have been recognized for their capabilities to provide users with personalized services in many user-oriented online services, such as E-commerce (e.g., Amazon, Taobao, etc.) and Social Media sites (e.g., Facebook and Twitter). Recent works have shown that deep neural networks-based recommender systems are highly vulnerable to adversarial attacks, where adversaries can inject carefully crafted fake user profiles (i.e., a set of items that fake users have interacted with) into a target recommender system to promote or demote a set of target items. Instead of generating users with fake profiles from scratch, in this paper, we introduce a novel strategy to obtain “fake” user profiles via copying cross-domain user profiles, where a reinforcement learning-based black-box attacking framework (CopyAttack+) is developed to effectively and efficiently select cross-domain user profiles from the source domain to attack the target system. Moreover, we propose to train a local surrogate system for mimicking adversarial black-box attacks in the source domain, so as to provide transferable signals with the purpose of enhancing the attacking strategy in the target black-box recommender system. Comprehensive experiments on three real-world datasets are conducted to demonstrate the effectiveness of the proposed attacking framework. 
    more » « less
  3. Context has been recognized as an important factor to consider in personalized recommender systems. Particularly in location-based services (LBSs), a fundamental task is to recommend to a mobile user where he/she could be interested to visit next at the right time. Additionally, location-based social networks (LBSNs) allow users to share location-embedded information with friends who often co-occur in the same or nearby points-of-interest (POIs) or share similar POI visiting histories, due to the social homophily theory and Tobler’s first law of geography. So, both the time information and LBSN friendship relations should be utilized for POI recommendation. Tensor completion has recently gained some attention in time-aware recommender systems. The problem decomposes a user-item-time tensor into low-rank embedding matrices of users, items and times using its observed entries, so that the underlying low-rank subspace structure can be tracked to fill the missing entries for time-aware recommendation. However, these tensor completion methods ignore the social-spatial context information available in LBSNs, which is important for POI recommendation since people tend to share their preferences with their friends, and near things are more related than distant things. In this paper, we utilize the side information of social networks and POI locations to enhance the tensor completion model paradigm for more effective time-aware POI recommendation. Specifically, we propose a regularization loss head based on a novel social Hausdorff distance function to optimize the reconstructed tensor. We also quantify the popularity of different POIs with location entropy to prevent very popular POIs from being over-represented hence suppressing the appearance of other more diverse POIs. To address the sensitivity of negative sampling, we train the model on the whole data by treating all unlabeled entries in the observed tensor as negative, and rewriting the loss function in a smart way to reduce the computational cost. Through extensive experiments on real datasets, we demonstrate the superiority of our model over state-of-the-art tensor completion methods. 
    more » « less
  4. Online dating platforms have gained widespread popularity as a means for individuals to seek potential romantic relationships. While recommender systems have been designed to improve the user experience in dating platforms by providing personalized recommendations, increasing concerns about fairness have encouraged the development of fairness-aware recommender systems from various perspectives (e.g., gender and race). However, sexual orientation, which plays a significant role in finding a satisfying relationship, is under-investigated. To fill this crucial gap, we propose a novel metric, Opposite Gender Interaction Ratio (OGIR), as a way to investigate potential unfairness for users with varying preferences towards the opposite gender. We empirically analyze a real online dating dataset and observe existing recommender algorithms could suffer from group unfairness according to OGIR. We further investigate the potential causes for such gaps in recommendation quality, which lead to the challenges of group quantity imbalance and group calibration imbalance. Ultimately, we propose a fair recommender system based on re-weighting and re-ranking strategies to respectively mitigate these associated imbalance challenges. Experimental results demonstrate both strategies improve fairness while their combination achieves the best performance towards maintaining model utility while improving fairness. 
    more » « less
  5. In this work, we propose to improve long-term user engagement in a recommender system from the perspective of sequential decision optimization, where users' click and return behaviors are directly modeled for online optimization. A bandit-based solution is formulated to balance three competing factors during online learning, including exploitation for immediate click, exploitation for expected future clicks, and exploration of unknowns for model estimation. We rigorously prove that with a high probability our proposed solution achieves a sublinear upper regret bound in maximizing cumulative clicks from a population of users in a given period of time, while a linear regret is inevitable if a user's temporal return behavior is not considered when making the recommendations. Extensive experimentation on both simulations and a large-scale real-world dataset collected from Yahoo frontpage news recommendation log verified the effectiveness and significant improvement of our proposed algorithm compared with several state-of-the-art online learning baselines for recommendation. 
    more » « less