Point-of-Interest (POI) recommendation, pivotal for guiding users to the next location of interest, grapples with the persistent challenge of data sparsity. Whereas knowledge graphs (KGs) have emerged as a favored tool to mitigate the issue, existing KG-based methods tend to overlook two crucial elements: the intention steering users’ location choices and the high-order topological structure within the KG. In this paper, we craft an Intention-aware Knowledge Graph (IKG) that harmonizes users’ visit histories, movement trajectories, and location categories to model user intentions. Building upon IKG, our novel Intention-aware Knowledge Graph Network (IKGN) weighs and propagates node embeddings through an attention mechanism, capturing the unique locational intent of each user. A sequential model such as a GRU is then employed to represent users’ short- and long-term location preferences. An empirical study on two real-world datasets validates the effectiveness of the proposed IKGN, which markedly outperforms seven baseline models on both Recall and NDCG.
Contrastive Trajectory Learning for Tour Recommendation
The main objective of Personalized Tour Recommendation (PTR) is to generate a sequence of points-of-interest (POIs) for a particular tourist, according to user-specific constraints such as trip duration, start and end points, the number of attractions to visit, and so on. Previous PTR solutions are based either on heuristics that solve the orienteering problem to maximize a global reward within a specified budget, or on approaches that learn user visiting preferences and transition patterns with stochastic processes or recurrent neural networks. However, existing learning methodologies rely on historical trips to train the model and use the next visited POI as the supervision signal, which may not fully capture the coherence of preferences and can thus recommend similar trips to different users, primarily due to the data sparsity problem and the long-tailed distribution of POI popularity. This work presents a novel tour recommendation model that distills knowledge and supervision signals from the trips in a self-supervised manner. We propose Contrastive Trajectory Learning for Tour Recommendation (CTLTR), which utilizes the intrinsic POI dependencies and traveling intent to discover extra knowledge and augments the sparse data by pre-training with auxiliary self-supervised objectives. CTLTR provides a principled way to characterize the inherent data correlations while tackling the implicit feedback and weak supervision problems by learning robust representations applicable to tour planning. We introduce a hierarchical recurrent encoder-decoder to identify tourists’ intentions and use a contrastive loss to discover subsequence semantics and their sequential patterns by maximizing mutual information. Additionally, we observe that a data augmentation step, as a preliminary to contrastive learning, can address the overfitting issue resulting from data sparsity.
We conduct extensive experiments on a range of real-world datasets and demonstrate that our model can significantly improve the recommendation performance over the state-of-the-art baselines in terms of both recommendation accuracy and visiting orders.
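The contrastive objective mentioned in the abstract can be illustrated with a minimal sketch. It assumes an InfoNCE-style loss, a common lower bound on mutual information; the abstract does not state which loss CTLTR actually uses. The function names, the cosine similarity, and the `temperature` value are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (assumed form, not the paper's).

    anchor:    embedding of a trip (or sub-trajectory)
    positive:  embedding of an augmented view of the same trip
    negatives: embeddings of other trips
    The loss is small when the anchor is closer to its positive
    than to any of the negatives.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

Minimizing this quantity pulls two views of the same trajectory together and pushes different trajectories apart, which is the general mechanism by which contrastive pre-training extracts extra supervision from sparse trip data.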
- Award ID(s): 2030249
- PAR ID: 10403311
- Date Published:
- Journal Name: ACM Transactions on Intelligent Systems and Technology
- Volume: 13
- Issue: 1
- ISSN: 2157-6904
- Page Range / eLocation ID: 1 to 25
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Collaborative filtering (CF) is a prevalent technique in recommender systems (RSs) and has been extensively deployed in various real-world applications. A recent line of work in CF focuses on improving the quality of representations from the perspective of alignment and uniformity on the hypersphere for enhanced recommendation performance. It promotes alignment to increase the similarity between representations of interacting users and items, and enhances uniformity so that user and item representations are more uniformly distributed within their respective hyperspheres. However, although alignment and uniformity are enforced by two different optimization objectives, they jointly constitute the supervision signals for model training. Models trained with only the supervision signals in labeled data can inevitably overfit the noise introduced by label sampling variance, even with i.i.d. datasets. This overfitting to noise further compromises the model's generalizability and performance on unseen testing data. To address this issue, in this study, we aim to mitigate the effect of sampling variance in labeled training data to improve representation generalizability from the perspective of alignment and uniformity. Representations with more generalized alignment and uniformity further lead to improved model performance on testing data. Specifically, we model the data as a user-item interaction bipartite graph and apply a graph neural network (GNN) to learn the user and item representations. This graph modeling approach allows us to integrate self-supervised signals into the RS by performing self-supervised contrastive learning on the user and item representations from the perspective of label-irrelevant alignment and uniformity. Since the representations are less dependent on label supervision, they can capture more label-irrelevant data structures and patterns, leading to more generalized alignment and uniformity.
We conduct extensive experiments on three benchmark datasets to demonstrate the superiority of our framework (i.e., improved performance and faster convergence speed). Our code is available at https://github.com/zyouyang/AUPlus
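The two quantities this abstract builds on, alignment and uniformity, can be sketched as follows. This assumes the widely used formulation on unit-normalized embeddings: alignment as the mean squared distance between paired user and item vectors, and uniformity as the log of the mean Gaussian kernel over all pairs. The abstract does not confirm that the framework uses exactly these forms; the function names and the default `t=2.0` are illustrative assumptions.

```python
import math
from itertools import combinations


def normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def alignment(user_vecs, item_vecs):
    """Mean squared distance between interacting (user, item) pairs.

    user_vecs[k] and item_vecs[k] are assumed to be one observed
    interaction. Lower means interacting pairs sit closer together.
    """
    pairs = list(zip(user_vecs, item_vecs))
    return sum(sq_dist(normalize(u), normalize(i)) for u, i in pairs) / len(pairs)


def uniformity(vecs, t=2.0):
    """Log of the mean Gaussian kernel over all distinct pairs.

    Lower (more negative) means the vectors are spread more evenly
    over the hypersphere.
    """
    unit = [normalize(v) for v in vecs]
    pairs = list(combinations(unit, 2))
    return math.log(sum(math.exp(-t * sq_dist(a, b)) for a, b in pairs) / len(pairs))
```

In this formulation a training objective would typically minimize alignment over interacting pairs plus a weighted sum of the uniformity terms for users and for items; the weighting is a design choice not specified here.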
-
Context has been recognized as an important factor to consider in personalized recommender systems. Particularly in location-based services (LBSs), a fundamental task is to recommend to a mobile user where he/she may be interested in visiting next, at the right time. Additionally, location-based social networks (LBSNs) allow users to share location-embedded information with friends, who often co-occur at the same or nearby points-of-interest (POIs) or share similar POI visiting histories, as suggested by social homophily theory and Tobler’s first law of geography. Therefore, both time information and LBSN friendship relations should be utilized for POI recommendation. Tensor completion has recently gained attention in time-aware recommender systems. It decomposes a user-item-time tensor into low-rank embedding matrices of users, items, and times using the observed entries, so that the underlying low-rank subspace structure can be tracked to fill in the missing entries for time-aware recommendation. However, these tensor completion methods ignore the social-spatial context information available in LBSNs, which is important for POI recommendation since people tend to share their preferences with their friends, and near things are more related than distant things. In this paper, we utilize the side information of social networks and POI locations to enhance the tensor completion model paradigm for more effective time-aware POI recommendation. Specifically, we propose a regularization loss head based on a novel social Hausdorff distance function to optimize the reconstructed tensor. We also quantify the popularity of different POIs with location entropy to prevent very popular POIs from being over-represented and thereby suppressing the appearance of other, more diverse POIs.
To address the sensitivity of negative sampling, we train the model on the whole data by treating all unlabeled entries in the observed tensor as negative, and reformulate the loss function to reduce the computational cost. Through extensive experiments on real datasets, we demonstrate the superiority of our model over state-of-the-art tensor completion methods.
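The location entropy mentioned in this abstract can be sketched as below. It assumes the usual definition, the Shannon entropy of the distribution of a POI's check-ins over users; the abstract does not spell out the exact formula or how the score enters the loss. The `(user, poi)` input format and the function name are hypothetical.

```python
import math
from collections import Counter


def location_entropy(checkins):
    """Shannon entropy of each POI's visitor distribution.

    checkins: iterable of (user, poi) pairs, one per visit.
    Returns a dict mapping each POI to its entropy. A POI visited by
    many different users in similar proportions scores high; a POI
    visited by a single user scores zero.
    """
    visits_per_poi = {}
    for user, poi in checkins:
        visits_per_poi.setdefault(poi, Counter())[user] += 1

    entropy = {}
    for poi, counts in visits_per_poi.items():
        total = sum(counts.values())
        entropy[poi] = -sum(
            (c / total) * math.log(c / total) for c in counts.values()
        )
    return entropy
```

A score of this kind could then be used to down-weight POIs that almost everyone visits, so that they do not crowd out less popular candidates.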
-
Social relations are often used to improve recommendation quality when user-item interaction data is sparse in recommender systems. Most existing social recommendation models exploit pairwise relations to mine potential user preferences. However, real-life interactions among users are very complex, and user relations can be high-order. Hypergraphs provide a natural way to model high-order relations, yet their potential for improving social recommendation is under-explored. In this paper, we fill this gap and propose a multi-channel hypergraph convolutional network to enhance social recommendation by leveraging high-order user relations. Technically, each channel in the network encodes a hypergraph that depicts a common high-order user relation pattern via hypergraph convolution. By aggregating the embeddings learned through multiple channels, we obtain comprehensive user representations to generate recommendation results. However, the aggregation operation might also obscure the inherent characteristics of different types of high-order connectivity information. To compensate for the aggregation loss, we integrate self-supervised learning into the training of the hypergraph convolutional network to regain the connectivity information with hierarchical mutual information maximization. Extensive experiments on multiple real-world datasets demonstrate the superiority of the proposed model over the current SOTA methods, and the ablation study verifies the effectiveness and rationale of the multi-channel setting and the self-supervised task. The implementation of our model is available via https://github.com/Coder-Yu/RecQ.
-
Avidan, S. (Ed.) Despite the success of fully-supervised human skeleton sequence modeling, utilizing self-supervised pre-training for skeleton sequence representation learning has been an active field because acquiring task-specific skeleton annotations at large scales is difficult. Recent studies focus on learning video-level temporal and discriminative information using contrastive learning, but overlook the hierarchical spatial-temporal nature of human skeletons. Different from such superficial supervision at the video level, we propose a self-supervised hierarchical pre-training scheme incorporated into a hierarchical Transformer-based skeleton sequence encoder (Hi-TRS), to explicitly capture spatial, short-term, and long-term temporal dependencies at frame, clip, and video levels, respectively. To evaluate the proposed self-supervised pre-training scheme with Hi-TRS, we conduct extensive experiments covering three skeleton-based downstream tasks including action recognition, action detection, and motion prediction. Under both supervised and semi-supervised evaluation protocols, our method achieves state-of-the-art performance. Additionally, we demonstrate that the prior knowledge learned by our model in the pre-training stage has strong transfer capability for different downstream tasks.