skip to main content


Title: JNET: Learning User Representations via Joint Network Embedding and Topic Embedding
User representation learning is vital to capture diverse user preferences, while it is also challenging as user intents are latent and scattered among complex and different modalities of user-generated data, thus, not directly measurable. Inspired by the concept of user schema in social psychology, we take a new perspective to perform user representation learning by constructing a shared latent space to capture the dependency among different modalities of user-generated data. Both users and topics are embedded to the same space to encode users' social connections and text content, to facilitate joint modeling of different modalities, via a probabilistic generative framework. We evaluated the proposed solution on large collections of Yelp reviews and StackOverflow discussion posts, with their associated network structures. The proposed model outperformed several state-of-the-art topic modeling based user models with better predictive power in unseen documents, and state-of-the-art network embedding based user models with improved link prediction quality in unseen nodes. The learnt user representations are also proved to be useful in content recommendation, e.g., expert finding in StackOverflow.  more » « less
Award ID(s):
1718216 1553568
NSF-PAR ID:
10177159
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of the 13th International Conference on Web Search and Data Mining
Page Range / eLocation ID:
205 to 213
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. User modeling is critical for understanding user intents, while it is also challenging as user intents are so diverse and not directly observable. Most existing works exploit specific types of behavior signals for user modeling, e.g., opinionated data or network structure; but the dependency among different types of user-generated data is neglected. We focus on self-consistence across multiple modalities of user-generated data to model user intents. A probabilistic generative model is developed to integrate two companion learning tasks of opinionated content modeling and social network structure modeling for users. Individual users are modeled as a mixture over the instances of paired learning tasks to realize their behavior heterogeneity, and the tasks are clustered by sharing a global prior distribution to capture the homogeneity among users. Extensive experimental evaluations on large collections of Amazon and Yelp reviews with social network structures confirm the effectiveness of the proposed solution. The learned user models are interpretable and predictive: they enable more accurate sentiment classification and item/friend recommendations than the corresponding baselines that only model a singular type of user behaviors. 
    more » « less
  2. Social recommendation has achieved great success in many domains including e-commerce and location-based social networks. Existing methods usually explore the user-item interactions or user-user connections to predict users’ preference behaviors. However, they usually learn both user and item representations in Euclidean space, which has large limitations for exploring the latent hierarchical property in the data. In this article, we study a novel problem of hyperbolic social recommendation, where we aim to learn the compact but strong representations for both users and items. Meanwhile, this work also addresses two critical domain-issues, which are under-explored. First, users often make trade-offs with multiple underlying aspect factors to make decisions during their interactions with items. Second, users generally build connections with others in terms of different aspects, which produces different influences with aspects in social network. To this end, we propose a novel graph neural network (GNN) framework with multiple aspect learning, namely, HyperSoRec. Specifically, we first embed all users, items, and aspects into hyperbolic space with superior representations to ensure their hierarchical properties. Then, we adapt a GNN with novel multi-aspect message-passing-receiving mechanism to capture different influences among users. Next, to characterize the multi-aspect interactions of users on items, we propose an adaptive hyperbolic metric learning method by introducing learnable interactive relations among different aspects. Finally, we utilize the hyperbolic translational distance to measure the plausibility in each user-item pair for recommendation. Experimental results on two public datasets clearly demonstrate that our HyperSoRec not only achieves significant improvement for recommendation performance but also shows better representation ability in hyperbolic space with strong robustness and reliability. 
    more » « less
  3. Rumor spreaders are increasingly utilizing multimedia content to attract the attention and trust of news consumers. Though quite a few rumor detection models have exploited the multi-modal data, they seldom consider the inconsistent semantics between images and texts, and rarely spot the inconsistency among the post contents and background knowledge. In addition, they commonly assume the completeness of multiple modalities and thus are incapable of handling handle missing modalities in real-life scenarios. Motivated by the intuition that rumors in social media are more likely to have inconsistent semantics, a novel Knowledge-guided Dual-consistency Network is proposed to detect rumors with multimedia contents. It uses two consistency detection subnetworks to capture the inconsistency at the cross-modal level and the content-knowledge level simultaneously. It also enables robust multi-modal representation learning under different missing visual modality conditions, using a special token to discriminate between posts with visual modality and posts without visual modality. Extensive experiments on three public real-world multimedia datasets demonstrate that our framework can outperform the state-of-the-art baselines under both complete and incomplete modality conditions. 
    more » « less
  4. Rumor spreaders are increasingly utilizing multimedia content to attract the attention and trust of news consumers. Though quite a few rumor detection models have exploited the multi-modal data, they seldom consider the inconsistent semantics between images and texts, and rarely spot the inconsistency among the post contents and background knowledge. In addition, they commonly assume the completeness of multiple modalities and thus are incapable of handling handle missing modalities in real-life scenarios. Motivated by the intuition that rumors in social media are more likely to have inconsistent semantics, a novel Knowledge-guided Dual-consistency Network is proposed to detect rumors with multimedia contents. It uses two consistency detection subnetworks to capture the inconsistency at the cross-modal level and the content-knowledge level simultaneously. It also enables robust multi-modal representation learning under different missing visual modality conditions, using a special token to discriminate between posts with visual modality and posts without visual modality. Extensive experiments on three public real-world multimedia datasets demonstrate that our framework can outperform the state- of-the-art baselines under both complete and incomplete modality conditions. 
    more » « less
  5. Directed graphs have been widely used in Community Question Answering services (CQAs) to model asymmetric relationships among different types of nodes in CQA graphs, e.g., question, answer, user. Asymmetric transitivity is an essential property of directed graphs, since it can play an important role in downstream graph inference and analysis. Question difficulty and user expertise follow the characteristic of asymmetric transitivity. Maintaining such properties, while reducing the graph to a lower dimensional vector embedding space, has been the focus of much recent research. In this paper, we tackle the challenge of directed graph embedding with asymmetric transitivity preservation and then leverage the proposed embedding method to solve a fundamental task in CQAs: how to appropriately route and assign newly posted questions to users with the suitable expertise and interest in CQAs. The technique incorporates graph hierarchy and reachability information naturally by relying on a nonlinear transformation that operates on the core reachability and implicit hierarchy within such graphs. Subsequently, the methodology levers a factorization-based approach to generate two embedding vectors for each node within the graph, to capture the asymmetric transitivity. Extensive experiments show that our framework consistently and significantly outperforms the state-of-the-art baselines on three diverse realworld tasks: link prediction, and question difficulty estimation and expert finding in online forums like Stack Exchange. Particularly, our framework can support inductive embedding learning for newly posted questions (unseen nodes during training), and therefore can properly route and assign these kinds of questions to experts in CQAs. 
    more » « less