The goal of the crime forecasting problem is to predict different types of crimes for each geographical region (like a neighborhood or censor tract) in the near future. Since nearby regions usually have similar socioeconomic characteristics which indicate similar crime patterns, recent state-of-the-art solutions constructed a distance-based region graph and utilized Graph Neural Network (GNN) techniques for crime forecasting, because the GNN techniques could effectively exploit the latent relationships between neighboring region nodes in the graph if the edges reveal high dependency or correlation. However, this distance-based pre-defined graph can not fully capture crime correlation between regions that are far from each other but share similar crime patterns. Hence, to make a more accurate crime prediction, the main challenge is to learn a better graph that reveals the dependencies between regions in crime occurrences and meanwhile captures the temporal patterns from historical crime records. To address these challenges, we propose an end-to-end graph convolutional recurrent network called HAGEN with several novel designs for crime prediction. Specifically, our framework could jointly capture the crime correlation between regions and the temporal crime dynamics by combining an adaptive region graph learning module with the Diffusion Convolution Gated Recurrent Unit (DCGRU). Based on the homophily assumption of GNN (i.e., graph convolution works better where neighboring nodes share the same label), we propose a homophily-aware constraint to regularize the optimization of the region graph so that neighboring region nodes on the learned graph share similar crime patterns, thus fitting the mechanism of diffusion convolution. Empirical experiments and comprehensive analysis on two real-world datasets showcase the effectiveness of HAGEN.
more »
« less
Inf-VAE: A Variational Autoencoder Framework to Integrate Homophily and Influence in Diffusion Prediction
Recent years have witnessed tremendous interest in understanding and predicting information spread on social media platforms such as Twitter, Facebook, etc. Existing diffusion prediction methods primarily exploit the sequential order of influenced users by projecting diffusion cascades onto their local social neighborhoods. However, this fails to capture global social structures that do not explicitly manifest in any of the cascades, resulting in poor performance for inactive users with limited historical activities. In this paper, we present a novel variational autoencoder framework (Inf-VAE) to jointly embed homophily and influence through proximity-preserving social and position-encoded temporal latent variables. To model social homophily, Inf-VAE utilizes powerful graph neural network architectures to learn social variables that selectively exploit the social connections of users. Given a sequence of seed user activations, Inf-VAE uses a novel expressive co-attentive fusion network that jointly attends over their social and temporal variables to predict the set of all influenced users. Our experimental results on multiple real-world social network datasets, including Digg, Weibo, and Stack-Exchanges demonstrate significant gains (22% MAP@10) for Inf-VAE over state-of-the-art diffusion prediction models; we achieve massive gains for users with sparse activities, and users who lack direct social neighbors in seed sets.
more »
« less
- PAR ID:
- 10160113
- Date Published:
- Journal Name:
- Proc. 2020 ACM Int. Conf. on Web Search and Data Mining (WSDM'20)
- Volume:
- 1
- Issue:
- 1
- Page Range / eLocation ID:
- 510 to 518
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
With the prevalence of interaction on social media, data compiled from these networks are perfect for analyzing social trends. One such trend that this paper aims to address is political homophily. Evidence of political homophily is well researched and indicates that people have a strong tendency to interact with others with similar political ideologies. Additionally, as links naturally form in a social network, either through recommendations or indirect interaction, new links are very likely to reinforce communities. This serves to make social media more insulated and ultimately more polarizing. We aim to address this problem by providing link recommendations that will reduce network homophily. We propose several variants of common neighbor-based link prediction algorithms that aim to recommend links to users who are similar but also would decrease homophily. We demonstrate that acceptance of these recommendations can indeed reduce the homophily of the network, whereas acceptance of link recommendations from a standard common neighbors algorithm does not.more » « less
-
Effectively predicting the size of an information cascade is critical for many applications spanning from identifying viral marketing and fake news to precise recommendation and online advertising. Traditional approaches either heavily depend on underlying diffusion models and are not optimized for popularity prediction, or use complicated hand-crafted features that cannot be easily generalized to different types of cascades. Recent generative approaches allow for understanding the spreading mechanisms, but with unsatisfactory prediction accuracy. To capture both the underlying structures governing the spread of information and inherent dependencies between re-tweeting behaviors of users, we propose a semi-supervised method, called Recurrent Cascades Convolutional Networks (CasCN), which explicitly models and predicts cascades through learning the latent representation of both structural and temporal information, without involving any other features. In contrast to the existing single, undirected and stationary Graph Convolutional Networks (GCNs), CasCN is a novel multi-directional/dynamic GCN. Our experiments conducted on real-world datasets show that CasCN significantly improves the prediction accuracy and reduces the computational cost compared to state-of-the-art approaches.more » « less
-
Decision-making on networks can be explained by both homophily and social influences. While homophily drives the formation of communities with similar characteristics, social influences occur both within and between communities. Social influences can be reasoned through role theory, which indicates that the influences among individuals depending on their roles and the behavior of interest. To operationalize these social science theories, we empirically identify the homophilous communities and use the community structures to capture such “roles”, affecting particular decision-making processes. We propose a generative model named the Stochastic Block influences Model and jointly analyzed both network formation and behavioral influences within and between different empirically-identified communities. To evaluate the performance and demonstrate the interpretability of our method, we study the adoption decisions for a microfinance product in Indian villages. We show that although individuals tend to form links within communities, there are strongly positive and negative social influences between communities, supporting the weak ties theory. Moreover, communities with shared characteristics are associated with positive influences. In contrast, communities that do not overlap are associated with negative influences. Our framework facilitates the quantification of the influences underlying decision communities and is thus a helpful tool for driving information diffusion, viral marketing, and technology adoption.more » « less
-
Influence maximization (IM) is the problem of identifying a limited number of initial influential users within a social network to maximize the number of influenced users. However, previous research has mostly focused on individual information propagation, neglecting the simultaneous and interactive dissemination of multiple information items. In reality, when users encounter a piece of information, such as a smartphone product, they often associate it with related products in their minds, such as earphones or computers from the same brand. Additionally, information platforms frequently recommend related content to users, amplifying this cascading effect and leading to multiplex influence diffusion.This paper first formulates the Multiplex Influence Maximization (Multi-IM) problem using multiplex diffusion models with an information association mechanism. In this problem, the seed set is a combination of influential users and information. To effectively manage the combinatorial complexity, we propose Graph Bayesian Optimization for Multi-IM (GBIM). The multiplex diffusion process is thoroughly investigated using a highly effective global kernelized attention message-passing module. This module, in conjunction with Bayesian linear regression (BLR), produces a scalable surrogate model. A data acquisition module incorporating the exploration-exploitation trade-off is developed to optimize the seed set further.Extensive experiments on synthetic and real-world datasets have proven our proposed framework effective. The code is available at https://github.com/zirui-yuan/GBIM.more » « less
An official website of the United States government

