skip to main content


Title: CHIP: A Hawkes process model for continuous-time networks with scalable and consistent estimation
In many application settings involving networks, such as messages between users of an on-line social network or transactions between traders in financial markets, the observed data consist of timestamped relational events, which form a continuous-time network. We propose the Community Hawkes Independent Pairs (CHIP) generative model for such networks. We show that applying spectral clustering to an aggregated adjacency matrix constructed from the CHIP model provides consistent community detection for a growing number of nodes and time duration. We also develop consistent and computationally efficient estimators for the model parameters. We demonstrate that our proposed CHIP model and estimation procedure scales to large networks with tens of thousands of nodes and provides superior fits than existing continuous-time network models on several real networks.  more » « less
Award ID(s):
1830547
NSF-PAR ID:
10280779
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Advances in neural information processing systems
Volume:
33
ISSN:
1049-5258
Page Range / eLocation ID:
16983-16996
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider the problem of estimating the location of a single change point in a network generated by a dynamic stochastic block model mechanism. This model produces community structure in the network that exhibits change at a single time epoch. We propose two methods of estimating the change point, together with the model parameters, before and after its occurrence. The first employs a least-squares criterion function and takes into consideration the full structure of the stochastic block model and is evaluated at each point in time. Hence, as an intermediate step, it requires estimating the community structure based on a clustering algorithm at every time point. The second method comprises the following two steps: in the first one, a least-squares function is used and evaluated at each time point, but ignoring the community structure and only considering a random graph generating mechanism exhibiting a change point. Once the change point is identified, in the second step, all network data before and after it are used together with a clustering algorithm to obtain the corresponding community structures and subsequently estimate the generating stochastic block model parameters. The first method, since it requires knowledge of the community structure and hence clustering at every point in time, is significantly more computationally expensive than the second one. On the other hand, it requires a significantly less stringent identifiability condition for consistent estimation of the change point and the model parameters than the second method; however, it also requires a condition on the misclassification rate of misallocating network nodes to their respective communities that may fail to hold in many realistic settings. Despite the apparent stringency of the identifiability condition for the second method, we show that networks generated by a stochastic block mechanism exhibiting a change in their structure can easily satisfy this condition under a multitude of scenarios, including merging/splitting communities, nodes joining another community, etc. Further, for both methods under their respective identifiability and certain additional regularity conditions, we establish rates of convergence and derive the asymptotic distributions of the change point estimators. The results are illustrated on synthetic data. In summary, this work provides an in-depth investigation of the novel problem of change point analysis for networks generated by stochastic block models, identifies key conditions for the consistent estimation of the change point, and proposes a computationally fast algorithm that solves the problem in many settings that occur in applications. Finally, it discusses challenges posed by employing clustering algorithms in this problem, that require additional investigation for their full resolution. 
    more » « less
  2. The stochastic block model (SBM) is one of the most widely used generative models for network data. Many continuous-time dynamic network models are built upon the same assumption as the SBM: edges or events between all pairs of nodes are conditionally independent given the block or community memberships, which prevents them from reproducing higher-order motifs such as triangles that are commonly observed in real networks. We propose the multivariate community Hawkes (MULCH) model, an extremely flexible community-based model for continuous-time networks that introduces dependence between node pairs using structured multivariate Hawkes processes. We fit the model using a spectral clustering and likelihood-based local refinement procedure. We find that our proposed MULCH model is far more accurate than existing models both for predictive and generative tasks. 
    more » « less
  3. Abstract

    Dynamic community detection provides a coherent description of network clusters over time, allowing one to track the growth and death of communities as the network evolves. However, modularity maximization, a popular method for performing multilayer community detection, requires the specification of an appropriate null network as well as resolution and interlayer coupling parameters. Importantly, the ability of the algorithm to accurately detect community evolution is dependent on the choice of these parameters. In functional temporal networks, where evolving communities reflect changing functional relationships between network nodes, it is especially important that the detected communities reflect any state changes of the system. Here, we present analytical work suggesting that a uniform null network provides improved sensitivity to the detection of small evolving communities in temporal networks with positive edge weights bounded above by 1, such as certain types of correlation networks. We then propose a method for increasing the sensitivity of modularity maximization to state changes in nodal dynamics by modelling self-identity links between layers based on the self-similarity of the network nodes between layers. This method is more appropriate for functional temporal networks from both a modelling and mathematical perspective, as it incorporates the dynamic nature of network nodes. We motivate our method based on applications in neuroscience where network nodes represent neurons and functional edges represent similarity of firing patterns in time. We show that in simulated data sets of neuronal spike trains, updating interlayer links based on the firing properties of the neurons provides superior community detection of evolving network structure when groups of neurons change their firing properties over time. Finally, we apply our method to experimental calcium imaging data that monitors the spiking activity of hundreds of neurons to track the evolution of neuronal communities during a state change from the awake to anaesthetized state.

     
    more » « less
  4. Batteryless sensor nodes compute, sense, and communicate using only energy harvested from the ambient. These devices promise long maintenance free operation in hard to deploy scenarios, making them an attractive alternative to battery-powered wireless sensor networks. However, complications from frequent power failures due to unpredictable ambient energy stand in the way of robust network operation. Unlike continuously-powered systems, intermittently-powered batteryless nodes lose their time upon each reboot, along with all volatile memory, making synchronization and coordination difficult. In this paper, we consider the case where each batteryless sensor is equipped with a hourglass capacitor to estimate the elapsed time between power failures. Contrary to prior work that focused on providing a continuous notion of time for a single batteryless sensor, we consider a network of batteryless sensors and explore how to provide a network-wide, continuous, and synchronous notion of time. First, we build a mathematical model that represents the estimated time between power failures by using hourglass capacitors. This allowed us to simulate the local (and continuous) time of a single batteryless node. Second, we show--through simulations--the effect of hourglass capacitors and in turn the performance degradation of the state of the art synchronization protocol in wireless sensor networks in a network of batteryless devices. 
    more » « less
  5. Abstract

    Graph representation learning has revolutionized many artificial intelligence and machine learning tasks in recent years, ranging from combinatorial optimization, drug discovery, recommendation systems, image classification, social network analysis to natural language understanding. This paper shows their efficacy in modeling relationships between products and making predictions for unseen product networks. By representing products as nodes and their relationships as edges of a graph, we show how an inductive graph neural network approach, named GraphSAGE, can efficiently learn continuous representations for nodes and edges. These representations also capture product feature information such as price, brand, and engineering attributes. They are combined with a classification model for predicting the existence of a relationship between any two products. Using a case study of the Chinese car market, we find that our method yields double the F-1 score compared to an Exponential Random Graph Model-based method for predicting the co-consideration relationship between cars. While a vanilla Graph-SAGE requires a partial network to make predictions, we augment it with an ‘adjacency prediction model’ to circumvent this limitation. This enables us to predict product relationships when no neighborhood information is known. Finally, we demonstrate how a permutation-based interpretability analysis can provide insights on how design attributes impact the predictions of relationships between products. Overall, this work provides a systematic method to predict the relationships between products in a complex engineering system.

     
    more » « less