

Title: Learning network event sequences using long short‐term memory and second‐order statistic loss
Abstract

Modeling temporal event sequences on the vertices of a network is an important problem with widespread applications; examples include modeling influence in social networks, preventing crimes by modeling their space–time occurrences, and forecasting earthquakes. Existing solutions to this problem take a parametric approach, whose applicability is limited to event sequences that follow well-known distributions, which is not the case for many real-life event datasets. To overcome this limitation, we propose a composite recurrent neural network model for learning events occurring on the vertices of a network over time. Our proposed model combines two long short-term memory units to capture the base intensity and the conditional intensity of an event sequence. We also introduce a second-order statistic loss that penalizes divergence between the distributions of hop-count distances between consecutive events in the generated and target sequences. Given a sequence of vertices of a network in which an event has occurred, the proposed model predicts the vertex where the next event is most likely to occur. Experimental results on synthetic and real-world datasets validate the superiority of our proposed model over various baseline methods.
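As a rough illustration of the architecture described in the abstract, the PyTorch sketch below wires two LSTMs over a vertex sequence and computes a histogram-based hop-count statistic. All names (CompositeLSTM, hop_count_divergence) and layer sizes are illustrative assumptions, not the authors' implementation; the hop-count statistic as written operates on integer hop distances, so a differentiable relaxation (e.g., a soft histogram) would be needed to use it as a training loss.

```python
import torch
import torch.nn as nn

class CompositeLSTM(nn.Module):
    """Two LSTMs over the vertex sequence: one intended for the slowly
    varying base intensity, one for the history-dependent conditional
    intensity; their states are combined into next-vertex logits."""
    def __init__(self, num_vertices, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(num_vertices, embed_dim)
        self.base_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.cond_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(2 * hidden_dim, num_vertices)

    def forward(self, vertex_seq):                  # (B, T) vertex ids
        x = self.embed(vertex_seq)                  # (B, T, E)
        base_h, _ = self.base_lstm(x)
        cond_h, _ = self.cond_lstm(x)
        return self.out(torch.cat([base_h, cond_h], dim=-1))  # (B, T, V)

def hop_count_divergence(gen_hops, tgt_hops, max_hops):
    """Squared L2 distance between the normalized histograms of hop-count
    distances between consecutive events (generated vs. target)."""
    p = torch.bincount(gen_hops, minlength=max_hops).float()
    q = torch.bincount(tgt_hops, minlength=max_hops).float()
    return ((p / p.sum() - q / q.sum()) ** 2).sum()

# Usage: logits at the last step rank candidate vertices for the next event.
model = CompositeLSTM(num_vertices=100)
logits = model(torch.randint(0, 100, (4, 20)))
next_vertex = logits[:, -1].argmax(dim=-1)
```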

 
Award ID(s):
1737585, 1737996
NSF-PAR ID:
10382535
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Statistical Analysis and Data Mining: The ASA Data Science Journal
Volume:
14
Issue:
1
ISSN:
1932-1864
Format(s):
Medium: X
Size(s):
p. 61-73
Sponsoring Org:
National Science Foundation
More Like this
  1. Large quantities of asynchronous event sequence data, such as crime records, emergency call logs, and financial transactions, are becoming increasingly available from various fields. These event sequences often exhibit both long-term and short-term temporal dependencies. Variations of neural-network-based temporal point processes have been widely used for modeling such asynchronous event sequences. However, many current architectures, including attention-based point processes, struggle with long event sequences due to computational inefficiency. To tackle this challenge, we propose an efficient sparse transformer Hawkes process (STHP), which has two components. For the first component, a transformer with a novel temporal sparse self-attention mechanism is applied to event sequences with arbitrary intervals, mainly focusing on short-term dependencies. For the second component, a transformer is applied to the time series of aggregated event counts, primarily targeting the extraction of long-term periodic dependencies. The two components complement each other and are fused to model the conditional intensity function of a point process for future event forecasting. Experiments on real-world datasets show that the proposed STHP outperforms baselines and achieves a significant improvement in computational efficiency without sacrificing prediction performance on long sequences.
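To make the two-component idea concrete, here is a minimal PyTorch sketch: one transformer encoder with a windowed (sparse) causal attention mask over individual events, one over aggregated event counts, fused into a conditional intensity. The window-based sparsity pattern, layer sizes, and all names are assumptions for illustration; STHP's actual sparse attention is defined in the paper.

```python
import torch
import torch.nn as nn

class TwoStreamHawkes(nn.Module):
    def __init__(self, d_model=32, nhead=4, window=8, max_count=50):
        super().__init__()
        self.window = window
        self.time_proj = nn.Linear(1, d_model)      # embed inter-event gaps
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.event_enc = nn.TransformerEncoder(layer, num_layers=2)   # short-term
        self.count_emb = nn.Embedding(max_count, d_model)
        layer2 = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.count_enc = nn.TransformerEncoder(layer2, num_layers=2)  # long-term
        self.intensity = nn.Sequential(nn.Linear(2 * d_model, 1), nn.Softplus())

    def forward(self, gaps, counts):
        # gaps: (B, T, 1) inter-event times; counts: (B, K) counts per time bin
        T = gaps.size(1)
        idx = torch.arange(T, device=gaps.device)
        q, k = idx[:, None], idx[None, :]
        # Sparse causal mask: event t attends only to the last `window` events.
        mask = (k > q) | (q - k >= self.window)     # True = blocked
        h_short = self.event_enc(self.time_proj(gaps), mask=mask)
        h_long = self.count_enc(self.count_emb(counts)).mean(1, keepdim=True)
        h = torch.cat([h_short, h_long.expand(-1, T, -1)], dim=-1)
        return self.intensity(h)                    # (B, T, 1) intensities

lam = TwoStreamHawkes()(torch.rand(2, 16, 1), torch.randint(0, 50, (2, 12)))
```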
  2. Continuous-time event data are common in applications such as individual behavior data, financial transactions, and medical health records. Modeling such data can be very challenging, particularly for applications with many different types of events, since a model must predict the event types as well as the times of occurrence. Recurrent neural networks that parameterize time-varying intensity functions are the current state of the art for predictive modeling with such data. These models typically assume that all event sequences come from the same data distribution. However, in many applications event sequences are generated by different sources, or users, whose characteristics can be very different. In this paper, we extend the broad class of neural marked point process models to mixtures of latent embeddings, where each mixture component models the characteristic traits of a given user. Our approach relies on augmenting these models with a latent variable that encodes user characteristics, represented by a mixture model over user behavior that is trained via amortized variational inference. We evaluate our methods on four large real-world datasets and demonstrate systematic improvements over existing work across a variety of predictive metrics, such as log-likelihood, next-event ranking, and source-of-sequence identification.
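A compressed PyTorch sketch of the latent-mixture idea, assuming a GRU encoder that produces responsibilities over K mixture components whose embeddings condition an event decoder. The variational training objective (ELBO) and the time model are omitted, and all names are hypothetical rather than the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureMarkedPP(nn.Module):
    def __init__(self, num_marks, K=5, d=32):
        super().__init__()
        self.mark_emb = nn.Embedding(num_marks, d)
        self.encoder = nn.GRU(d, d, batch_first=True)
        self.to_resp = nn.Linear(d, K)                 # q(z | sequence)
        self.component_emb = nn.Parameter(torch.randn(K, d))
        self.decoder = nn.GRU(2 * d, d, batch_first=True)
        self.to_mark = nn.Linear(d, num_marks)

    def forward(self, marks):                          # (B, T) mark ids
        x = self.mark_emb(marks)                       # (B, T, d)
        _, h = self.encoder(x)                         # (1, B, d)
        resp = F.softmax(self.to_resp(h[-1]), dim=-1)  # (B, K) responsibilities
        z = resp @ self.component_emb                  # soft user embedding
        zx = torch.cat([x, z[:, None].expand_as(x)], dim=-1)
        out, _ = self.decoder(zx)
        return self.to_mark(out), resp                 # per-step next-mark logits

logits, resp = MixtureMarkedPP(num_marks=20)(torch.randint(0, 20, (3, 15)))
```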
  3. This paper proposes a new meta-learning method named HARMLESS (HAwkes Relational Meta LEarning method for Short Sequences) for learning heterogeneous point process models from short event sequence data along with a relational network. Specifically, we propose a hierarchical Bayesian mixture Hawkes process model, which naturally incorporates the relational information among sequences into point process modeling. Compared with existing methods, our model can capture the underlying mixed-community patterns of the relational network, which simultaneously encourages knowledge sharing among sequences and facilitates adaptive learning for each individual sequence. We further propose an efficient stochastic variational meta expectation-maximization algorithm that can scale to large problems. Numerical experiments on both synthetic and real data show that HARMLESS outperforms existing methods in predicting future events.
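As background for the mixture construction, a toy NumPy sketch of a mixture-of-Hawkes intensity with exponential kernels follows; the per-component parameters and weights are made-up values, and the hierarchical Bayesian machinery and meta-learning algorithm of HARMLESS are not reproduced here.

```python
import numpy as np

def hawkes_intensity(t, events, mu, alpha, beta):
    """lambda(t) = mu + sum_{t_j < t} alpha * exp(-beta * (t - t_j))."""
    past = events[events < t]
    return mu + alpha * np.exp(-beta * (t - past)).sum()

def mixture_intensity(t, events, weights, params):
    """Intensity averaged over component responsibilities `weights`."""
    return sum(w * hawkes_intensity(t, events, *p)
               for w, p in zip(weights, params))

events = np.array([0.5, 1.2, 2.0, 2.1])
params = [(0.2, 0.8, 1.0), (0.5, 0.3, 2.0)]  # (mu, alpha, beta) per component
print(mixture_intensity(2.5, events, weights=[0.6, 0.4], params=params))
```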
  4. Abstract

    Event logs, comprising data on the occurrences of different types of events and their associated times, are commonly collected during the operation of modern industrial machines and systems. It is widely believed that the rich information embedded in event logs can be used to predict the occurrence of critical events. In this paper, we propose a recurrent neural network model that uses time-to-event data from event logs not only to predict the time of occurrence of a target event of interest, but also to interpret, from the trained model, the significant events leading up to the target event. To improve the performance of our model, we employ sampling techniques and methods for handling censored data. The proposed model is tested on both simulated data and real-world datasets. Through these comparison studies, we show that the deep learning approach can often achieve better prediction performance than traditional statistical models such as the Cox proportional hazards model. The real-world case study also shows that the model interpretation algorithm proposed in this work can reveal the underlying physical relationships among events.
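A minimal PyTorch sketch in the spirit of this abstract: an LSTM maps a history of event types to a predicted time until the target event, with a simple one-sided loss for right-censored records. The censoring treatment and all names are common simplifications assumed here, not necessarily the authors' method.

```python
import torch
import torch.nn as nn

class TimeToEventRNN(nn.Module):
    def __init__(self, num_event_types, d=32):
        super().__init__()
        self.emb = nn.Embedding(num_event_types, d)
        self.lstm = nn.LSTM(d, d, batch_first=True)
        self.head = nn.Sequential(nn.Linear(d, 1), nn.Softplus())  # time > 0

    def forward(self, events):                   # (B, T) event-type ids
        h, _ = self.lstm(self.emb(events))
        return self.head(h).squeeze(-1)          # predicted time-to-target

def censored_loss(pred, observed, censored):
    """Squared error for fully observed times; for right-censored records,
    penalize only predictions that fall short of the censoring time."""
    err = pred - observed
    return torch.where(censored, torch.relu(-err) ** 2, err ** 2).mean()
```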

     