This paper proposes a new meta-learning method – named HARMLESS (HAwkes Relational Meta LEarning method for Short Sequences) for learning heterogeneous point process models from short event sequence data along with a relational network. Specifically, we propose a hierarchical Bayesian mixture Hawkes process model, which naturally incorporates the relational information among sequences into point process modeling. Compared with existing methods, our model can capture the underlying mixed-community patterns of the relational network, which simultaneously encourages knowledge sharing among sequences and facilitates adaptive learning for each individual sequence. We further propose an efficient stochastic variational meta expectation maximization algorithm that can scale to large problems. Numerical experiments on both synthetic and real data show that HARMLESS outperforms existing methods in terms of predicting the future events.
more »
« less
A Dirichlet Mixture Model of Hawkes Processes for Event Sequence Clustering
How to cluster event sequences generated via different point processes is an interesting and important problem in statistical machine learning. To solve this problem, we propose and discuss an effective model-based clustering method based on a novel Dirichlet mixture model of a special but significant type of point processes — Hawkes process. The proposed model generates the event sequences with different clusters from the Hawkes processes with different parameters, and uses a Dirichlet distribution as the prior distribution of the clusters. We prove the identifiability of our mixture model and propose an effective variational Bayesian inference algorithm to learn our model. An adaptive inner iteration allocation strategy is designed to accelerate the convergence of our algorithm. Moreover, we investigate the sample complexity and the computational complexity of our learning algorithm in depth. Experiments on both synthetic and real-world data show that the clustering method based on our model can learn structural triggering patterns hidden in asynchronous event sequences robustly and achieve superior performance on clustering purity and consistency compared to existing methods.
more »
« less
- Award ID(s):
- 1745382
- PAR ID:
- 10190740
- Date Published:
- Journal Name:
- Advances in neural information processing systems
- ISSN:
- 1049-5258
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
This paper proposes a new meta-learning method – named HARMLESS (HAwkes Relational Meta LEarning method for Short Sequences) for learning heterogeneous point process models from short event sequence data along with a relational net- work. Specifically, we propose a hierarchical Bayesian mixture Hawkes process model, which naturally incorporates the relational information among sequences into point process modeling. Compared with existing methods, our model can capture the underlying mixed-community patterns of the relational network, which simultaneously encourages knowledge sharing among sequences and facilitates adaptive learning for each individual sequence. We further propose an efficient stochastic variational meta expectation maximization algorithm that can scale to large problems. Numerical experiments on both synthetic and real data show that HARMLESS outperforms existing methods in terms of predicting the future events.more » « less
-
Abstract Identification of clusters of co‐expressed genes in transcriptomic data is a difficult task. Most algorithms used for this purpose can be classified into two broad categories: distance‐based or model‐based approaches. Distance‐based approaches typically utilize a distance function between pairs of data objects and group similar objects together into clusters. Model‐based approaches are based on using the mixture‐modeling framework. Compared to distance‐based approaches, model‐based approaches offer better interpretability because each cluster can be explicitly characterized in terms of the proposed model. However, these models present a particular difficulty in identifying a correct multivariate distribution that a mixture can be based upon. In this manuscript, we review some of the approaches used to select a distribution for the needed mixture model first. Then, we propose avoiding this problem altogether by using a nonparametric MSL (maximum smoothed likelihood) algorithm. This algorithm was proposed earlier in statistical literature but has not been, to the best of our knowledge, applied to transcriptomics data. The salient feature of this approach is that it avoids explicit specification of distributions of individual biological samples altogether, thus making the task of a practitioner easier. We performed both a simulation study and an application of the proposed algorithm to two different real datasets. When used on a real dataset, the algorithm produces a large number of biologically meaningful clusters and performs at least as well as several other mixture‐based algorithms commonly used for RNA‐seq data clustering. Our results also show that this algorithm is capable of uncovering clustering solutions that may go unnoticed by several other model‐based clustering algorithms. Our code is publicly available on Github at https://github.com/Matematikoi/non_parametric_clusteringmore » « less
-
Hawkes processes have been shown to be efficient in modeling bursty sequences in a variety of applications, such as finance and social network activity analysis. Traditionally, these models parameterize each process independently and assume that the history of each point process can be fully observed. Such models could however be inefficient or even prohibited in certain real-world applications, such as in the field of education, where such assumptions are violated. Motivated by the problem of detecting and predicting student procrastination in students Massive Open Online Courses (MOOCs) with missing and partially observed data, in this work, we propose a novel personalized Hawkes process model (RCHawkes-Gamma) that discovers meaningful student behavior clusters by jointly learning all partially observed processes simultaneously, without relying on auxiliary features. Our experiments on both synthetic and real-world education datasets show that RCHawkes-Gamma can effectively recover student clusters and their temporal procrastination dynamics, resulting in better predictive performance of future student activities. Our further analyses of the learned parameters and their association with student delays show that the discovered student clusters unveil meaningful representations of various procrastination behaviors in students.more » « less
-
Large quantities of asynchronous event sequence data such as crime records, emergence call logs, and financial transactions are becoming increasingly available from various fields. These event sequences often exhibit both long-term and short-term temporal dependencies. Variations of neural network based temporal point processes have been widely used for modeling such asynchronous event sequences. However, many current architectures including attention based point processes struggle with long event sequences due to computational inefficiency. To tackle the challenge, we propose an efficient sparse transformer Hawkes process (STHP), which has two components. For the first component, a transformer with a novel temporal sparse self-attention mechanism is applied to event sequences with arbitrary intervals, mainly focusing on short-term dependencies. For the second component, a transformer is applied to the time series of aggregated event counts, primarily targeting the extraction of long-term periodic dependencies. Both components complement each other and are fused together to model the conditional intensity function of a point process for future event forecasting. Experiments on real-world datasets show that the proposed STHP outperforms baselines and achieves significant improvement in computational efficiency without sacrificing prediction performance for long sequences.more » « less