This article presents a Hawkes process model with Markovian baseline intensities for high-frequency order book data modeling. We classified intraday order book trading events into a range of categories based on their order types and the price change after their arrivals. To capture the stimulating effects between multiple types of order book events, we use a multivariate Hawkes process to model the self- and mutually-exciting event arrivals. We also integrate Markovian baseline intensities into the event arrival dynamics by including the impacts of the order book liquidity state and a time factor on the baseline intensity. A regression-based nonparametric estimation procedure is adopted to estimate the model parameters in our Hawkes+Markovian model. To eliminate redundant model parameters, LASSO regularization is incorporated into the estimation procedure. In addition, a model selection method based on the Akaike Information Criterion is applied to evaluate the effect of each part of the proposed model. An implementation example based on real limit order book (LOB) data is provided. Through the example we studied the empirical shapes of the Hawkes excitement functions, the effects of liquidity and time factors, the LASSO variable selection, and the explanatory power of the Hawkes and Markovian elements for the dynamics of the order book.
Fast estimation of multivariate spatiotemporal Hawkes processes and network reconstruction
We present a fast, accurate estimation method for multivariate Hawkes self-exciting point processes widely used in seismology, criminology, finance and other areas. There are two major ingredients. The first is an analytic derivation of exact maximum likelihood estimates of the nonparametric triggering density. We develop this for the multivariate case and add regularization to improve stability and robustness. The second is a moment-based method for the background rate and triggering matrix estimation, which is extended here for the spatiotemporal case. Our method combines them together in an efficient way, and we prove the consistency of this new approach. Extensive numerical experiments, with synthetic data and real-world social network data, show that our method improves the accuracy, scalability and computational efficiency of prevailing estimation approaches. Moreover, it greatly boosts the performance of Hawkes process-based models on social network reconstruction and helps to understand the spatiotemporal triggering dynamics over social media.
- PAR ID: 10222747
- Date Published:
- Journal Name: Annals of the Institute of Statistical Mathematics
- ISSN: 0020-3157
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
The stochastic block model (SBM) is one of the most widely used generative models for network data. Many continuous-time dynamic network models are built upon the same assumption as the SBM: edges or events between all pairs of nodes are conditionally independent given the block or community memberships, which prevents them from reproducing higher-order motifs such as triangles that are commonly observed in real networks. We propose the multivariate community Hawkes (MULCH) model, an extremely flexible community-based model for continuous-time networks that introduces dependence between node pairs using structured multivariate Hawkes processes. We fit the model using a spectral clustering and likelihood-based local refinement procedure. We find that our proposed MULCH model is far more accurate than existing models both for predictive and generative tasks.
-
Interpretable models for criminal justice forecasting are desirable due to the high-stakes nature of the application. While interpretable models have been developed for individual-level forecasts of recidivism, interpretable models are lacking for the application of space-time crime hotspot forecasting. Here we introduce an interpretable Hawkes process model of crime that allows forecasts to capture near-repeat effects and spatial heterogeneity while being consumable in the form of easy-to-read score cards. For this purpose we employ penalized likelihood estimation of the point process with a total-variation (TV) regularization that enforces the triggering kernel to be piecewise constant. We derive an efficient expectation-maximization algorithm coupled with forward-backward splitting for the TV constraint to estimate the model. We apply our methodology to synthetic data and space-time crime data from Indianapolis. The TV-Hawkes process achieves similar accuracy to standard Hawkes process models of crime while increasing interpretability and transparency.
-
How to cluster event sequences generated via different point processes is an interesting and important problem in statistical machine learning. To solve this problem, we propose and discuss an effective model-based clustering method based on a novel Dirichlet mixture model of a special but significant type of point process, the Hawkes process. The proposed model generates event sequences in different clusters from Hawkes processes with different parameters, and uses a Dirichlet distribution as the prior distribution of the clusters. We prove the identifiability of our mixture model and propose an effective variational Bayesian inference algorithm to learn our model. An adaptive inner iteration allocation strategy is designed to accelerate the convergence of our algorithm. Moreover, we investigate the sample complexity and the computational complexity of our learning algorithm in depth. Experiments on both synthetic and real-world data show that the clustering method based on our model can learn structural triggering patterns hidden in asynchronous event sequences robustly and achieve superior performance on clustering purity and consistency compared to existing methods.
-
In many application settings involving networks, such as messages between users of an online social network or transactions between traders in financial markets, the observed data consist of timestamped relational events, which form a continuous-time network. We propose the Community Hawkes Independent Pairs (CHIP) generative model for such networks. We show that applying spectral clustering to an aggregated adjacency matrix constructed from the CHIP model provides consistent community detection for a growing number of nodes and time duration. We also develop consistent and computationally efficient estimators for the model parameters. We demonstrate that our proposed CHIP model and estimation procedure scale to large networks with tens of thousands of nodes and provide better fits than existing continuous-time network models on several real networks.

