skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Statistical inference for continuous‐time Markov processes with block structure based on discrete‐time network data
A widely used approach to modeling discrete‐time network data assumes that discrete‐time network data are generated by an unobserved continuous‐time Markov process. While such models can capture a wide range of network phenomena and are popular in social network analysis, the models are based on the homogeneity assumption that all nodes share the same parameters. We remove the homogeneity assumption by allowing nodes to belong to unobserved subsets of nodes, called blocks, and assuming that nodes in the same block have the same parameters, whereas nodes in distinct blocks have distinct parameters. The resulting models capture unobserved heterogeneity across nodes and admit model‐based clustering of nodes based on network properties chosen by researchers. We develop Bayesian data‐augmentation methods and apply them to discrete‐time observations of an ownership network of non‐financial companies in Slovenia in its critical transition from a socialist economy to a market economy. We detect a small subset of shadow‐financial companies that outpaces others in terms of the rate of change and the desire to accumulate stocks of other companies.  more » « less
Award ID(s):
1812119 1513644
PAR ID:
10456620
Author(s) / Creator(s):
 
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Statistica Neerlandica
Volume:
74
Issue:
3
ISSN:
0039-0402
Page Range / eLocation ID:
p. 342-362
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    In many application settings involving networks, such as messages between users of an on-line social network or transactions between traders in financial markets, the observed data consist of timestamped relational events, which form a continuous-time network. We propose the Community Hawkes Independent Pairs (CHIP) generative model for such networks. We show that applying spectral clustering to an aggregated adjacency matrix constructed from the CHIP model provides consistent community detection for a growing number of nodes and time duration. We also develop consistent and computationally efficient estimators for the model parameters. We demonstrate that our proposed CHIP model and estimation procedure scales to large networks with tens of thousands of nodes and provides superior fits than existing continuous-time network models on several real networks. 
    more » « less
  2. The stochastic block model (SBM) is one of the most widely used generative models for network data. Many continuous-time dynamic network models are built upon the same assumption as the SBM: edges or events between all pairs of nodes are conditionally independent given the block or community memberships, which prevents them from reproducing higher-order motifs such as triangles that are commonly observed in real networks. We propose the multivariate community Hawkes (MULCH) model, an extremely flexible community-based model for continuous-time networks that introduces dependence between node pairs using structured multivariate Hawkes processes. We fit the model using a spectral clustering and likelihood-based local refinement procedure. We find that our proposed MULCH model is far more accurate than existing models both for predictive and generative tasks. 
    more » « less
  3. Johnson, Kristin N.; Reyes, Carla L. (Ed.)
    Privacy regulation has traditionally been the remit of consumer protection, and privacy harm is cast as a contractual harm arising from the interpersonal exchanges between data subjects and data collectors. This frames surveillance of people by companies as primarily a consumer harm. In this article, we argue that the modern economy of personal data is better understood as an extension of the financial system. The data economy intersects with capital markets in ways that may increase systemic and systematic financial risks. We contribute a new regulatory approach to privacy harms: as a source of risk correlated across households, firms and the economy as a whole. We consider adapting tools from macroprudential regulations designed to mitigate financial crises to the market for personal data. We identify both promises and pitfalls to viewing individual privacy through the lens of the financial system. 
    more » « less
  4. We consider the problem of analyzing timestamped relational events between a set of entities, such as messages between users of an on-line social network. Such data are often analyzed using static or discrete-time network models, which discard a significant amount of information by aggregating events over time to form network snapshots. In this paper, we introduce a block point process model (BPPM) for continuous-time event-based dynamic networks. The BPPM is inspired by the well-known stochastic block model (SBM) for static networks. We show that networks generated by the BPPM follow an SBM in the limit of a growing number of nodes. We use this property to develop principled and efficient local search and variational inference procedures initialized by regularized spectral clustering. We fit BPPMs with exponential Hawkes processes to analyze several real network data sets, including a Facebook wall post network with over 3,500 nodes and 130,000 events. 
    more » « less
  5. Undirected, binary network data consist of indicators of symmetric relations between pairs of actors. Regression models of such data allow for the estimation of effects of exogenous covariates on the network and for prediction of unobserved data. Ideally, estimators of the regression parameters should account for the inherent dependencies among relations in the network that involve the same actor. To account for such dependencies, researchers have developed a host of latent variable network models; however, estimation of many latent variable network models is computationally onerous and which model is best to base inference upon may not be clear. We propose the probit exchangeable (PX) model for undirected binary network data that is based on an assumption of exchangeability, which is common to many of the latent variable network models in the literature. The PX model can represent the first two moments of any exchangeable network model. We leverage the EM algorithm to obtain an approximate maximum likelihood estimator of the PX model that is extremely computationally efficient. Using simulation studies, we demonstrate the improvement in estimation of regression coefficients of the proposed model over existing latent variable network models. In an analysis of purchases of politically aligned books, we demonstrate political polarization in purchase behavior and show that the proposed estimator significantly reduces runtime relative to estimators of latent variable network models, while maintaining predictive performance. 
    more » « less