Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
The popular Random Dot Product Graph (RDPG) generative model postulates that each node has an associated (latent) vector, and the probability of existence of an edge between two nodes is their inner-product (with variants to consider directed and weighted graphs). In any case, the latent vectors may be estimated through a spectral decomposition of the adjacency matrix, the so-called Adjacency Spectral Embedding (ASE). Assume we are monitoring a stream of graphs and the objective is to track the latent vectors. Examples include recommender systems or monitoring of a wireless network. It is clear that performing the ASE of each graph separately may result in a prohibitive computation load. Furthermore, the invariance to rotations of the inner product complicates comparing the latent vectors at different time-steps. By considering the minimization problem underlying ASE, we develop an iterative algorithm that updates the latent vectors' estimation as new graphs from the stream arrive. Differently to other proposals, our method does not accumulate errors and thus does not requires periodically re-computing the spectral decomposition. Furthermore, the pragmatic setting where nodes leave or join the graph (e.g. a new product in the recommender system) can be accommodated as well. Our code is available at https://github.com/marfiori/efficient-ASEmore » « less
-
The Random Dot Product Graph (RDPG) is a popular generative graph model for relational data. RDPGs postulate there exist latent positions for each node, and specifies the edge formation probabilities via the inner product of the corresponding latent vectors. The embedding task of estimating these latent positions from observed graphs is usually posed as a non-convex matrix factorization problem. The workhorse Adjacency Spectral Embedding offers an approximate solution obtained via the eigendecomposition of the adjacency matrix, which enjoys solid statistical guarantees but can be computationally intensive and is formally solving a surrogate problem. In this paper, we bring to bear recent non-convex optimization advances and demonstrate their impact to RDPG inference. We develop first-order gradient descent methods to better solve the original optimization problem, and to accommodate broader network embedding applications in an organic way. The effectiveness of the resulting graph representation learning framework is demonstrated on both synthetic and real data. We show the algorithms are scalable, robust to missing network data, and can track the latent positions over time when the graphs are acquired in a streaming fashion.more » « less
-
Given a sequence of random graphs, we address the problem of online monitoring and detection of changes in the underlying data distribution. To this end, we adopt the Random Dot Product Graph (RDPG) model which postulates each node has an associated latent vector, and inner products between these vectors dictate the edge formation probabilities. Existing approaches for graph change-point detection (CPD) rely either on extensive computation, or they store and process the entire observed time series. In this paper we consider the cumulative sum of a judicious monitoring function, which quantifies the discrepancy between the streaming graph observations and the nominal model. This reference distribution is inferred via spectral embeddings of the first few graphs in the sequence, and the monitoring function can be updated in an efficient, online fashion. We characterize the distribution of this running statistic, allowing us to select appropriate thresholding parameters that guarantee error-rate control. The end result is a lightweight online CPD algorithm, with a proven capability to flag distribution shifts in the arriving graphs. The novel method is tested on both synthetic and real network data, corroborating its effectiveness in quickly detecting changes in the input graph sequence.more » « less
-
Given a sequence of possibly correlated randomly generated graphs, we address the problem of detecting changes on their underlying distribution. To this end, we will consider Random Dot Product Graphs (RDPGs), a simple yet rich family of random graphs that subsume Erdös-Rényi and Stochastic Block Model ensembles as particular cases. In RDPGs each node has an associated latent vector and inner products between these vectors dictate the edge existence probabilities. Previous works have mostly focused on the undirected and unweighted graph case, a gap we aim to close here. We first extend the RDPG model to accommodate directed and weighted graphs, a contribution whose interest transcends change-point detection (CPD). A statistic derived from the nodes' estimated latent vectors (i.e., embeddings) facilitates adoption of scalable geometric CPD techniques. The resulting algorithm yields interpretable results and facilitates pinpointing which (and when) nodes are acting differently. Numerical tests on simulated data as well as on a real dataset of graphs stemming from a Wi-Fi network corroborate the effectiveness of the proposed CPD method.more » « less
An official website of the United States government
