Given a sequence of possibly correlated randomly generated graphs, we address the problem of detecting changes on their underlying distribution. To this end, we will consider Random Dot Product Graphs (RDPGs), a simple yet rich family of random graphs that subsume Erdös-Rényi and Stochastic Block Model ensembles as particular cases. In RDPGs each node has an associated latent vector and inner products between these vectors dictate the edge existence probabilities. Previous works have mostly focused on the undirected and unweighted graph case, a gap we aim to close here. We first extend the RDPG model to accommodate directed and weighted graphs, a contribution whose interest transcends change-point detection (CPD). A statistic derived from the nodes' estimated latent vectors (i.e., embeddings) facilitates adoption of scalable geometric CPD techniques. The resulting algorithm yields interpretable results and facilitates pinpointing which (and when) nodes are acting differently. Numerical tests on simulated data as well as on a real dataset of graphs stemming from a Wi-Fi network corroborate the effectiveness of the proposed CPD method.
more »
« less
Online Change Point Detection for Random Dot Product Graphs
Given a sequence of random graphs, we address the problem of online monitoring and detection of changes in the underlying data distribution. To this end, we adopt the Random Dot Product Graph (RDPG) model which postulates each node has an associated latent vector, and inner products between these vectors dictate the edge formation probabilities. Existing approaches for graph change-point detection (CPD) rely either on extensive computation, or they store and process the entire observed time series. In this paper we consider the cumulative sum of a judicious monitoring function, which quantifies the discrepancy between the streaming graph observations and the nominal model. This reference distribution is inferred via spectral embeddings of the first few graphs in the sequence, and the monitoring function can be updated in an efficient, online fashion. We characterize the distribution of this running statistic, allowing us to select appropriate thresholding parameters that guarantee error-rate control. The end result is a lightweight online CPD algorithm, with a proven capability to flag distribution shifts in the arriving graphs. The novel method is tested on both synthetic and real network data, corroborating its effectiveness in quickly detecting changes in the input graph sequence.
more »
« less
- PAR ID:
- 10321467
- Date Published:
- Journal Name:
- 2021 55th Asilomar Conference on Signals, Systems, and Computers
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Random graphs (or networks) have gained a significant increase of interest due to its popularity in modeling and simulating many complex real-world systems. Degree sequence is one of the most important aspects of these systems. Random graphs with a given degree sequence can capture many characteristics like dependent edges and non-binomial degree distribution that are absent in many classical random graph models such as the Erdöos-Rényi graph model. In addition, they have important applications in uniform sampling of random graphs, counting the number of graphs having the same degree sequence, as well as in string theory, random matrix theory, and matching theory. In this paper, we present an OpenMP-based shared-memory parallel algorithm for generating a random graph with a prescribed degree sequence, which achieves a speedup of 20.4 with 32 cores. We also present a comparative study of several structural properties of the random graphs generated by our algorithm with that of the real-world graphs and random graphs generated by other popular methods. One of the steps in our parallel algorithm requires checking the Erdöos-Gallai characterization, i.e., whether there exists a graph obeying the given degree sequence, in parallel. This paper presents a non-trivial parallel algorithm for checking the Erdöos-Gallai characterization, which achieves a speedup of 23 with 32 cores.more » « less
-
Abstract Consider a set of n vertices, where each vertex has a location in $$\mathbb{R}^d$$ that is sampled uniformly from the unit cube in $$\mathbb{R}^d$$ , and a weight associated to it. Construct a random graph by placing edges independently for each vertex pair with a probability that is a function of the distance between the locations and the vertex weights. Under appropriate integrability assumptions on the edge probabilities that imply sparseness of the model, after appropriately blowing up the locations, we prove that the local limit of this random graph sequence is the (countably) infinite random graph on $$\mathbb{R}^d$$ with vertex locations given by a homogeneous Poisson point process, having weights which are independent and identically distributed copies of limiting vertex weights. Our set-up covers many sparse geometric random graph models from the literature, including geometric inhomogeneous random graphs (GIRGs), hyperbolic random graphs, continuum scale-free percolation, and weight-dependent random connection models. We prove that the limiting degree distribution is mixed Poisson and the typical degree sequence is uniformly integrable, and we obtain convergence results on various measures of clustering in our graphs as a consequence of local convergence. Finally, as a byproduct of our argument, we prove a doubly logarithmic lower bound on typical distances in this general setting.more » « less
-
As the developers of malware continuously evolve their attacks and infection methods, so to must bot detection methods advance. Graph Neural Networks (GNNs) have emerged as a promising detection method. However, in most cases communications graphs reflecting bot-infected networks are plagued with class imbalance and a high level of heterophily. Graph oversampling techniques employed to tackle class imbalance on graphs have drawbacks, such as introducing noisy topological structures or exacerbating heterophily within the graph. Out-of-distribution detection (ODD) is considered as an alternative solution to address data imbalance issues, but when applied to graphs, it assumes that the underlying graph structure does not interfere with the learning of data distributions. In this paper, we present the first application of ODD methods for bot detection in a network. We propose a new energy-based ODD model, which surpasses existing ODD methods, including those tailored for ODD on graph data, and effectively mitigates performance degradation caused by graph heterophily. We substantiate our claims through extensive experiments on the TON IoT dataset, which comprises real captured bot data. The experimental results demonstrate that our model achieves state-of-the-art performance in bot detection on graphs with high graph heterophily and extreme class imbalance.more » « less
-
Rasmussen, M.; Reich, S.; Zaslavski, A. J. (Ed.)We study patterns observed right after the loss of stability of mixing in the Kuramoto model of coupled phase oscillators with random intrinsic frequencies on large graphs, which can also be random. We show that the emergent patterns are formed via two independent mechanisms determined by the shape of the frequency distribution and the limiting structure of the underlying graph sequence. Specifically, we identify two nested eigenvalue problems whose eigenvectors (unstable modes) determine the structure of the nascent patterns. The analysis is illustrated with the results of the numerical experiments with the Kuramoto model with unimodal and bimodal frequency distributions on certain graphs.more » « less
An official website of the United States government

