Title: The Fréchet Mean of Inhomogeneous Random Graphs
To characterize the “average” of a set of graphs, one can compute the sample Fréchet mean. We prove the following result: if we use the Hamming distance to compute distances between graphs, then the Fréchet mean of an ensemble of inhomogeneous random graphs is obtained by thresholding the expected adjacency matrix: an edge exists between the vertices i and j in the Fréchet mean graph if and only if the corresponding entry of the expected adjacency matrix is greater than 1/2. We prove that the result also holds for the sample Fréchet mean when the expected adjacency matrix is replaced with the sample mean adjacency matrix. This novel theoretical result has some significant practical consequences; for instance, the Fréchet mean of an ensemble of sparse inhomogeneous random graphs is the empty graph.
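The thresholding rule stated in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `threshold_mean_graph` and `from_edges` are hypothetical, and representing each graph as a symmetric 0/1 adjacency matrix stored as a list of lists is an assumption made for illustration. The sketch only applies the rule (keep an edge when the sample mean adjacency entry exceeds 1/2); it does not verify the optimality claim proved in the paper.

```python
from itertools import combinations


def threshold_mean_graph(graphs):
    """Return the graph obtained by thresholding the sample mean adjacency matrix.

    Each graph is a symmetric 0/1 adjacency matrix (list of lists) on the
    same vertex set; this representation is an assumption of the sketch.
    """
    count = len(graphs)
    n = len(graphs[0])
    result = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        # Entry (i, j) of the sample mean adjacency matrix is votes / count.
        votes = sum(g[i][j] for g in graphs)
        # Keep the edge only if that entry is strictly greater than 1/2.
        if 2 * votes > count:
            result[i][j] = result[j][i] = 1
    return result


def from_edges(n, edges):
    """Hypothetical helper: build an adjacency matrix from an edge list."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1
    return a


# Edge (0, 1) appears in all three graphs; the other edges appear once each.
sample = [
    from_edges(3, [(0, 1), (1, 2)]),
    from_edges(3, [(0, 1)]),
    from_edges(3, [(0, 1), (0, 2)]),
]
print(threshold_mean_graph(sample))  # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

# A "sparse" sample: every edge appears in fewer than half of the graphs.
sparse = [
    from_edges(3, [(0, 1)]),
    from_edges(3, [(0, 2)]),
    from_edges(3, [(1, 2)]),
]
print(threshold_mean_graph(sparse))  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

The second sample mirrors the practical consequence noted in the abstract: when no edge is present in more than half of the graphs, the thresholded graph is empty.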
Award ID(s):
1815971
PAR ID:
10318695
Author(s) / Creator(s):
Editor(s):
Benito, Rosa Maria; Cherifi, Chantal; Cherifi, Hocine; Moro, Esteban; Rocha, Luis M.
Date Published:
Journal Name:
Complex Networks & Their Applications X
ISSN:
1860-949X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Banerjee, Arindam; Zhou, Zhi-Hua (Ed.)
    To characterize the location (mean, median) of a set of graphs, one needs a notion of centrality that is adapted to metric spaces, since graph sets are not Euclidean spaces. A standard approach is to consider the Fréchet mean. In this work, we equip a set of graphs with the pseudometric defined by the ℓ2 norm between the eigenvalues of their respective adjacency matrices. Unlike the edit distance, this pseudometric reveals structural changes at multiple scales, and is well adapted to studying various statistical problems for graph-valued data. We describe an algorithm to compute an approximation to the sample Fréchet mean of a set of undirected unweighted graphs with a fixed size using this pseudometric.
  2. Ribeiro, Pedro; Silva, Fernando; Mendes, José Fernando; Laureano, Rosário (Ed.)
    The availability of large datasets composed of graphs creates an unprecedented need to invent novel tools in statistical learning for graph-valued random variables. To characterize the average of a sample of graphs, one can compute the sample Fréchet mean and median graphs. In this paper, we address the following foundational question: does a mean or median graph inherit the structural properties of the graphs in the sample? An important graph property is the edge density; we establish that edge density is a hereditary property, which can be transmitted from a graph sample to its sample Fréchet mean or median graphs, irrespective of the method used to estimate the mean or the median. Because of the prominence of the Fréchet mean in graph-valued machine learning, this novel theoretical result has some significant practical consequences.
  3. We develop a unified approach to bounding the largest and smallest singular values of an inhomogeneous random rectangular matrix, based on the non-backtracking operator and the Ihara-Bass formula for general random Hermitian matrices with a bipartite block structure. We obtain probabilistic upper (respectively, lower) bounds for the largest (respectively, smallest) singular values of a large rectangular random matrix X. These bounds are given in terms of the maximal and minimal 2-norms of the rows and columns of the variance profile of X. The proofs involve finding probabilistic upper bounds on the spectral radius of an associated non-backtracking matrix B. The two-sided bounds can be applied to the centered adjacency matrix of sparse inhomogeneous Erdős-Rényi bipartite graphs for a wide range of sparsity, down to criticality. In particular, for Erdős-Rényi bipartite graphs G(n,m, p) with p = ω(log n)/n, and m/n→ y ∈ (0,1), our sharp bounds imply that there are no outliers outside the support of the Marčenko-Pastur law almost surely. This result extends the Bai-Yin theorem to sparse rectangular random matrices.
  4. We examine topological properties of spaces of paths and graphs mapped to $\mathbb{R}^d$ under the Fréchet distance. We show that these spaces are path-connected if the map is either continuous or an immersion. If the map is an embedding, we show that the space of paths is path-connected, while the space of graphs only maintains this property in dimensions four or higher.
  5. We study lower bounds for the problem of approximating a one dimensional distribution given (noisy) measurements of its moments. We show that there are distributions on $[-1,1]$ that cannot be approximated to accuracy $\epsilon$ in Wasserstein-1 distance even if we know all of their moments to multiplicative accuracy $(1\pm2^{-\Omega(1/\epsilon)})$; this result matches an upper bound of Kong and Valiant [Annals of Statistics, 2017]. To obtain our result, we provide a hard instance involving distributions induced by the eigenvalue spectra of carefully constructed graph adjacency matrices. Efficiently approximating such spectra in Wasserstein-1 distance is a well-studied algorithmic problem, and a recent result of Cohen-Steiner et al. [KDD 2018] gives a method based on accurately approximating spectral moments using $2^{O(1/\epsilon)}$ random walks initiated at uniformly random nodes in the graph. As a strengthening of our main result, we show that improving the dependence on $1/\epsilon$ in this result would require a new algorithmic approach. Specifically, no algorithm can compute an $\epsilon$-accurate approximation to the spectrum of a normalized graph adjacency matrix with constant probability, even when given the transcript of $2^{\Omega(1/\epsilon)}$ random walks of length $2^{\Omega(1/\epsilon)}$ started at random nodes.