skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, May 23 until 2:00 AM ET on Friday, May 24 due to maintenance. We apologize for the inconvenience.


Title: Computation of the Sample Frechet Mean for Sets of Large Graphs with Applications to Regression
To characterize the location (mean, median) of a set of graphs, one needs a notion of centrality that is adapted to metric spaces, since graph sets are not Euclidean spaces. A standard approach is to consider the Fr ́echet mean. In this work, we equip a set of graph with the pseudometric defined by the l2 norm between the eigenvalues of their respective adjacency matrix. Unlike the edit distance, this pseudometric reveals structural changes at multiple scales, and is well adapted to studying various statistical problems for graph-valued data. We describe an algorithm to compute an approximation to the sample Fr ́echet mean of a set of undirected unweighted graphs with a fixed size using this pseudometric.  more » « less
Award ID(s):
1815971
NSF-PAR ID:
10318701
Author(s) / Creator(s):
;
Editor(s):
Banerjee, Arindam; Zhou, Zhi-Hua
Date Published:
Journal Name:
SIAM International Conference on Data Mining (SDM 2022)
ISSN:
978-1-611977-17-2
Page Range / eLocation ID:
379-387
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Benito, Rosa Maria ; Cherifi, Chantal ; Cherifi, Hocine ; Moro, Esteban ; Rocha, Luis M. (Ed.)
    To characterize the “average” of a set of graphs, one can compute the sample Fr ́echet mean. We prove the following result: if we use the Hamming distance to compute distances between graphs, then the Fr ́echet mean of an ensemble of inhomogeneous random graphs is obtained by thresholding the expected adjacency matrix: an edge exists between the vertices i and j in the Fr ́echet mean graph if and only if the corresponding entry of the expected adjacency matrix is greater than 1/2. We prove that the result also holds for the sample Fr ́echet mean when the expected adjacency matrix is replaced with the sample mean adjacency matrix. This novel theoretical result has some significant practical consequences; for instance, the Fr ́echet mean of an ensemble of sparse inhomogeneous random graphs is the empty graph. 
    more » « less
  2. We examine topological properties of spaces of paths and graphs mapped to $\R^d$ under the Fr\'echet distance. We show that these spaces are path-connected if the map is either continuous or an immersion. If the map is an embedding, we show that the space of paths is path-connected, while the space of graphs only maintains this property in dimensions four or higher. 
    more » « less
  3. To characterize the location (mean, median) of a set of graphs, one needs a notion of centrality that has been adapted to metric spaces. A standard approach is to consider the Fréchet mean. In practice, computing the Fréchet mean for sets of large graphs presents many computational issues. In this work, we suggest a method that may be used to compute the Fréchet mean for sets of graphs which is metric independent. We show that the technique proposed can be used to determine the Fréchet mean when considering the Hamming distance or a distance defined by the difference between the spectra of the adjacency matrices of the graphs. 
    more » « less
  4. Abstract

    To characterize the location (mean, median) of a set of graphs, one needs a notion of centrality that has been adapted to metric spaces. A standard approach is to consider the Fréchet mean. In practice, computing the Fréchet mean for sets of large graphs presents many computational issues. In this work, we suggest a method that may be used to compute the Fréchet mean for sets of graphs which is metric independent. We show that the technique proposed can be used to determine the Fréchet mean when considering the Hamming distance or a distance defined by the difference between the spectra of the adjacency matrices of the graphs.

     
    more » « less
  5. It is common for data structures such as images and shapes of 2D objects to be represented as points on a manifold. The utility of a mechanism to produce sanitized differentially private estimates from such data is intimately linked to how compatible it is with the underlying structure and geometry of the space. In particular, as recently shown, utility of the Laplace mechanism on a positively curved manifold, such as Kendall’s 2D shape space, is significantly influenced by the curvature. Focusing on the problem of sanitizing the Fr\'echet mean of a sample of points on a manifold, we exploit the characterization of the mean as the minimizer of an objective function comprised of the sum of squared distances and develop a K-norm gradient mechanism on Riemannian manifolds that favors values that produce gradients close to the the zero of the objective function. For the case of positively curved manifolds, we describe how using the gradient of the squared distance function offers better control over sensitivity than the Laplace mechanism, and demonstrate this numerically on a dataset of shapes of corpus callosa. Further illustrations of the mechanism’s utility on a sphere and the manifold of symmetric positive definite matrices are also presented. 
    more » « less