skip to main content


Title: Computing the Gromov-Wasserstein Distance between Two Surface Meshes Using Optimal Transport
The Gromov-Wasserstein (GW) formalism can be seen as a generalization of the optimal transport (OT) formalism for comparing two distributions associated with different metric spaces. It is a quadratic optimization problem and solving it usually has computational costs that can rise sharply if the problem size exceeds a few hundred points. Recently fast techniques based on entropy regularization have being developed to solve an approximation of the GW problem quickly. There are issues, however, with the numerical convergence of those regularized approximations to the true GW solution. To circumvent those issues, we introduce a novel strategy to solve the discrete GW problem using methods taken from statistical physics. We build a temperature-dependent free energy function that reflects the GW problem’s constraints. To account for possible differences of scales between the two metric spaces, we introduce a scaling factor s in the definition of the energy. From the extremum of the free energy, we derive a mapping between the two probability measures that are being compared, as well as a distance between those measures. This distance is equal to the GW distance when the temperature goes to zero. The optimal scaling factor itself is obtained by minimizing the free energy with respect to s. We illustrate our approach on the problem of comparing shapes defined by unstructured triangulations of their surfaces. We use several synthetic and “real life” datasets. We demonstrate the accuracy and automaticity of our approach in non-rigid registration of shapes. We provide numerical evidence that there is a strong correlation between the GW distances computed from low-resolution, surface-based representations of proteins and the analogous distances computed from atomistic models of the same proteins.  more » « less
Award ID(s):
1760485
NSF-PAR ID:
10465354
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Algorithms
Volume:
16
Issue:
3
ISSN:
1999-4893
Page Range / eLocation ID:
131
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Applications in data science, shape analysis, and object classification frequently require comparison of probability distributions defined on different ambient spaces. To accomplish this, one requires a notion of distance on a given class of metric measure spaces—that is, compact metric spaces endowed with probability measures. Such distances are typically defined as comparisons between metric measure space invariants, such as distance distributions (also referred to as shape distributions, distance histograms, or shape contexts in the literature). Generally, distances defined in terms of distance distributions are actually pseudometrics, in that they may vanish when comparing nonisomorphic spaces. The goal of this paper is to set up a formal framework for assessing the discrimininative power of distance distributions, that is, the extent to which these pseudometrics fail to define proper metrics. We formulate several precise inverse problems in terms of these invariants and answer them in several categories of metric measure spaces, including the category of plane curves, where we give a counterexample to the curve histogram conjecture of Brinkman and Olver, the categories of embedded and Riemannian manifolds, where we obtain sphere rigidity results, and the category of metric graphs, where we obtain a local injectivity result along the lines of classical work of Boutin and Kemper on point cloud configurations. The inverse problems are further contextualized by the introduction of a variant of the Gromov–Wasserstein distance on the space of metric measure spaces, which is inspired by the original Monge formulation of optimal transport.

     
    more » « less
  2. Oliva, Gabriele (Ed.)
    We define a new family of similarity and distance measures on graphs, and explore their theoretical properties in comparison to conventional distance metrics. These measures are defined by the solution(s) to an optimization problem which attempts find a map minimizing the discrepancy between two graph Laplacian exponential matrices, under norm-preserving and sparsity constraints. Variants of the distance metric are introduced to consider such optimized maps under sparsity constraints as well as fixed time-scaling between the two Laplacians. The objective function of this optimization is multimodal and has discontinuous slope, and is hence difficult for univariate optimizers to solve. We demonstrate a novel procedure for efficiently calculating these optima for two of our distance measure variants. We present numerical experiments demonstrating that (a) upper bounds of our distance metrics can be used to distinguish between lineages of related graphs; (b) our procedure is faster at finding the required optima, by as much as a factor of 10 3 ; and (c) the upper bounds satisfy the triangle inequality exactly under some assumptions and approximately under others. We also derive an upper bound for the distance between two graph products, in terms of the distance between the two pairs of factors. Additionally, we present several possible applications, including the construction of infinite “graph limits” by means of Cauchy sequences of graphs related to one another by our distance measure. 
    more » « less
  3. null (Ed.)
    Self-assembly of proteins on lipid membranes underlies many important processes in cell biology, such as, exo- and endo-cytosis, assembly of viruses, etc. An attractive force that can cause self-assembly is mediated by membrane thickness interactions between proteins. The free energy profile associated with this attractive force is a result of the overlap of thickness deformation fields around the proteins which can be calculated from the solution of a boundary value problem. Yet, the time scales over which two inclusions coalesce has not been explored, even though the evolution of particle concentrations on membranes has been modeled using phase-field approaches. In this paper we compute this time scale as a function of the initial distance between two inclusions by viewing their coalescence as a first passage time problem. The mean first passage time is computed using Langevin dynamics and a partial differential equation, and both methods are found to be in excellent agreement. Inclusions of three different shapes are studied and it is found that for two inclusions separated by about hundred nanometers the time to coalescence is hundreds of milliseconds irrespective of shape. An efficient computation of the interaction energy of inclusions is central to our work. We compute it using a finite difference technique and show that our results are in excellent agreement with those from a previously proposed semi-analytical method based on Fourier–Bessel series. The computational strategies described in this paper could potentially lead to efficient methods to explore the kinetics of self-assembly of proteins on lipid membranes. 
    more » « less
  4. For unweighted graphs, finding isometric embeddings of a graph G is closely related to decompositions of G into Cartesian products of smaller graphs. When G is isomorphic to a Cartesian graph product, we call the factors of this product a factorization of G. When G is isomorphic to an isometric subgraph of a Cartesian graph product, we call those factors a pseudofactorization of G. Prior work has shown that an unweighted graph’s pseudofactorization can be used to generate a canonical isometric embedding into a product of the smallest possible pseudofactors. However, for arbitrary weighted graphs, which represent a richer variety of metric spaces, methods for finding isometric embeddings or determining their existence remain elusive, and indeed pseudofactorization and factorization have not previously been extended to this context. In this work, we address the problem of finding the factorization and pseudofactorization of a weighted graph G, where G satisfies the property that every edge constitutes a shortest path between its endpoints. We term such graphs minimal graphs, noting that every graph can be made minimal by removing edges not affecting its path metric. We generalize pseudofactorization and factorization to minimal graphs and develop new proof techniques that extend the previously proposed algorithms due to Graham and Winkler [Graham and Winkler, ’85] and Feder [Feder, ’92] for pseudofactorization and factorization of unweighted graphs. We show that any n-vertex, m-edge graph with positive integer edge weights can be factored in O(m2) time, plus the time to find all pairs shortest paths (APSP) distances in a weighted graph, resulting in an overall running time of O(m2+n2 log log n) time. We also show that a pseudofactorization for such a graph can be computed in O(mn) time, plus the time to solve APSP, resulting in an O(mn + n2 log log n) running time. 
    more » « less
  5. Abstract

    An important hypothesis in animal cell biology is that an animal’s acute exercise regimen affects some subcellular structures, for example mitochondrial morphology, in its muscle tissue. This paper investigates that hypothesis using a nonparametric metric-based energy test for comparing mitochondrial populations. It explores two shape spaces—the elastic shape space and Kendall shape space—and five corresponding shape metrics on these spaces. The results overwhelmingly point to the statistical significance of the effect of an acute exercise regimen on the shape of SS-type mitochondria. Although past studies based on specific morphological features derived from mitochondria failed to detect this significance. In this analysis, a potentially significant factor is cell membership and a k-sample generalization of the energy test—the DISCO test shows that the cell effect is indeed significant. The energy test cannot be applied directly due to the hierarchical structure of the distance matrix. We propose a compression method to remove the significant cell effect while testing for the exercise effect. With this compression, only the elastic scaled metric shows statistical significance of the exercise factor in this more complicated scenario. This result is because the elastic-scaled metric is more sensitive to subtle changes in mitochondrial shapes caused by acute exercise.

     
    more » « less