Title: Distance distributions and inverse problems for metric measure spaces
Abstract

Applications in data science, shape analysis, and object classification frequently require comparison of probability distributions defined on different ambient spaces. To accomplish this, one requires a notion of distance on a given class of metric measure spaces, that is, compact metric spaces endowed with probability measures. Such distances are typically defined as comparisons between metric measure space invariants, such as distance distributions (also referred to as shape distributions, distance histograms, or shape contexts in the literature). Generally, distances defined in terms of distance distributions are actually pseudometrics, in that they may vanish when comparing nonisomorphic spaces. The goal of this paper is to set up a formal framework for assessing the discriminative power of distance distributions, that is, the extent to which these pseudometrics fail to define proper metrics. We formulate several precise inverse problems in terms of these invariants and answer them in several categories of metric measure spaces, including the category of plane curves, where we give a counterexample to the curve histogram conjecture of Brinkman and Olver, the categories of embedded and Riemannian manifolds, where we obtain sphere rigidity results, and the category of metric graphs, where we obtain a local injectivity result along the lines of classical work of Boutin and Kemper on point cloud configurations. The inverse problems are further contextualized by the introduction of a variant of the Gromov–Wasserstein distance on the space of metric measure spaces, which is inspired by the original Monge formulation of optimal transport.
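As an illustration of why distance distributions give only pseudometrics, the following minimal sketch computes the distance distribution of a finite point set on the real line with uniform weights. It is not taken from the paper: the function names `distance_distribution` and `are_isometric_on_line` are hypothetical, and the two six-point sets are a standard example of point sets with equal pairwise-distance multisets, chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def distance_distribution(points, weights=None):
    """Map each distance to the probability that two points drawn
    independently from the weighted point set lie at that distance."""
    n = len(points)
    if weights is None:
        # Assumption for this sketch: uniform probability measure.
        weights = [Fraction(1, n)] * n
    dist = Counter()
    for (x, wx), (y, wy) in product(zip(points, weights), repeat=2):
        dist[abs(x - y)] += wx * wy
    return dict(dist)


def are_isometric_on_line(a, b):
    """Finite subsets of the real line are isometric exactly when they
    agree after a translation, possibly combined with a reflection."""
    a0 = sorted(x - min(a) for x in a)
    b0 = sorted(x - min(b) for x in b)
    b_reflected = sorted(max(b) - x for x in b)
    return a0 == b0 or a0 == b_reflected


# Two six-point sets with identical distance distributions.
A = [0, 1, 4, 10, 12, 17]
B = [0, 1, 8, 11, 13, 17]

same_distribution = distance_distribution(A) == distance_distribution(B)
isometric = are_isometric_on_line(A, B)
print(same_distribution, isometric)  # prints: True False
```

Any distance built purely from this invariant assigns the pair `A`, `B` the value zero even though the two spaces are not isomorphic, which is the kind of failure of injectivity the abstract refers to.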

 
NSF-PAR ID:
10444384
Author(s) / Creator(s):
 ;  
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Studies in Applied Mathematics
Volume:
149
Issue:
4
ISSN:
0022-2526
Page Range / eLocation ID:
p. 943-1001
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    One-dimensional persistent homology is arguably the most important and heavily used computational tool in topological data analysis. Additional information can be extracted from datasets by studying multi-dimensional persistence modules and by utilizing cohomological ideas, e.g. the cohomological cup product. In this work, given a single parameter filtration, we investigate a certain 2-dimensional persistence module structure associated with persistent cohomology, where one parameter is the cup-length $\ell \ge 0$ and the other is the filtration parameter. This new persistence structure, called the persistent cup module, is induced by the cohomological cup product and adapted to the persistence setting. Furthermore, we show that this persistence structure is stable. By fixing the cup-length parameter $\ell$, we obtain a 1-dimensional persistence module, called the persistent $\ell$-cup module, and again show it is stable in the interleaving distance sense, and study their associated generalized persistence diagrams. In addition, we consider a generalized notion of a persistent invariant, which extends both the rank invariant (also referred to as persistent Betti number), Puuska’s rank invariant induced by epi-mono-preserving invariants of abelian categories, and the recently-defined persistent cup-length invariant, and we establish their stability. This generalized notion of persistent invariant also enables us to lift the Lyusternik-Schnirelmann (LS) category of topological spaces to a novel stable persistent invariant of filtrations, called the persistent LS-category invariant.

     
  2. We study the persistent homology of both functional data on compact topological spaces and structural data presented as compact metric measure spaces. One of our goals is to define persistent homology so as to capture primarily properties of the shape of a signal, eliminating otherwise highly persistent homology classes that may exist simply because of the nature of the domain on which the signal is defined. We investigate the stability of these invariants using metrics that downplay regions where signals are weak. The distance between two signals is small if they exhibit high similarity in regions where they are strong, regardless of the nature of their full domains, in particular allowing different homotopy types. Consistency and estimation of persistent homology of metric measure spaces from data are studied within this framework. We also apply the methodology to the construction of multi-scale topological descriptors for data on compact Riemannian manifolds via metric relaxations derived from the heat kernel. 
  3. The Gromov-Wasserstein (GW) formalism can be seen as a generalization of the optimal transport (OT) formalism for comparing two distributions associated with different metric spaces. It is a quadratic optimization problem, and solving it usually has computational costs that can rise sharply if the problem size exceeds a few hundred points. Recently, fast techniques based on entropy regularization have been developed to solve an approximation of the GW problem quickly. There are issues, however, with the numerical convergence of those regularized approximations to the true GW solution. To circumvent those issues, we introduce a novel strategy to solve the discrete GW problem using methods taken from statistical physics. We build a temperature-dependent free energy function that reflects the GW problem’s constraints. To account for possible differences of scales between the two metric spaces, we introduce a scaling factor s in the definition of the energy. From the extremum of the free energy, we derive a mapping between the two probability measures that are being compared, as well as a distance between those measures. This distance is equal to the GW distance when the temperature goes to zero. The optimal scaling factor itself is obtained by minimizing the free energy with respect to s. We illustrate our approach on the problem of comparing shapes defined by unstructured triangulations of their surfaces. We use several synthetic and “real life” datasets. We demonstrate the accuracy and automaticity of our approach in non-rigid registration of shapes. We provide numerical evidence that there is a strong correlation between the GW distances computed from low-resolution, surface-based representations of proteins and the analogous distances computed from atomistic models of the same proteins.
  4. There has been a long-standing interest in computing diverse solutions to optimization problems. In 1995 J. Krarup [28] posed the problem of finding k-edge disjoint Hamiltonian Circuits of minimum total weight, called the peripatetic salesman problem (PSP). Since then researchers have investigated the complexity of finding diverse solutions to spanning trees, paths, vertex covers, matchings, and more. Unlike the PSP that has a constraint on the total weight of the solutions, recent work has involved finding diverse solutions that are all optimal. However, sometimes the space of exact solutions may be too small to achieve sufficient diversity. Motivated by this, we initiate the study of obtaining sufficiently-diverse, yet approximately-optimal solutions to optimization problems. Formally, given an integer k, an approximation factor c, and an instance I of an optimization problem, we aim to obtain a set of k solutions to I that a) are all c approximately-optimal for I and b) maximize the diversity of the k solutions. Finding such solutions, therefore, requires a better understanding of the global landscape of the optimization function. Given a metric on the space of solutions, and the diversity measure as the sum of pairwise distances between solutions, we first provide a general reduction to an associated budget-constrained optimization (BCO) problem, where one objective function is to be optimized subject to a bound on the second objective function. We then prove that bi-approximations to the BCO can be used to give bi-approximations to the diverse approximately optimal solutions problem. As applications of our result, we present polynomial time approximation algorithms for several problems such as diverse c-approximate maximum matchings, shortest paths, global min-cut, and minimum weight bases of a matroid. The last result gives us diverse c-approximate minimum spanning trees, advancing a step towards achieving diverse c-approximate TSP tours.
We also explore the connection to the field of multiobjective optimization and show that the class of problems to which our result applies includes those for which the associated DUALRESTRICT problem defined by Papadimitriou and Yannakakis [35], and recently explored by Herzel et al. [26], can be solved in polynomial time.
  5. There has been a long-standing interest in computing diverse solutions to optimization problems. In 1995 J. Krarup [28] posed the problem of finding k-edge disjoint Hamiltonian Circuits of minimum total weight, called the peripatetic salesman problem (PSP). Since then researchers have investigated the complexity of finding diverse solutions to spanning trees, paths, vertex covers, matchings, and more. Unlike the PSP that has a constraint on the total weight of the solutions, recent work has involved finding diverse solutions that are all optimal. However, sometimes the space of exact solutions may be too small to achieve sufficient diversity. Motivated by this, we initiate the study of obtaining sufficiently-diverse, yet approximately-optimal solutions to optimization problems. Formally, given an integer k, an approximation factor c, and an instance I of an optimization problem, we aim to obtain a set of k solutions to I that a) are all c approximately-optimal for I and b) maximize the diversity of the k solutions. Finding such solutions, therefore, requires a better understanding of the global landscape of the optimization function. Given a metric on the space of solutions, and the diversity measure as the sum of pairwise distances between solutions, we first provide a general reduction to an associated budget-constrained optimization (BCO) problem, where one objective function is to be optimized subject to a bound on the second objective function. We then prove that bi-approximations to the BCO can be used to give bi-approximations to the diverse approximately optimal solutions problem. As applications of our result, we present polynomial time approximation algorithms for several problems such as diverse c-approximate maximum matchings, shortest paths, global min-cut, and minimum weight bases of a matroid. The last result gives us diverse c-approximate minimum spanning trees, advancing a step towards achieving diverse c-approximate TSP tours.
We also explore the connection to the field of multiobjective optimization and show that the class of problems to which our result applies includes those for which the associated DUALRESTRICT problem defined by Papadimitriou and Yannakakis [35], and recently explored by Herzel et al. [26], can be solved in polynomial time.