- Award ID(s):
- 1916071
- NSF-PAR ID:
- 10286568
- Editor(s):
- Banerjee, A.; Fukumizu, K.
- Date Published:
- Journal Name:
- Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS) 2021
- Volume:
- 30
- Page Range / eLocation ID:
- 2485 - 2493
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Given a metric space ℳ = (X,δ), a weighted graph G over X is a metric t-spanner of ℳ if for every u,v ∈ X, δ(u,v) ≤ δ_G(u,v) ≤ t⋅ δ(u,v), where δ_G is the shortest path metric in G. In this paper, we construct spanners for finite sets in metric spaces in the online setting. Here, we are given a sequence of points (s₁, …, s_n), where the points are presented one at a time (i.e., after i steps, we have seen S_i = {s₁, … , s_i}). The algorithm is allowed to add edges to the spanner when a new point arrives, however, it is not allowed to remove any edge from the spanner. The goal is to maintain a t-spanner G_i for S_i for all i, while minimizing the number of edges, and their total weight. Under the L₂-norm in ℝ^d for arbitrary constant d ∈ ℕ, we present an online (1+ε)-spanner algorithm with competitive ratio O_d(ε^{-d} log n), improving the previous bound of O_d(ε^{-(d+1)}log n). Moreover, the spanner maintained by the algorithm has O_d(ε^{1-d}log ε^{-1})⋅ n edges, almost matching the (offline) optimal bound of O_d(ε^{1-d})⋅ n. In the plane, a tighter analysis of the same algorithm provides an almost quadratic improvement of the competitive ratio to O(ε^{-3/2}logε^{-1}log n), by comparing the online spanner with an instance-optimal spanner directly, bypassing the comparison to an MST (i.e., lightness). As a counterpart, we design a sequence of points that yields a Ω_d(ε^{-d}) lower bound for the competitive ratio for online (1+ε)-spanner algorithms in ℝ^d under the L₁-norm. Then we turn our attention to online spanners in general metrics. Note that, it is not possible to obtain a spanner with stretch less than 3 with a subquadratic number of edges, even in the offline setting, for general metrics. We analyze an online version of the celebrated greedy spanner algorithm, dubbed ordered greedy. With stretch factor t = (2k-1)(1+ε) for k ≥ 2 and ε ∈ (0,1), we show that it maintains a spanner with O(ε^{-1}logε^{-1})⋅ n^{1+1/k} edges and O(ε^{-1}n^{1/k}log² n) lightness for a sequence of n points in a metric space. We show that these bounds cannot be significantly improved, by introducing an instance that achieves an Ω(1/k⋅ n^{1/k}) competitive ratio on both sparsity and lightness. Furthermore, we establish the trade-off among stretch, number of edges and lightness for points in ultrametrics, showing that one can maintain a (2+ε)-spanner for ultrametrics with O(ε^{-1}logε^{-1})⋅ n edges and O(ε^{-2}) lightness.more » « less
-
Given a metric space ℳ = (X,δ), a weighted graph G over X is a metric t-spanner of ℳ if for every u,v ∈ X, δ(u,v) ≤ δ_G(u,v) ≤ t⋅ δ(u,v), where δ_G is the shortest path metric in G. In this paper, we construct spanners for finite sets in metric spaces in the online setting. Here, we are given a sequence of points (s₁, …, s_n), where the points are presented one at a time (i.e., after i steps, we have seen S_i = {s₁, … , s_i}). The algorithm is allowed to add edges to the spanner when a new point arrives, however, it is not allowed to remove any edge from the spanner. The goal is to maintain a t-spanner G_i for S_i for all i, while minimizing the number of edges, and their total weight. Under the L₂-norm in ℝ^d for arbitrary constant d ∈ ℕ, we present an online (1+ε)-spanner algorithm with competitive ratio O_d(ε^{-d} log n), improving the previous bound of O_d(ε^{-(d+1)}log n). Moreover, the spanner maintained by the algorithm has O_d(ε^{1-d}log ε^{-1})⋅ n edges, almost matching the (offline) optimal bound of O_d(ε^{1-d})⋅ n. In the plane, a tighter analysis of the same algorithm provides an almost quadratic improvement of the competitive ratio to O(ε^{-3/2}logε^{-1}log n), by comparing the online spanner with an instance-optimal spanner directly, bypassing the comparison to an MST (i.e., lightness). As a counterpart, we design a sequence of points that yields a Ω_d(ε^{-d}) lower bound for the competitive ratio for online (1+ε)-spanner algorithms in ℝ^d under the L₁-norm. Then we turn our attention to online spanners in general metrics. Note that, it is not possible to obtain a spanner with stretch less than 3 with a subquadratic number of edges, even in the offline setting, for general metrics. We analyze an online version of the celebrated greedy spanner algorithm, dubbed ordered greedy. With stretch factor t = (2k-1)(1+ε) for k ≥ 2 and ε ∈ (0,1), we show that it maintains a spanner with O(ε^{-1}logε^{-1})⋅ n^{1+1/k} edges and O(ε^{-1}n^{1/k}log² n) lightness for a sequence of n points in a metric space. We show that these bounds cannot be significantly improved, by introducing an instance that achieves an Ω(1/k⋅ n^{1/k}) competitive ratio on both sparsity and lightness. Furthermore, we establish the trade-off among stretch, number of edges and lightness for points in ultrametrics, showing that one can maintain a (2+ε)-spanner for ultrametrics with O(ε^{-1}logε^{-1})⋅ n edges and O(ε^{-2}) lightness.more » « less
-
Evaluating AMR parsing accuracy involves comparing pairs of AMR graphs. The major evaluation metric, SMATCH (Cai and Knight,2013), searches for one-to-one mappings between the nodes of two AMRs with a greedy hill-climbing algorithm, which leads to search errors. We propose SEMBLEU, a robust metric that extends BLEU (Papineni et al., 2002) to AMRs. It does not suffer from search errors and considers non-local correspondences in addition to local ones. SEMBLEU is fully content-driven and punishes situations where a system’s output does not preserve most information from the input. Preliminary experiments on both sentence and corpus levels show that SEMBLEU has slightly higher consistency with human judgments than SMATCH. Our code is available athttp://github.com/freesunshine0316/sembleu.more » « less
-
Abstract Applications in data science, shape analysis, and object classification frequently require comparison of probability distributions defined on different ambient spaces. To accomplish this, one requires a notion of distance on a given class of metric measure spaces—that is, compact metric spaces endowed with probability measures. Such distances are typically defined as comparisons between metric measure space invariants, such as distance distributions (also referred to as shape distributions, distance histograms, or shape contexts in the literature). Generally, distances defined in terms of distance distributions are actually pseudometrics, in that they may vanish when comparing nonisomorphic spaces. The goal of this paper is to set up a formal framework for assessing the discrimininative power of distance distributions, that is, the extent to which these pseudometrics fail to define proper metrics. We formulate several precise inverse problems in terms of these invariants and answer them in several categories of metric measure spaces, including the category of plane curves, where we give a counterexample to the curve histogram conjecture of Brinkman and Olver, the categories of embedded and Riemannian manifolds, where we obtain sphere rigidity results, and the category of metric graphs, where we obtain a local injectivity result along the lines of classical work of Boutin and Kemper on point cloud configurations. The inverse problems are further contextualized by the introduction of a variant of the Gromov–Wasserstein distance on the space of metric measure spaces, which is inspired by the original Monge formulation of optimal transport.
-
We study the persistent homology of both functional data on compact topological spaces and structural data presented as compact metric measure spaces. One of our goals is to define persistent homology so as to capture primarily properties of the shape of a signal, eliminating otherwise highly persistent homology classes that may exist simply because of the nature of the domain on which the signal is defined. We investigate the stability of these invariants using metrics that downplay regions where signals are weak. The distance between two signals is small if they exhibit high similarity in regions where they are strong, regardless of the nature of their full domains, in particular allowing different homotopy types. Consistency and estimation of persistent homology of metric measure spaces from data are studied within this framework. We also apply the methodology to the construction of multi-scale topological descriptors for data on compact Riemannian manifolds via metric relaxations derived from the heat kernel.more » « less