skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Composition of nested embeddings with an application to outlier removal
We study the design of embeddings into Euclidean space with outliers. Given a metric space (X, d) and aninteger k, the goal is to embed all but k points in X (called the “outliers”) into ℓ2 with the smallest possibledistortion c. Finding the optimal distortion c for a given outlier set size k, or alternately the smallest k fora given target distortion c are both NP-hard problems. In fact, it is UGC-hard to approximate k to withina factor smaller than 2 even when the metric sans outliers is isometrically embeddable into ℓ2. We considerbi-criteria approximations. Our main result is a polynomial time algorithm that approximates the outlier setsize to within an O(log2 k) factor and the distortion to within a constant factor.The main technical component in our result is an approach for constructing Lipschitz extensions ofembeddings into Banach spaces (such as ℓp spaces). We consider a stronger version of Lipschitz extensionthat we call a nested composition of embeddings: given a low distortion embedding of a subset S of the metricspace X, our goal is to extend this embedding to all of X such that the distortion over S is preserved, whereasthe distortion over the remaining pairs of points in X is bounded by a function of the size of X \ S. Priorwork on Lipschitz extension considers settings where the size of X is potentially much larger than that of Sand the expansion bounds depend on |S|. In our setting, the set S is nearly all of X and the remaining setX \ S, a.k.a. the outliers, is small. We achieve an expansion bound that is polylogarithmic in |X \ S|.  more » « less
Award ID(s):
2217069
PAR ID:
10521373
Author(s) / Creator(s):
;
Publisher / Repository:
SIAM
Date Published:
ISBN:
978-1-61197-791-2
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We introduce and study the notion of *an outer bi-Lipschitz extension* of a map between Euclidean spaces. The notion is a natural analogue of the notion of *a Lipschitz extension* of a Lipschitz map. We show that for every map f there exists an outer bi-Lipschitz extension f′ whose distortion is greater than that of f by at most a constant factor. This result can be seen as a counterpart of the classic Kirszbraun theorem for outer bi-Lipschitz extensions. We also study outer bi-Lipschitz extensions of near-isometric maps and show upper and lower bounds for them. Then, we present applications of our results to prioritized and terminal dimension reduction problems, described next. We prove a *prioritized* variant of the Johnson–Lindenstrauss lemma: given a set of points X⊂ ℝd of size N and a permutation (”priority ranking”) of X, there exists an embedding f of X into ℝO(logN) with distortion O(loglogN) such that the point of rank j has only O(log3 + ε j) non-zero coordinates – more specifically, all but the first O(log3+ε j) coordinates are equal to 0; the distortion of f restricted to the first j points (according to the ranking) is at most O(loglogj). The result makes a progress towards answering an open question by Elkin, Filtser, and Neiman about prioritized dimension reductions. We prove that given a set X of N points in ℜd, there exists a *terminal* dimension reduction embedding of ℝd into ℝd′, where d′ = O(logN/ε4), which preserves distances ||x−y|| between points x∈ X and y ∈ ℝd, up to a multiplicative factor of 1 ± ε. This improves a recent result by Elkin, Filtser, and Neiman. The dimension reductions that we obtain are nonlinear, and this nonlinearity is necessary. 
    more » « less
  2. Mulzer, Wolfgang; Phillips, Jeff M (Ed.)
    Metric spaces (X, d) are ubiquitous objects in mathematics and computer science that allow for capturing pairwise distance relationships d(x, y) between points x, y ∈ X. Because of this, it is natural to ask what useful generalizations there are of metric spaces for capturing "k-wise distance relationships" d(x_1, …, x_k) among points x_1, …, x_k ∈ X for k > 2. To that end, Gähler (Math. Nachr., 1963) (and perhaps others even earlier) defined k-metric spaces, which generalize metric spaces, and most notably generalize the triangle inequality d(x₁, x₂) ≤ d(x₁, y) + d(y, x₂) to the "simplex inequality" d(x_1, …, x_k) ≤ ∑_{i=1}^k d(x_1, …, x_{i-1}, y, x_{i+1}, …, x_k). (The definition holds for any fixed k ≥ 2, and a 2-metric space is just a (standard) metric space.) In this work, we introduce strong k-metric spaces, k-metric spaces that satisfy a topological condition stronger than the simplex inequality, which makes them "behave nicely." We also introduce coboundary k-metrics, which generalize 𝓁_p metrics (and in fact all finite metric spaces induced by norms) and minimum bounding chain k-metrics, which generalize shortest path metrics (and capture all strong k-metrics). Using these definitions, we prove analogs of a number of fundamental results about embedding finite metric spaces including Fréchet embedding (isometric embedding into 𝓁_∞) and isometric embedding of all tree metrics into 𝓁₁. We also study relationships between families of (strong) k-metrics, and show that natural quantities, like simplex volume, are strong k-metrics. 
    more » « less
  3. In this paper, we generalize a bi-Lipschitz extension result of David and Semmes from Euclidean spaces to complete metric measure spaces with controlled geometry (Ahlfors regularity and supporting a Poincaré inequality). In particular, we find sharp conditions on metric measure spaces X so that any bi-Lipschitz embedding of a subset of the real line into X extends to a bi-Lipschitz embedding of the whole line. Along the way, we prove that if the complement of an open subset Y of X has small Assouad dimension, then it is a uniform domain. Finally, we prove a quantitative approximation of continua in X by bi-Lipschitz curves. 
    more » « less
  4. Abstract Given a Banach space X and a real number α ≥ 1, we write: (1) D ( X ) ≤ α if, for any locally finite metric space A , all finite subsets of which admit bilipschitz embeddings into X with distortions ≤ C , the space A itself admits a bilipschitz embedding into X with distortion ≤ α ⋅ C ; (2) D ( X ) = α + if, for every ϵ > 0, the condition D ( X ) ≤ α + ϵ holds, while D ( X ) ≤ α does not; (3) D ( X ) ≤ α + if D ( X ) = α + or D ( X ) ≤ α. It is known that D ( X ) is bounded by a universal constant, but the available estimates for this constant are rather large. The following results have been proved in this work: (1) D ((⊕ n =1 ∞ X n ) p ) ≤ 1 + for every nested family of finite-dimensional Banach spaces { X n } n =1 ∞ and every 1 ≤ p ≤ ∞. (2) D ((⊕ n =1 ∞ ℓ ∞ n ) p ) = 1 + for 1 < p < ∞. (3) D ( X ) ≤ 4 + for every Banach space X with no nontrivial cotype. Statement (3) is a strengthening of the Baudier–Lancien result (2008). 
    more » « less
  5. In this paper we present new proofs of the non-embeddability of countably branching trees into Banach spaces satisfying property beta_p and of countably branching diamonds into Banach spaces which are l_p-asymptotic midpoint uniformly convex (p-AMUC) for p>1. These proofs are entirely metric in nature and are inspired by previous work of Jiří Matoušek. In addition, using this metric method, we succeed in extending these results to metric spaces satisfying certain embedding obstruction inequalities. Finally, we give Tessera-type lower bounds on the compression for a class of Lipschitz embeddings of the countably branching trees into Banach spaces containing l_p-asymptotic models for p>=1. 
    more » « less