skip to main content


Title: An O(n5/4) Time ∊-Approximation Algorithm for RMS Matching in a Plane
The 2-Wasserstein distance (or RMS distance) is a useful measure of similarity between probability distributions with exciting applications in machine learning. For discrete distributions, the problem of computing this distance can be expressed in terms of finding a minimum-cost perfect matching on a complete bipartite graph given by two multisets of points A, B ⊂ ℝ2, with |A| = |B| = n, where the ground distance between any two points is the squared Euclidean distance between them. Although there is a near-linear time relative ∊-approximation algorithm for the case where the ground distance is Euclidean (Sharathkumar and Agarwal, JACM 2020), all existing relative ∊-approximation algorithms for the RMS distance take Ω(n3/2) time. This is primarily because, unlike Euclidean distance, squared Euclidean distance is not a metric. In this paper, for the RMS distance, we present a new ∊-approximation algorithm that runs in O(n^5/4 poly{log n, 1/∊}) time. Our algorithm is inspired by a recent approach for finding a minimum-cost perfect matching in bipartite planar graphs (Asathulla et al, TALG 2020). Their algorithm depends heavily on the existence of sublinear sized vertex separators as well as shortest path data structures that require planarity. Surprisingly, we are able to design a similar algorithm for a complete geometric graph that is far from planar and does not have any vertex separators. Central components of our algorithm include a quadtree-based distance that approximates the squared Euclidean distance and a data structure that supports both Hungarian search and augmentation in sublinear time.  more » « less
Award ID(s):
1909171
NSF-PAR ID:
10232551
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Proceedings of the Annual ACMSIAM Symposium on Discrete Algorithms
ISSN:
1557-9468
Page Range / eLocation ID:
869 - 888
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Given a weighted planar bipartite graph G ( A ∪ B , E ) where each edge has an integer edge cost, we give an Õ( n 4/3 log nC ) time algorithm to compute minimum-cost perfect matching; here C is the maximum edge cost in the graph. The previous best-known planarity exploiting algorithm has a running time of O ( n 3/2 log n ) and is achieved by using planar separators (Lipton and Tarjan ’80). Our algorithm is based on the bit-scaling paradigm (Gabow and Tarjan ’89). For each scale, our algorithm first executes O ( n 1/3 ) iterations of Gabow and Tarjan’s algorithm in O ( n 4/3 ) time leaving only O ( n 2/3 ) vertices unmatched. Next, it constructs a compressed residual graph H with O ( n 2/3 ) vertices and O ( n ) edges. This is achieved by using an r -division of the planar graph G with r = n 2/3 . For each partition of the r -division, there is an edge between two vertices of H if and only if they are connected by a directed path inside the partition. Using existing efficient shortest-path data structures, the remaining O ( n 2/3 ) vertices are matched by iteratively computing a minimum-cost augmenting path, each taking Õ( n 2/3 ) time. Augmentation changes the residual graph, so the algorithm updates the compressed representation for each partition affected by the change in Õ( n 2/3 ) time. We bound the total number of affected partitions over all the augmenting paths by O ( n 2/3 log n ). Therefore, the total time taken by the algorithm is Õ( n 4/3 ). 
    more » « less
  2. We consider the problem of designing sublinear time algorithms for estimating the cost of minimum] metric traveling salesman (TSP) tour. Specifically, given access to a n × n distance matrix D that specifies pairwise distances between n points, the goal is to estimate the TSP cost by performing only sublinear (in the size of D) queries. For the closely related problem of estimating the weight of a metric minimum spanning tree (MST), it is known that for any epsilon > 0, there exists an O^~(n/epsilon^O(1))-time algorithm that returns a (1+epsilon)-approximate estimate of the MST cost. This result immediately implies an O^~(n/epsilon^O(1)) time algorithm to estimate the TSP cost to within a (2 + epsilon) factor for any epsilon > 0. However, no o(n^2)-time algorithms are known to approximate metric TSP to a factor that is strictly better than 2. On the other hand, there were also no known barriers that rule out existence of (1 + epsilon)-approximate estimation algorithms for metric TSP with O^~ (n) time for any fixed epsilon > 0. In this paper, we make progress on both algorithms and lower bounds for estimating metric TSP cost. On the algorithmic side, we first consider the graphic TSP problem where the metric D corresponds to shortest path distances in a connected unweighted undirected graph. We show that there exists an O^~(n) time algorithm that estimates the cost of graphic TSP to within a factor of (2 − epsilon_0) for some epsilon_0 > 0. This is the first sublinear cost estimation algorithm for graphic TSP that achieves an approximation factor less than 2. We also consider another well-studied special case of metric TSP, namely, (1, 2)-TSP where all distances are either 1 or 2, and give an O^~(n ^ 1.5) time algorithm to estimate optimal cost to within a factor of 1.625. Our estimation algorithms for graphic TSP as well as for (1, 2)-TSP naturally lend themselves to O^~(n) space streaming algorithms that give an 11/6-approximation for graphic TSP and a 1.625-approximation for (1, 2)-TSP. These results motivate the natural question if analogously to metric MST, for any epsilon > 0, (1 + epsilon)-approximate estimates can be obtained for graphic TSP and (1, 2)-TSP using O^~ (n) queries. We answer this question in the negative – there exists an epsilon_0 > 0, such that any algorithm that estimates the cost of graphic TSP ((1, 2)-TSP) to within a (1 + epsilon_0)-factor, necessarily requires (n^2) queries. This lower bound result highlights a sharp separation between the metric MST and metric TSP problems. Similarly to many classical approximation algorithms for TSP, our sublinear time estimation algorithms utilize subroutines for estimating the size of a maximum matching in the underlying graph. We show that this is not merely an artifact of our approach, and that for any epsilon > 0, any algorithm that estimates the cost of graphic TSP or (1, 2)-TSP to within a (1 + epsilon)-factor, can also be used to estimate the size of a maximum matching in a bipartite graph to within an epsilon n additive error. This connection allows us to translate known lower bounds for matching size estimation in various models to similar lower bounds for metric TSP cost estimation. 
    more » « less
  3. null (Ed.)
    We present an m^{4/3+o(1)} log W -time algorithm for solving the minimum cost flow problem in graphs with unit capacity, where W is the maximum absolute value of any edge weight. For sparse graphs, this improves over the best known running time for this problem and, by well-known reductions, also implies improved running times for the shortest path problem with negative weights, minimum cost bipartite b-matching when |b|_1 = O(m), and recovers the running time of the currently fastest algorithm for maximum flow in graphs with unit capacities (Liu-Sidford, 2020). Our algorithm relies on developing an interior point method–based framework which acts on the space of circulations in the underlying graph. From the combinatorial point of view, this framework can be viewed as iteratively improving the cost of a suboptimal solution by pushing flow around circulations. These circulations are derived by computing a regularized version of the standard Newton step, which is partially inspired by previous work on the unit-capacity maximum flow problem (Liu-Sidford, 2019), and subsequently refined based on the very recent progress on this problem (Liu-Sidford, 2020). The resulting step problem can then be computed efficiently using the recent work on l_p-norm minimizing flows (Kyng-Peng-Sachdeva-Wang, 2019). We obtain our faster algorithm by combining this new step primitive with a customized preconditioning method, which aims to ensure that the graph on which these circulations are computed has sufficiently large conductance. 
    more » « less
  4. null (Ed.)
    We present an $m^{4/3}+o(1) \log W$ -time algorithm for solving the minimum cost flow problem in graphs with unit capacity, where W is the maximum absolute value of any edge weight. For sparse graphs, this improves over the best known running time for this problem and, by well-known reductions, also implies improved running times for the shortest path problem with negative weights, minimum cost bipartite $b$-matching when $\|b\|_1 = O(m)$, and recovers the running time of the currently fastest algorithm for maximum flow in graphs with unit capacities (Liu-Sidford, 2020). Our algorithm relies on developing an interior point method–based framework which acts on the space of circulations in the underlying graph. From the combinatorial point of view, this framework can be viewed as iteratively improving the cost of a suboptimal solution by pushing flow around circulations. These circulations are derived by computing a regularized version of the standard Newton step, which is partially inspired by previous work on the unit-capacity maximum flow problem (Liu-Sidford, 2019), and subsequently refined based on the very re- cent progress on this problem (Liu-Sidford, 2020). The resulting step problem can then be computed efficiently using the recent work on $l_p$-norm minimizing flows (Kyng-Peng-Sachdeva- Wang, 2019). We obtain our faster algorithm by combining this new step primitive with a customized preconditioning method, which aims to ensure that the graph on which these circulations are computed has sufficiently large conductance. 
    more » « less
  5. Mikołaj Boja´nczyk, Emanuela Merelli (Ed.)
    We initiate a systematic study of algorithms that are both differentially-private and run in sublinear time for several problems in which the goal is to estimate natural graph parameters. Our main result is a differentially-private $(1+\rho)$-approximation algorithm for the problem of computing the average degree of a graph, for every $\rho>0$. The running time of the algorithm is roughly the same (for sparse graphs) as its non-private version proposed by Goldreich and Ron (Sublinear Algorithms, 2005). We also obtain the first differentially-private sublinear-time approximation algorithms for the maximum matching size and the minimum vertex cover size of a graph. An overarching technique we employ is the notion of \emph{coupled global sensitivity} of randomized algorithms. Related variants of this notion of sensitivity have been used in the literature in ad-hoc ways. Here we formalize the notion and develop it as a unifying framework for privacy analysis of randomized approximation algorithms. 
    more » « less