skip to main content

Title: Approximation algorithms in combinatorial scientific computing
We survey recent work on approximation algorithms for computing degree-constrained subgraphs in graphs and their applications in combinatorial scientific computing. The problems we consider include maximization versions of cardinality matching, edge-weighted matching, vertex-weighted matching and edge-weighted $b$ -matching, and minimization versions of weighted edge cover and $b$ -edge cover. Exact algorithms for these problems are impractical for massive graphs with several millions of edges. For each problem we discuss theoretical foundations, the design of several linear or near-linear time approximation algorithms, their implementations on serial and parallel computers, and applications. Our focus is on practical algorithms that yield good performance on modern computer architectures with multiple threads and interconnected processors. We also include information about the software available for these problems.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Acta Numerica
Page Range / eLocation ID:
541 to 633
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We describe a 3/2-approximation algorithm, \lse, for computing a b-edgecover of minimum weight in a graph with weights on the edges. The b-edgecover problem is a generalization of the better-known Edge Cover problem in graphs, where the objective is to choose a subset C of edges in the graph such that at least a specified number b(v) of edges in C are incident on each vertex v. In the weighted b-edgecover problem, we minimize the sum of the weights of the edges in C. We prove that the Locally Subdominant edge (LSE) algorithm computes the same b-edge cover as the one obtained by the Greedy algorithm for the problem. However, the Greedy algorithm requires edges to be sorted by their effective weights, and these weights need to be updated after each iteration. These requirements make the Greedy algorithm sequential and impractical for massive graphs. The LSE algorithm avoids the sorting step, and is amenable for parallelization. We implement the algorithm on a serial machine and compare its performance against a collection of approximation algorithms for the b-edge cover problem. Our results show that the algorithm is 3 to 5 times faster than the Greedy algorithm on a serial processor. The approximate edge covers obtained by the LSE algorithm have weights greater by at most 17% of the optimal weight for problems where we could compute the latter. We also investigate the relationship between the b-edge cover and the b-matching problems, show that the latter has a faster implementation since edge weights are static in this algorithm, and obtain a heuristic solution for the former from the latter. 
    more » « less
  2. Kernel matrices, as well as weighted graphs represented by them, are ubiquitous objects in machine learning, statistics and other related fields. The main drawback of using kernel methods (learning and inference using kernel matrices) is efficiency – given n input points, most kernel-based algorithms need to materialize the full n × n kernel matrix before performing any subsequent computation, thus incurring Ω(n^2) runtime. Breaking this quadratic barrier for various problems has therefore, been a subject of extensive research efforts. We break the quadratic barrier and obtain subquadratic time algorithms for several fundamental linear-algebraic and graph processing primitives, including approximating the top eigenvalue and eigenvector, spectral sparsification, solving lin- ear systems, local clustering, low-rank approximation, arboricity estimation and counting weighted triangles. We build on the recently developed Kernel Density Estimation framework, which (after preprocessing in time subquadratic in n) can return estimates of row/column sums of the kernel matrix. In particular, we de- velop efficient reductions from weighted vertex and weighted edge sampling on kernel graphs, simulating random walks on kernel graphs, and importance sampling on matrices to Kernel Density Estimation and show that we can generate samples from these distributions in sublinear (in the support of the distribution) time. Our reductions are the central ingredient in each of our applications and we believe they may be of independent interest. We empirically demonstrate the efficacy of our algorithms on low-rank approximation (LRA) and spectral sparsi- fication, where we observe a 9x decrease in the number of kernel evaluations over baselines for LRA and a 41x reduction in the graph size for spectral sparsification. 
    more » « less
  3. null (Ed.)
    The 2-Wasserstein distance (or RMS distance) is a useful measure of similarity between probability distributions with exciting applications in machine learning. For discrete distributions, the problem of computing this distance can be expressed in terms of finding a minimum-cost perfect matching on a complete bipartite graph given by two multisets of points A, B ⊂ ℝ2, with |A| = |B| = n, where the ground distance between any two points is the squared Euclidean distance between them. Although there is a near-linear time relative ∊-approximation algorithm for the case where the ground distance is Euclidean (Sharathkumar and Agarwal, JACM 2020), all existing relative ∊-approximation algorithms for the RMS distance take Ω(n3/2) time. This is primarily because, unlike Euclidean distance, squared Euclidean distance is not a metric. In this paper, for the RMS distance, we present a new ∊-approximation algorithm that runs in O(n^5/4 poly{log n, 1/∊}) time. Our algorithm is inspired by a recent approach for finding a minimum-cost perfect matching in bipartite planar graphs (Asathulla et al, TALG 2020). Their algorithm depends heavily on the existence of sublinear sized vertex separators as well as shortest path data structures that require planarity. Surprisingly, we are able to design a similar algorithm for a complete geometric graph that is far from planar and does not have any vertex separators. Central components of our algorithm include a quadtree-based distance that approximates the squared Euclidean distance and a data structure that supports both Hungarian search and augmentation in sublinear time. 
    more » « less
  4. In this paper, we generalize the recently studied stochastic matching problem to more accurately model a significant medical process, kidney exchange, and several other applications. Up until now the stochastic matching problem that has been studied was as follows: given a graph G= (V,E), each edge is included in the realized sub-graph of G independently with probability pe, and the goal is to find a degree-bounded sub-graph Q of G that has an expected maximum matching that approximates the expected maximum matching of G. This model does not account for possibilities of vertex dropouts, which can be found in several applications, e.g. in kidney exchange when donors or patients opt out of the exchange process as well as in online freelancing and online dating when online profiles are found to be faked. Thus, we will study a more generalized model of stochastic matching in which vertices and edges are both realized independently with some probabilities pv, pe, respectively, which more accurately fits important applications than the previously studied model. We will discuss the first algorithms and analysis for this generalization of the stochastic matching model and prove that they achieve good approximation ratios. In particular, we show that the approximation factor of a natural algorithm for this problem is at least 0.6568 in unweighted graphs, and 1/2+ε in weighted graphs for some constant ε >0. We further improve our result for unweighted graphs to 2/3 using edge degree constrained sub-graphs (EDCS). 
    more » « less
  5. Expander graphs play a central role in graph theory and algorithms. With a number of powerful algorithmic tools developed around them, such as the Cut-Matching game, expander pruning, expander decomposition, and algorithms for decremental All-Pairs Shortest Paths (APSP) in expanders, to name just a few, the use of expanders in the design of graph algorithms has become ubiquitous. Specific applications of interest to us are fast deterministic algorithms for cut problems in static graphs, and algorithms for dynamic distance-based graph problems, such as APSP. Unfortunately, the use of expanders in these settings incurs a number of drawbacks. For example, the best currently known algorithm for decremental APSP in constant-degree expanders can only achieve a (log n) O(1/ 2 ) -approximation with n 1+O( ) total update time for any . All currently known algorithms for the Cut Player in the Cut-Matching game are either randomized, or provide rather weak guarantees: expansion 1/(log n) 1/ with running time n 1+O( ) . This, in turn, leads to somewhat weak algorithmic guarantees for several central cut problems: the best current almost linear time deterministic algorithms for Sparsest Cut, Lowest Conductance Cut, and Balanced Cut can only achieve approximation factor (log n) ω(1). Lastly, when relying on expanders in distancebased problems, such as dynamic APSP, via current methods, it seems inevitable that one has to settle for approximation factors that are at least Ω(log n). In contrast, we do not have any negative results that rule out a factor-5 approximation with near-linear total update time. In this paper we propose the use of well-connected graphs, and introduce a new algorithmic toolkit for such graphs that, in a sense, mirrors the above mentioned algorithmic tools for expanders. One of these new tools is the Distanced Matching game, an analogue of the Cut-Matching game for well-connected graphs. We demonstrate the power of these new tools by obtaining better results for several of the problems mentioned above. First, we design an algorithm for decremental APSP in expanders with significantly better guarantees: in a constant-degree expander, the algorithm achieves (log n) 1+o(1)-approximation, with total update time n 1+o(1). We also obtain a deterministic algorithm for the Cut Player in the Cut-Matching game that achieves expansion 1 (log n) 5+o(1) in time n 1+o(1), deterministic almost linear-time algorithms for Sparsest Cut, Lowest-Conductance Cut, and Minimum Balanced Cut with approximation factors O(poly log n), as well as improved deterministic algorithm for Expander Decomposition. We believe that the use of well-connected graphs instead of expanders in various dynamic distance-based problems (such as APSP in general graphs) has the potential of providing much stronger guarantees, since we are no longer necessarily restricted to superlogarithmic approximation factors. 
    more » « less