skip to main content

This content will become publicly available on August 1, 2022

Title: A Parallel Approximation Algorithm for Maximizing Submodular b-Matching
We design new serial and parallel approximation algorithms for computing a maximum weight b-matching in an edge-weighted graph with a submodular objective function. This problem is NP-hard; the new algorithms have approximation ratio 1/3, and are relaxations of the Greedy algorithm that rely only on local information in the graph, making them parallelizable. We have designed and implemented Local Lazy Greedy algorithms for both serial and parallel computers. We have applied the approximate submodular b-matching algorithm to assign tasks to processors in the computation of Fock matrices in quantum chemistry on parallel computers. The assignment seeks to reduce the run time by balancing the computational load on the processors and bounding the number of messages that each processor sends. We show that the new assignment of tasks to processors provides a four fold speedup over the currently used assignment in the NWChemEx software on 8000 processors on the Summit supercomputer at Oak Ridge National Lab.
Authors:
; ; ; ;
Editors:
Bender, M.; Gilbert, J.; Hendrickson, B.; Sullivan, B.
Award ID(s):
1637534
Publication Date:
NSF-PAR ID:
10300888
Journal Name:
Proceedings of the 2021 SIAM Conference on Applied and Computational Discrete Algorithms (ACDA21)
Volume:
1
Issue:
1
Page Range or eLocation-ID:
45-56
Sponsoring Org:
National Science Foundation
More Like this
  1. We describe a 3/2-approximation algorithm, \lse, for computing a b-edgecover of minimum weight in a graph with weights on the edges. The b-edgecover problem is a generalization of the better-known Edge Cover problem in graphs, where the objective is to choose a subset C of edges in the graph such that at least a specified number b(v) of edges in C are incident on each vertex v. In the weighted b-edgecover problem, we minimize the sum of the weights of the edges in C. We prove that the Locally Subdominant edge (LSE) algorithm computes the same b-edge cover as themore »one obtained by the Greedy algorithm for the problem. However, the Greedy algorithm requires edges to be sorted by their effective weights, and these weights need to be updated after each iteration. These requirements make the Greedy algorithm sequential and impractical for massive graphs. The LSE algorithm avoids the sorting step, and is amenable for parallelization. We implement the algorithm on a serial machine and compare its performance against a collection of approximation algorithms for the b-edge cover problem. Our results show that the algorithm is 3 to 5 times faster than the Greedy algorithm on a serial processor. The approximate edge covers obtained by the LSE algorithm have weights greater by at most 17% of the optimal weight for problems where we could compute the latter. We also investigate the relationship between the b-edge cover and the b-matching problems, show that the latter has a faster implementation since edge weights are static in this algorithm, and obtain a heuristic solution for the former from the latter.« less
  2. We describe a paradigm for designing parallel algorithms via approximation, and illustrate it on the b-edgecover problem. A b-edgecover of minimum weight in a graph is a subset $C$ of its edges such that at least a specified number $b(v)$ of edges in $C$ is incident on each vertex $v$, and the sum of the edge weights in $C$ is minimum. The Greedy algorithm and a variant, the LSE algorithm, provide $3/2$-approximation guarantees in the worst-case for this problem, but these algorithms have limited parallelism. Hence we design two new $2$-approximation algorithms with greater concurrency. The MCE algorithm reduces themore »computation of a b-edgecover to that of finding a b'-matching, by exploiting the relationship between these subgraphs in an approximation context. The LSE-NW is derived from the LSEalgorithm using static edge weights rather than dynamically computing effective edge weights. This relaxation gives LSE a worse approximation guarantee but makes it more amenable to parallelization. We prove that both the MCE and LSE-NW algorithms compute the same b-edgecover with at most twice the weight of the minimum weight edge cover. In practice, the $2$-approximation and $3/2$-approximation algorithms compute edge covers of weight within $10\%$ the optimal. We implement three of the approximation algorithms, MCE, LSE, and LSE-NW on shared memory multi-core machines, including an Intel Xeon and an IBM Power8 machine with 8 TB memory. The MCE algorithm is the fastest of these by an order of magnitude or more. It computes an edge cover in a graph with billions of edges in $20$ seconds using two hundred threads on the IBM Power8. We also show that the parallel depth and work can be bounded for the Suitor and b-Suitor algorithms when edge weights are random.« less
  3. We survey recent work on approximation algorithms for computing degree-constrained subgraphs in graphs and their applications in combinatorial scientific computing. The problems we consider include maximization versions of cardinality matching, edge-weighted matching, vertex-weighted matching and edge-weighted $b$ -matching, and minimization versions of weighted edge cover and $b$ -edge cover. Exact algorithms for these problems are impractical for massive graphs with several millions of edges. For each problem we discuss theoretical foundations, the design of several linear or near-linear time approximation algorithms, their implementations on serial and parallel computers, and applications. Our focus is on practical algorithms that yield good performancemore »on modern computer architectures with multiple threads and interconnected processors. We also include information about the software available for these problems.« less
  4. We consider how to generate graphs of arbitrary size whose chromatic numbers can be chosen (or are well-bounded) for testing graph coloring algorithms on parallel computers. For the distance-1 graph coloring problem, we identify three classes of graphs with this property. The first is the Erdős-Rényi random graph with prescribed expected degree, where the chromatic number is known with high probability. It is also known that the Greedy algorithm colors this graph using at most twice the number of colors as the chromatic number. The second is a random geometric graph embedded in hyperbolic space where the size of themore »maximum clique provides a tight lower bound on the chromatic number. The third is a deterministic graph described by Mycielski, where the graph is recursively constructed such that its chromatic number is known and increases with graph size, although the size of the maximum clique remains two. For Jacobian estimation, we bound the distance-2 chromatic number of random bipartite graphs by considering its equivalence to distance-1 coloring of an intersection graph. We use a “balls and bins” probabilistic analysis to establish a lower bound and an upper bound on the distance-2 chromatic number. The regimes of graph sizes and probabilities that we consider are chosen to suit the Jacobian estimation problem, where the number of columns and rows are asymptotically nearly equal, and have number of nonzeros linearly related to the number of columns. Computationally we verify the theoretical predictions and show that the graphs are often be colored optimally by the serial and parallel Greedy algorithms.« less
  5. We consider the maximum vertex-weighted matching problem (MVM), in which non-negative weights are assigned to the vertices of a graph, and the weight of a matching is the sum of the weights of the matched vertices. Although exact algorithms for MVM are faster than exact algorithms for the maximum edge-weighted matching problem, there are graphs on which these exact algorithms could take hundreds of hours. For a natural number k, we design a k/(k + 1)approximation algorithm for MVM on non-bipartite graphs that updates the matching along certain short paths in the graph: either augmenting paths of length at mostmore »2k + 1 or weight-increasing paths of length at most 2k. The choice of k = 2 leads to a 2/3-approximation algorithm that computes nearly optimal weights fast. This algorithm could be initialized with a 2/3-approximate maximum cardinality matching to reduce its runtime in practice. A 1/2-approximation algorithm may be obtained using k = 1, which is faster than the 2/3-approximation algorithm but it computes lower weights. The 2/3-approximation algorithm has time complexity O(Δ2m) while the time complexity of the 1/2-approximation algorithm is O(Δm), where m is the number of edges and Δ is the maximum degree of a vertex. Results from our serial implementations show that on average the 1/2-approximation algorithm runs faster than the Suitor algorithm (currently the fastest 1/2-approximation algorithm) while the 2/3-approximation algorithm runs as fast as the Suitor algorithm but obtains higher weights for the matching. One advantage of the proposed algorithms is that they are well-suited for parallel implementation since they can process vertices to match in any order. The 1/2- and 2/3-approximation algorithms have been implemented on a shared memory parallel computer, and both approximation algorithms obtain good speedups, while the former algorithm runs faster on average than the parallel Suitor algorithm. Care is needed to design the parallel algorithm to avoid cyclic waits, and we prove that it is live-lock free.« less