skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, May 23 until 2:00 AM ET on Friday, May 24 due to maintenance. We apologize for the inconvenience.


Title: Parallel Algorithms through Approximation: b-Edge cover
We describe a paradigm for designing parallel algorithms via approximation, and illustrate it on the b-edgecover problem. A b-edgecover of minimum weight in a graph is a subset $C$ of its edges such that at least a specified number $b(v)$ of edges in $C$ is incident on each vertex $v$, and the sum of the edge weights in $C$ is minimum. The Greedy algorithm and a variant, the LSE algorithm, provide $3/2$-approximation guarantees in the worst-case for this problem, but these algorithms have limited parallelism. Hence we design two new $2$-approximation algorithms with greater concurrency. The MCE algorithm reduces the computation of a b-edgecover to that of finding a b'-matching, by exploiting the relationship between these subgraphs in an approximation context. The LSE-NW is derived from the LSEalgorithm using static edge weights rather than dynamically computing effective edge weights. This relaxation gives LSE a worse approximation guarantee but makes it more amenable to parallelization. We prove that both the MCE and LSE-NW algorithms compute the same b-edgecover with at most twice the weight of the minimum weight edge cover. In practice, the $2$-approximation and $3/2$-approximation algorithms compute edge covers of weight within $10\%$ the optimal. We implement three of the approximation algorithms, MCE, LSE, and LSE-NW on shared memory multi-core machines, including an Intel Xeon and an IBM Power8 machine with 8 TB memory. The MCE algorithm is the fastest of these by an order of magnitude or more. It computes an edge cover in a graph with billions of edges in $20$ seconds using two hundred threads on the IBM Power8. We also show that the parallel depth and work can be bounded for the Suitor and b-Suitor algorithms when edge weights are random.  more » « less
Award ID(s):
1637534
NSF-PAR ID:
10062971
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2018 IEEE International Parallel and Distributed Processing Symposium
Page Range / eLocation ID:
22-33
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We describe a 3/2-approximation algorithm, \lse, for computing a b-edgecover of minimum weight in a graph with weights on the edges. The b-edgecover problem is a generalization of the better-known Edge Cover problem in graphs, where the objective is to choose a subset C of edges in the graph such that at least a specified number b(v) of edges in C are incident on each vertex v. In the weighted b-edgecover problem, we minimize the sum of the weights of the edges in C. We prove that the Locally Subdominant edge (LSE) algorithm computes the same b-edge cover as the one obtained by the Greedy algorithm for the problem. However, the Greedy algorithm requires edges to be sorted by their effective weights, and these weights need to be updated after each iteration. These requirements make the Greedy algorithm sequential and impractical for massive graphs. The LSE algorithm avoids the sorting step, and is amenable for parallelization. We implement the algorithm on a serial machine and compare its performance against a collection of approximation algorithms for the b-edge cover problem. Our results show that the algorithm is 3 to 5 times faster than the Greedy algorithm on a serial processor. The approximate edge covers obtained by the LSE algorithm have weights greater by at most 17% of the optimal weight for problems where we could compute the latter. We also investigate the relationship between the b-edge cover and the b-matching problems, show that the latter has a faster implementation since edge weights are static in this algorithm, and obtain a heuristic solution for the former from the latter. 
    more » « less
  2. We consider the maximum vertex-weighted matching problem (MVM), in which non-negative weights are assigned to the vertices of a graph, and the weight of a matching is the sum of the weights of the matched vertices. Although exact algorithms for MVM are faster than exact algorithms for the maximum edge-weighted matching problem, there are graphs on which these exact algorithms could take hundreds of hours. For a natural number k, we design a k/(k + 1)approximation algorithm for MVM on non-bipartite graphs that updates the matching along certain short paths in the graph: either augmenting paths of length at most 2k + 1 or weight-increasing paths of length at most 2k. The choice of k = 2 leads to a 2/3-approximation algorithm that computes nearly optimal weights fast. This algorithm could be initialized with a 2/3-approximate maximum cardinality matching to reduce its runtime in practice. A 1/2-approximation algorithm may be obtained using k = 1, which is faster than the 2/3-approximation algorithm but it computes lower weights. The 2/3-approximation algorithm has time complexity O(Δ2m) while the time complexity of the 1/2-approximation algorithm is O(Δm), where m is the number of edges and Δ is the maximum degree of a vertex. Results from our serial implementations show that on average the 1/2-approximation algorithm runs faster than the Suitor algorithm (currently the fastest 1/2-approximation algorithm) while the 2/3-approximation algorithm runs as fast as the Suitor algorithm but obtains higher weights for the matching. One advantage of the proposed algorithms is that they are well-suited for parallel implementation since they can process vertices to match in any order. The 1/2- and 2/3-approximation algorithms have been implemented on a shared memory parallel computer, and both approximation algorithms obtain good speedups, while the former algorithm runs faster on average than the parallel Suitor algorithm. Care is needed to design the parallel algorithm to avoid cyclic waits, and we prove that it is live-lock free. 
    more » « less
  3. A b-matching is a subset of edges M such that at most b(v) edges in M are incident on each vertex v, where b(v) is specified. We present a distributed-memory parallel algorithm, \bsuitor, that computes a b-matching with more than half the maximum weight in a graph with weights on the edges. The approximation algorithm is designed to have high concurrency and low time complexity. We organize the implementation of the algorithm in terms of asynchronous super-steps that combine computation and communication, and balance the computational work and frequency of communication to obtain high performance. Since the performance of the b-suitor algorithm is strongly influenced by communication, we present several strategies to reduce the communication volume. We implement the algorithm using a hybrid strategy where inter-node communication uses MPI and intra-node computation is done with OpenMP threads. We demonstrate strong and weak scaling of b-suitor up to 16,000 processors on two supercomputers at NERSC. We compute a b-matching in a graph with 2 billion edges in under 4 seconds using 16,000 processors. 
    more » « less
  4. We present a weighted approach to compute a maximum cardinality matching in an arbitrary bipartite graph. Our main result is a new algorithm that takes as input a weighted bipartite graph G(A cup B,E) with edge weights of 0 or 1. Let w <= n be an upper bound on the weight of any matching in G. Consider the subgraph induced by all the edges of G with a weight 0. Suppose every connected component in this subgraph has O(r) vertices and O(mr/n) edges. We present an algorithm to compute a maximum cardinality matching in G in O~(m(sqrt{w} + sqrt{r} + wr/n)) time. When all the edge weights are 1 (symmetrically when all weights are 0), our algorithm will be identical to the well-known Hopcroft-Karp (HK) algorithm, which runs in O(m sqrt{n}) time. However, if we can carefully assign weights of 0 and 1 on its edges such that both w and r are sub-linear in n and wr=O(n^{gamma}) for gamma < 3/2, then we can compute maximum cardinality matching in G in o(m sqrt{n}) time. Using our algorithm, we obtain a new O~(n^{4/3}/epsilon^4) time algorithm to compute an epsilon-approximate bottleneck matching of A,B subsetR^2 and an 1/(epsilon^{O(d)}}n^{1+(d-1)/(2d-1)}) poly log n time algorithm for computing epsilon-approximate bottleneck matching in d-dimensions. All previous algorithms take Omega(n^{3/2}) time. Given any graph G(A cup B,E) that has an easily computable balanced vertex separator for every subgraph G'(V',E') of size |V'|^{delta}, for delta in [1/2,1), we can apply our algorithm to compute a maximum matching in O~(mn^{delta/1+delta}) time improving upon the O(m sqrt{n}) time taken by the HK-Algorithm. 
    more » « less
  5. We explore the problem of sharing data that pertains to individuals with anonymity guarantees, where each user requires a desired level of privacy. We propose the first shared-memory as well as distributed memory parallel algorithms for the k-anonimity problem that achieves this goal, and produces high quality anonymized datasets. The new algorithm is based on an optimization procedure that iteratively computes weights on the edges of a dissimilarity matrix, and at each iteration computes a minimum weighted b-edgecover in the graph. We describe how a 2-approximation algorithm for computing the b-edgecover can be used to solve the adaptive anonymity problem in parallel. We are able to solve adaptive anonymity problems with hundreds of thousands of instances and hundreds of features on a supercomputer in under five minutes. Our algorithm scales up to 8000 cores on a distributed memory supercomputer, while also providing good speedups on shared memory multiprocessors. On smaller problems where an a Belief Propagation algorithm is feasible, our algorithm is two orders of magnitude faster. 
    more » « less