Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

Bender, M. ; Gilbert, J. ; Hendrickson, B. ; Sullivan, B. (Ed.)We design new serial and parallel approximation algorithms for computing a maximum weight bmatching in an edgeweighted graph with a submodular objective function. This problem is NPhard; the new algorithms have approximation ratio 1/3, and are relaxations of the Greedy algorithm that rely only on local information in the graph, making them parallelizable. We have designed and implemented Local Lazy Greedy algorithms for both serial and parallel computers. We have applied the approximate submodular bmatching algorithm to assign tasks to processors in the computation of Fock matrices in quantum chemistry on parallel computers. The assignment seeks to reduce the run time by balancing the computational load on the processors and bounding the number of messages that each processor sends. We show that the new assignment of tasks to processors provides a four fold speedup over the currently used assignment in the NWChemEx software on 8000 processors on the Summit supercomputer at Oak Ridge National Lab.more » « less

We consider the maximum vertexweighted matching problem (MVM), in which nonnegative weights are assigned to the vertices of a graph, and the weight of a matching is the sum of the weights of the matched vertices. Although exact algorithms for MVM are faster than exact algorithms for the maximum edgeweighted matching problem, there are graphs on which these exact algorithms could take hundreds of hours. For a natural number k, we design a k/(k + 1)approximation algorithm for MVM on nonbipartite graphs that updates the matching along certain short paths in the graph: either augmenting paths of length at most 2k + 1 or weightincreasing paths of length at most 2k. The choice of k = 2 leads to a 2/3approximation algorithm that computes nearly optimal weights fast. This algorithm could be initialized with a 2/3approximate maximum cardinality matching to reduce its runtime in practice. A 1/2approximation algorithm may be obtained using k = 1, which is faster than the 2/3approximation algorithm but it computes lower weights. The 2/3approximation algorithm has time complexity O(Δ2m) while the time complexity of the 1/2approximation algorithm is O(Δm), where m is the number of edges and Δ is the maximum degree of a vertex. Results from our serial implementations show that on average the 1/2approximation algorithm runs faster than the Suitor algorithm (currently the fastest 1/2approximation algorithm) while the 2/3approximation algorithm runs as fast as the Suitor algorithm but obtains higher weights for the matching. One advantage of the proposed algorithms is that they are wellsuited for parallel implementation since they can process vertices to match in any order. The 1/2 and 2/3approximation algorithms have been implemented on a shared memory parallel computer, and both approximation algorithms obtain good speedups, while the former algorithm runs faster on average than the parallel Suitor algorithm. Care is needed to design the parallel algorithm to avoid cyclic waits, and we prove that it is livelock free.more » « less

We present the augmented matrix for principal submatrix update (AMPS) algorithm, a finite element solution method that combines principal submatrix updates and Schur complement techniques, wellsuited for interactive simulations of deformation and cutting of finite element meshes. Our approach features realtime solutions to the updated stiffness matrix systems to account for interactive changes in mesh connectivity and boundary conditions. Updates are accomplished by an augmented matrix formulation of the stiffness equations to maintain its consistency with changes to the underlying model without refactorization at each timestep. As changes accumulate over multiple simulation timesteps, the augmented solution algorithm enables tens or hundreds of updates per second. Acceleration schemes that exploit sparsity, memoization and parallelization lead to the updates being computed in realtime. The complexity analysis and experimental results for this method demonstrate that it scales linearly with the number of nonzeros of the factors of the stiffness matrix. Results for cutting and deformation of 3D elastic models are reported for meshes with up to 50,000 nodes, and involve models of surgery for astigmatism and the brain.more » « less

We consider how to generate graphs of arbitrary size whose chromatic numbers can be chosen (or are wellbounded) for testing graph coloring algorithms on parallel computers. For the distance1 graph coloring problem, we identify three classes of graphs with this property. The first is the ErdősRényi random graph with prescribed expected degree, where the chromatic number is known with high probability. It is also known that the Greedy algorithm colors this graph using at most twice the number of colors as the chromatic number. The second is a random geometric graph embedded in hyperbolic space where the size of the maximum clique provides a tight lower bound on the chromatic number. The third is a deterministic graph described by Mycielski, where the graph is recursively constructed such that its chromatic number is known and increases with graph size, although the size of the maximum clique remains two. For Jacobian estimation, we bound the distance2 chromatic number of random bipartite graphs by considering its equivalence to distance1 coloring of an intersection graph. We use a “balls and bins” probabilistic analysis to establish a lower bound and an upper bound on the distance2 chromatic number. The regimes of graph sizes and probabilities that we consider are chosen to suit the Jacobian estimation problem, where the number of columns and rows are asymptotically nearly equal, and have number of nonzeros linearly related to the number of columns. Computationally we verify the theoretical predictions and show that the graphs are often be colored optimally by the serial and parallel Greedy algorithms.more » « less

We survey recent work on approximation algorithms for computing degreeconstrained subgraphs in graphs and their applications in combinatorial scientific computing. The problems we consider include maximization versions of cardinality matching, edgeweighted matching, vertexweighted matching and edgeweighted $b$ matching, and minimization versions of weighted edge cover and $b$ edge cover. Exact algorithms for these problems are impractical for massive graphs with several millions of edges. For each problem we discuss theoretical foundations, the design of several linear or nearlinear time approximation algorithms, their implementations on serial and parallel computers, and applications. Our focus is on practical algorithms that yield good performance on modern computer architectures with multiple threads and interconnected processors. We also include information about the software available for these problems.more » « less

We explore the problem of sharing data that pertains to individuals with anonymity guarantees, where each user requires a desired level of privacy. We propose the first sharedmemory as well as distributed memory parallel algorithms for the kanonimity problem that achieves this goal, and produces high quality anonymized datasets. The new algorithm is based on an optimization procedure that iteratively computes weights on the edges of a dissimilarity matrix, and at each iteration computes a minimum weighted bedgecover in the graph. We describe how a 2approximation algorithm for computing the bedgecover can be used to solve the adaptive anonymity problem in parallel. We are able to solve adaptive anonymity problems with hundreds of thousands of instances and hundreds of features on a supercomputer in under five minutes. Our algorithm scales up to 8000 cores on a distributed memory supercomputer, while also providing good speedups on shared memory multiprocessors. On smaller problems where an a Belief Propagation algorithm is feasible, our algorithm is two orders of magnitude faster.more » « less

We present an automated pipeline capable of distinguishing the phenotypes of myeloidderived suppressor cells (MDSC) in healthy and tumorbearing tissues in mice using flow cytometry data. In contrast to earlier work where samples are analyzed individually, we analyze all samples from each tissue collectively using a representative template for it. We demonstrate with 43 flow cytometry samples collected from three tissues, naive bonemarrow, spleens of tumorbearing mice, and intraperitoneal tumor, that a set of templates serves as a better classifier than popular machine learning approaches including support vector machines and neural networks. Our "interpretable machine learning" approach goes beyond classification and identifies distinctive phenotypes associated with each tissue, information that is clinically useful. Hence the pipeline presented here leads to better understanding of the maturation and differentiation of MDSCs using highthroughput data.more » « less

We describe two new 3/2approximation algorithms and a new 2approximation algorithm for the minimum weight edge cover problem in graphs. We show that one of the 3/2approximation algorithms, the Dual cover algorithm, computes the lowest weight edge cover relative to previously known algorithms as well as the new algorithms reported here. The Dual cover algorithm can also be implemented to be faster than the other 3/2approximation algorithms on serial computers. Many of these algorithms can be extended to solve the 6Edge cover problem as well. We show the relation of these algorithms to the KNearest Neighbor graph construction in semisupervised learning and other applications.more » « less