NSF PAR Search | NSF Public Access Repository

The ParClusterers Benchmark Suite (PCBS): A Fine-Grained Analysis of Scalable Graph Clustering

https://doi.org/10.14778/3712221.3712246

Yu, Shangdi; Shi, Jessica; Meindl, Jamison; Eisenstat, David; Ju, Xiaoen; Tavakkol, Sasan; Dhulipala, Laxman; Łącki, Jakub; Mirrokni, Vahab; Shun, Julian (November 2024, Proceedings of the VLDB Endowment)

We introduce the ParClusterers Benchmark Suite (PCBS), a collection of highly scalable parallel graph clustering algorithms and benchmarking tools that streamline comparing different graph clustering algorithms and implementations. The benchmark includes clustering algorithms that target a wide range of modern clustering use cases, including community detection, classification, and dense subgraph mining. The benchmark toolkit makes it easy to run and evaluate multiple instances of different clustering algorithms with respect to both running time and quality. We evaluate the PCBS algorithms empirically and find that they deliver both state-of-the-art quality and state-of-the-art running time. In terms of running time, they are on average over 4x faster than the fastest library we compared against. In terms of quality, the correlation clustering algorithm [Shi et al., VLDB'21] optimizing for the LambdaCC objective, which does not have a direct counterpart in other libraries, delivers the highest quality on the majority of the datasets that we used.
Full Text Available
Parallel Algorithms for Hierarchical Nucleus Decomposition

https://doi.org/10.1145/3639287

Shi, Jessica; Dhulipala, Laxman; Shun, Julian (March 2024, Proceedings of the ACM on Management of Data)

Nucleus decompositions have been shown to be a useful tool for finding dense subgraphs. The coreness value of a clique represents its density based on the number of other cliques it is adjacent to. One useful output of nucleus decomposition is to generate a hierarchy among dense subgraphs at different resolutions. However, existing parallel algorithms for nucleus decomposition do not generate this hierarchy, and only compute the coreness values. This paper presents a scalable parallel algorithm for hierarchy construction, with practical optimizations, such as interleaving the coreness computation with hierarchy construction and using a concurrent union-find data structure in an innovative way to generate the hierarchy. We also introduce a parallel approximation algorithm for nucleus decomposition, which achieves much lower span in theory and better performance in practice. We prove strong theoretical bounds on the work and span (parallel time) of our algorithms. On a 30-core machine with two-way hyper-threading, our parallel hierarchy construction algorithm achieves up to a 58.84x speedup over the state-of-the-art sequential hierarchy construction algorithm by Sariyuce et al. and up to a 30.96x self-relative parallel speedup. On the same machine, our approximation algorithm achieves a 3.3x speedup over our exact algorithm, while generating coreness estimates with a multiplicative error of 1.33x on average.
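To make the notion of coreness concrete, the following is a minimal sequential sketch of the simplest special case, the k-core decomposition, which corresponds to the (1,2)-nucleus (vertices as the cliques, edges as the adjacency between them). It is not the parallel algorithm from the paper and it does not build the hierarchy; the function name `coreness` and the adjacency-dictionary input format are assumptions made for illustration.

```python
def coreness(adj):
    """Return the core number of every vertex by repeated peeling.

    adj maps each vertex to the set of its neighbors (undirected graph).
    """
    deg = {v: len(adj[v]) for v in adj}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel a vertex of minimum remaining degree.
        # A linear scan keeps the sketch short; bucketing would be faster.
        v = min(remaining, key=deg.get)
        k = max(k, deg[v])  # core numbers never decrease during peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                deg[u] -= 1
    return core


# A triangle (0, 1, 2) with a pendant vertex 3 attached to vertex 0.
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(coreness(graph))
```

In this example the pendant vertex is peeled first with core number 1, and the three triangle vertices end with core number 2.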
Full Text Available
Internalizing Indistinguishability with Dependent Types

https://doi.org/10.1145/3632886

Liu, Yiyun; Chan, Jonathan; Shi, Jessica; Weirich, Stephanie (January 2024, Proceedings of the ACM on Programming Languages)

In type systems with dependency tracking, programmers can assign an ordered set of levels to computations and prevent information flow from high-level computations to the low-level ones. The key notion in such systems is indistinguishability: a definition of program equivalence that takes into account the parts of the program that an observer may depend on. In this paper, we investigate the use of dependency tracking in the context of dependently-typed languages. We present the Dependent Calculus of Indistinguishability (DCOI), a system that adopts indistinguishability as the definition of equality used by the type checker. DCOI also internalizes that relation as an observer-indexed propositional equality type, so that programmers may reason about indistinguishability within the language. Our design generalizes and extends prior systems that combine dependency tracking with dependent types and is the first to support conversion and propositional equality at arbitrary observer levels. We have proven type soundness and noninterference theorems for DCOI and have developed a prototype implementation of its type checker.
Parallel Five-cycle Counting Algorithms

https://doi.org/10.1145/3556541

Shi, Jessica; Huang, Louisa Ruixue; Shun, Julian (December 2022, ACM Journal of Experimental Algorithmics)

Counting the frequency of subgraphs in large networks is a classic research question that reveals the underlying substructures of these networks for important applications. However, subgraph counting is a challenging problem, even for subgraph sizes as small as five, due to the combinatorial explosion in the number of possible occurrences. This article focuses on the five-cycle, which is an important special case of five-vertex subgraph counting and one of the most difficult to count efficiently. We design two new parallel five-cycle counting algorithms and prove that they are work efficient and achieve polylogarithmic span. Both algorithms are based on computing low out-degree orientations, which enables the efficient computation of directed two-paths and three-paths, and the algorithms differ in the ways in which they use this orientation to eliminate double-counting. Additionally, we present new parallel algorithms for obtaining unbiased estimates of five-cycle counts using graph sparsification. We develop fast multicore implementations of the algorithms and propose a work scheduling optimization to improve their performance. Our experiments on a variety of real-world graphs using a 36-core machine with two-way hyper-threading show that our best exact parallel algorithm achieves 10–46× self-relative speedup, outperforms our serial benchmarks by 10–32×, and outperforms the previous state-of-the-art serial algorithm by up to 818×. Our best approximate algorithm, for a reasonable probability parameter, achieves up to 20× self-relative speedup and is able to approximate five-cycle counts 9–189× faster than our best exact algorithm, with between 0.52% and 11.77% error.
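The two ingredients named in the abstract, an orientation with low out-degree and an exact five-cycle count, can be illustrated on small graphs with the sketch below. It is a sequential brute-force reference, not either of the paper's algorithms: `degree_orientation` uses a simple degree-then-id ranking (an assumption; the paper's orientation may be computed differently), and `count_five_cycles` enumerates paths directly rather than combining directed two-paths and three-paths. Both function names and the adjacency-dictionary input are hypothetical.

```python
def degree_orientation(adj):
    """Direct every edge from its lower-ranked endpoint to its higher-ranked one.

    Vertices are ranked by (degree, id), so high-degree vertices receive
    edges instead of emitting them, which keeps out-degrees small.
    """
    rank = {v: (len(adj[v]), v) for v in adj}
    return {v: sorted(u for u in adj[v] if rank[u] > rank[v]) for v in adj}


def count_five_cycles(adj):
    """Count simple cycles on five vertices by brute-force path extension."""
    def extend(path):
        start, last = path[0], path[-1]
        if len(path) == 5:
            # The path closes into a cycle only if its ends are adjacent.
            return 1 if start in adj[last] else 0
        # Requiring u > start makes `start` the smallest vertex of the cycle,
        # so each cycle is found from exactly one starting vertex.
        return sum(extend(path + [u])
                   for u in adj[last] if u > start and u not in path)

    # Each cycle is traversed once per direction, hence the division by two.
    return sum(extend([v]) for v in adj) // 2


complete5 = {v: {u for u in range(5) if u != v} for v in range(5)}
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}

print(count_five_cycles(complete5))  # 12 five-cycles in the complete graph K5
print(degree_orientation(star))      # every leaf points at the center
```

On the star graph the center has degree 4 but out-degree 0 after orientation, which is the effect a low out-degree orientation is meant to achieve.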
Full Text Available
Efficient Algorithms for Parallel Bi-core Decomposition

https://doi.org/10.1137/1.9781611977578.ch2

Huang, Yihao; Wang, Claire; Shi, Jessica; Shun, Julian (January 2023, SIAM Symposium on Algorithmic Principles of Computer Systems (APOCS))

Full Text Available
Differential Privacy from Locally Adjustable Graph Algorithms: k-Core Decomposition, Low Out-Degree Ordering, and Densest Subgraphs

https://doi.org/10.1109/FOCS54457.2022.00077

Dhulipala, Laxman; Liu, Quanquan C.; Raskhodnikova, Sofya; Shi, Jessica; Shun, Julian; Yu, Shangdi (October 2022, Annual Symposium on Foundations of Computer Science (FOCS))

Full Text Available
Parallel Batch-Dynamic Algorithms for k-Core Decomposition and Related Graph Problems

https://doi.org/10.1145/3490148.3538569

Liu, Quanquan C.; Shi, Jessica; Yu, Shangdi; Dhulipala, Laxman; Shun, Julian (July 2022, ACM Symposium on Parallelism in Algorithms and Architectures)

Full Text Available
Etna: An Evaluation Platform for Property-Based Testing (Experience Report)

https://doi.org/10.1145/3607860

Shi, Jessica; Keles, Alperen; Goldstein, Harrison; Pierce, Benjamin C.; Lampropoulos, Leonidas (August 2023, Proceedings of the ACM on Programming Languages)

Property-based testing is a mainstay of functional programming, boasting a rich literature, an enthusiastic user community, and an abundance of tools — so many, indeed, that new users may have difficulty choosing. Moreover, any given framework may support a variety of strategies for generating test inputs; even experienced users may wonder which are better in a given situation. Sadly, the PBT literature, though long on creativity, is short on rigorous comparisons to help answer such questions. We present Etna, a platform for empirical evaluation and comparison of PBT techniques. Etna incorporates a number of popular PBT frameworks and testing workloads from the literature, and its extensible architecture makes adding new ones easy, while handling the technical drudgery of performance measurement. To illustrate its benefits, we use Etna to carry out several experiments with popular PBT approaches in both Coq and Haskell, allowing users to more clearly understand best practices and tradeoffs.
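For readers unfamiliar with the technique being evaluated, the sketch below shows the basic shape of property-based testing: a generator produces random inputs, and a property is checked against each one until a counterexample turns up. It is written in Python with only the standard library, whereas Etna itself targets frameworks in Coq and Haskell; every name here (`check_property`, `gen_int_list`, `buggy_dedup`) is hypothetical and none of it reflects Etna's API.

```python
import random


def check_property(prop, gen, trials=200, seed=0):
    """Run `prop` on `trials` random inputs; return a counterexample or None."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    for _ in range(trials):
        value = gen(rng)
        if not prop(value):
            return value
    return None


def gen_int_list(rng, max_len=8):
    """Generate a short list of small integers, so duplicates are common."""
    return [rng.randint(-5, 5) for _ in range(rng.randint(0, max_len))]


def buggy_dedup(xs):
    """Intentionally buggy: removes only adjacent duplicates."""
    out = []
    for x in xs:
        if not out or out[-1] != x:
            out.append(x)
    return out


def prop_sort_idempotent(xs):
    return sorted(sorted(xs)) == sorted(xs)


def prop_dedup_has_no_duplicates(xs):
    ys = buggy_dedup(xs)
    return len(ys) == len(set(ys))


print(check_property(prop_sort_idempotent, gen_int_list))          # None
print(check_property(prop_dedup_has_no_duplicates, gen_int_list))  # a failing list
```

The choice of generator matters: drawing from a narrow value range makes repeated elements likely, which is what exposes the bug. Comparing such generation strategies is the kind of question the abstract says the platform is meant to answer.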
Parallel Clique Counting and Peeling Algorithms

https://doi.org/10.1137/1.9781611976830.13

Shi, Jessica; Dhulipala, Laxman; Shun, Julian (July 2021, Proceedings of the SIAM Conference on Applied and Computational Discrete Algorithms (ACDA))

Full Text Available
Parallel Five-Cycle Counting Algorithms

https://doi.org/10.4230/LIPIcs.SEA.2021.2

Huang, Louisa Ruixue; Shi, Jessica; Shun, Julian (July 2021, 19th International Symposium on Experimental Algorithms (SEA 2021))

Full Text Available
