NSF PAR Search | NSF Public Access Repository

Parallel Contraction Hierarchies Can Be Efficient and Scalable

https://doi.org/10.1145/3721145.3725744

Wan, Zijin; Dong, Xiaojun; Wang, Letong; Zhu, Enzuo; Gu, Yan; Sun, Yihan (June 2025, ACM)

Free, publicly-accessible full text available June 8, 2026

Parallel Cluster-BFS and Applications to Shortest Paths

https://doi.org/10.1137/1.9781611978339.4

Wang, Letong; Blelloch, Guy; Gu, Yan; Sun, Yihan (January 2025, Society for Industrial and Applied Mathematics)

Full Text Available

Brief Announcement: PASGAL: Parallel And Scalable Graph Algorithm Library

https://doi.org/10.1145/3626183.3660258

Dong, Xiaojun; Gu, Yan; Sun, Yihan; Wang, Letong (June 2024, ACM)

Full Text Available

Fast and Space-Efficient Parallel Algorithms for Influence Maximization

https://doi.org/10.14778/3632093.3632104

Wang, Letong; Ding, Xiangyun; Gu, Yan; Sun, Yihan (November 2023, Proceedings of the VLDB Endowment)

Influence Maximization (IM) is a crucial problem in data science. The goal is to find a fixed-size set of highly influential seed vertices on a network to maximize the influence spread along the edges. While IM is NP-hard on commonly used diffusion models, a greedy algorithm can achieve (1 - 1/e)-approximation by repeatedly selecting the vertex with the highest marginal gain in influence as the seed. However, we observe two performance issues in the existing work that prevent them from scaling to today's large-scale graphs: space-inefficient memorization to estimate marginal gain, and time-inefficient seed selection process due to a lack of parallelism. This paper significantly improves the scalability of IM using two key techniques. The first is a sketch-compression technique for the independent cascading model on undirected graphs. It allows combining the simulation and sketching approaches to achieve a time-space tradeoff. The second technique includes new data structures for parallel seed selection. Using our new approaches, we implemented PaC-IM: Parallel and Compressed IM. We compare PaC-IM with state-of-the-art parallel IM systems on a 96-core machine with 1.5TB memory. PaC-IM can process the ClueWeb graph with 978M vertices and 75B edges in about 2 hours. On average, across all tested graphs, our uncompressed version is 5–18x faster and about 1.4x more space-efficient than existing parallel IM systems. Using compression further saves 3.8x space with only 70% overhead in time on average.
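The greedy baseline mentioned in the abstract can be sketched as follows. This is a minimal sequential illustration, not PaC-IM: it estimates influence by plain Monte Carlo simulation of an independent-cascade process and uses neither sketch compression nor parallel seed selection. The function names, the uniform activation probability `p`, and the adjacency-dict graph format are all assumptions made for this example.

```python
import random

def simulate_spread(adj, seeds, p, rng):
    """Run one independent-cascade simulation; return the number of active vertices."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj[u]:
                # Each newly active vertex gets one chance to activate each inactive neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def estimate_influence(adj, seeds, p, rounds, rng):
    """Average the spread over several simulations (hypothetical helper)."""
    return sum(simulate_spread(adj, seeds, p, rng) for _ in range(rounds)) / rounds

def greedy_seeds(adj, k, p=0.1, rounds=100, seed=0):
    """Pick up to k seeds, each time taking the vertex with the highest estimated marginal gain."""
    rng = random.Random(seed)
    chosen, current = [], 0.0
    for _ in range(min(k, len(adj))):
        best_v, best_gain = None, -1.0
        for v in adj:
            if v in chosen:
                continue
            gain = estimate_influence(adj, chosen + [v], p, rounds, rng) - current
            if gain > best_gain:
                best_v, best_gain = v, gain
        chosen.append(best_v)
        current += best_gain
    return chosen

# Undirected toy graph with two components: {0, 1, 2} and {3, 4}.
toy = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
# With p = 1.0 the spread is deterministic (the seed's whole component).
seeds = greedy_seeds(toy, k=2, p=1.0, rounds=1)
```

Re-simulating every candidate in every round is what makes this baseline slow on large graphs; the abstract's point is that memoizing those estimates costs space and that selecting seeds one by one limits parallelism.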
Full Text Available

Provably Fast and Space-Efficient Parallel Biconnectivity

https://doi.org/10.1145/3572848.3577483

Dong, Xiaojun; Wang, Letong; Gu, Yan; Sun, Yihan (February 2023, ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming)

Full Text Available

Parallel Cover Trees and their Applications

https://doi.org/10.1145/3490148.3538581

Gu, Yan; Napier, Zachary; Sun, Yihan; Wang, Letong (July 2022, ACM Symposium on Parallelism in Algorithms and Architectures)

Full Text Available

Parallel Strong Connectivity Based on Faster Reachability

https://doi.org/10.1145/3589259

Wang, Letong; Dong, Xiaojun; Gu, Yan; Sun, Yihan (June 2023, Proceedings of the ACM on Management of Data)

Computing strongly connected components (SCC) is among the most fundamental problems in graph analytics. Given the large size of today's real-world graphs, parallel SCC implementation is increasingly important. SCC is challenging in the parallel setting and is particularly hard on large-diameter graphs. Many existing parallel SCC implementations can be even slower than Tarjan's sequential algorithm on large-diameter graphs. To tackle this challenge, we propose an efficient parallel SCC implementation using a new parallel reachability approach. Our solution is based on a novel idea referred to as vertical granularity control (VGC). It breaks the synchronization barriers to increase parallelism and hide scheduling overhead. To use VGC in our SCC algorithm, we also design an efficient data structure called the parallel hash bag. It uses parallel dynamic resizing to avoid redundant work in maintaining frontiers (vertices processed in a round). We implement the parallel SCC algorithm by Blelloch et al. (J. ACM, 2020) using our new parallel reachability approach. We compare our implementation to the state-of-the-art systems, including GBBS, iSpan, Multi-step, and our highly optimized Tarjan's (sequential) algorithm, on 18 graphs, including social, web, k-NN, and lattice graphs. On a machine with 96 cores, our implementation is the fastest on 16 out of 18 graphs. On average (geometric means) over all graphs, our SCC is 6.0× faster than the best previous parallel code (GBBS), 12.8× faster than Tarjan's sequential algorithms, and 2.7× faster than the best existing implementation on each graph. We believe that our techniques are of independent interest. We also apply our parallel hash bag and VGC scheme to other graph problems, including connectivity and least-element lists (LE-lists). Our implementations improve the performance of the state-of-the-art parallel implementations for these two problems.
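To illustrate how reachability searches can drive SCC computation, here is a minimal sequential sketch of the classic forward-backward scheme. It is a simpler, related technique and not the Blelloch et al. algorithm or the paper's implementation: there is no parallelism, no vertical granularity control, and no hash bag. The function names and the adjacency-dict graph format are assumptions made for this example.

```python
def reachable(adj, source, allowed):
    """Vertices reachable from `source` while staying inside the set `allowed`."""
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v in allowed and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def scc_forward_backward(adj):
    """Return the strongly connected components of a directed graph as a list of sets."""
    vertices = set(adj)
    for outs in adj.values():
        vertices.update(outs)
    # Reversed graph, used for backward reachability.
    radj = {v: [] for v in vertices}
    for u, outs in adj.items():
        for v in outs:
            radj[v].append(u)

    components = []
    work = [vertices]
    while work:
        part = work.pop()
        if not part:
            continue
        pivot = next(iter(part))
        fwd = reachable(adj, pivot, part)   # vertices the pivot can reach
        bwd = reachable(radj, pivot, part)  # vertices that can reach the pivot
        scc = fwd & bwd                     # the pivot's component
        components.append(scc)
        # Every remaining SCC lies entirely inside one of these three parts.
        work.extend([fwd - scc, bwd - scc, part - fwd - bwd])
    return components

# Toy directed graph: a 3-cycle, a 2-cycle hanging off it, and an isolated vertex.
toy = {0: [1], 1: [2], 2: [0, 3], 3: [4], 4: [3], 5: []}
components = scc_forward_backward(toy)
```

Each reachability search here is a graph traversal whose number of rounds grows with the graph's diameter when run as a parallel BFS, which is why the abstract singles out large-diameter graphs as the hard case.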