

Search for: All records

Award ID contains: 2227669

Note: Clicking a Digital Object Identifier (DOI) number takes you to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. Abstract: In the pre-big-data era, many traditional databases supported spatial queries via spatial indexes. Modern applications, however, are seeing rapid growth in both the volume and the ingestion rate of spatial data. Many big data systems use the Log-Structured Merge (LSM) tree as their storage structure to support write-intensive, large-volume workloads, but LSM trees are usually optimized only for single-dimensional data. Prior research has studied how spatial indexes can be supported on LSM systems, but it has focused mainly on the local index organization, that is, how data is organized inside a single LSM component. This paper studies various aspects of LSM spatial indexing, including spatial merge policies, which determine when and how spatial components are merged. Three stack-based merge policies and one leveled merge policy are studied and implemented on a common big data system, Apache AsterixDB. Their write and read performance is evaluated on various workloads, and our findings and recommendations are discussed. A key finding is that the leveled policy underperforms the stack-based merge policies for most types of spatial workloads.
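To make the stack-based merge idea in entry 1 concrete, here is a minimal C++ sketch of a size-ratio merge decision over a stack of LSM components. It is only an illustration under assumed inputs (component sizes listed newest-first, a tunable size ratio, and a hypothetical componentsToMerge helper); it is not AsterixDB's merge-policy code nor the exact policies evaluated in the paper.

```cpp
// Sketch of a stack-based (size-ratio) merge decision. Component sizes are
// listed newest-first; `ratio` is a hypothetical tuning parameter. This is
// NOT AsterixDB's merge-policy implementation.
#include <cstddef>
#include <iostream>
#include <vector>

// Returns the prefix of components (newest-first) that should be merged into
// one component, or an empty vector if no merge is needed yet. A merge is
// triggered when the newer components together are at least `ratio` times the
// size of the next older component.
std::vector<std::size_t> componentsToMerge(const std::vector<std::size_t>& sizes,
                                           double ratio) {
    std::size_t newerTotal = 0;
    for (std::size_t i = 0; i + 1 < sizes.size(); ++i) {
        newerTotal += sizes[i];
        if (static_cast<double>(newerTotal) >=
            ratio * static_cast<double>(sizes[i + 1])) {
            // Merge components 0..i+1 (the newer prefix plus the older one).
            return std::vector<std::size_t>(sizes.begin(), sizes.begin() + i + 2);
        }
    }
    return {};
}

int main() {
    // Hypothetical component sizes in MB, newest first.
    std::vector<std::size_t> sizes = {40, 32, 512};
    std::vector<std::size_t> merge = componentsToMerge(sizes, 1.0);
    std::cout << "merge " << merge.size() << " components\n";  // prints: merge 2 components
}
```

A leveled policy instead keeps components organized into size-bounded levels and merges into the next level on overflow; per the abstract above, that approach underperformed the stack-based policies for most of the spatial workloads tested.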
  2. Free, publicly-accessible full text available July 16, 2026
  3. This paper proposes efficient solutions for k-core decomposition with high parallelism. The problem of k-core decomposition is fundamental in graph analysis and has applications across various domains. However, existing algorithms face significant challenges in achieving work-efficiency in theory and/or high parallelism in practice, and suffer from various performance bottlenecks. We present a simple, work-efficient parallel framework for k-core decomposition that is easy to implement and adaptable to various strategies for improving work-efficiency. We introduce two techniques to enhance parallelism: a sampling scheme to reduce contention on high-degree vertices, and vertical granularity control (VGC) to mitigate scheduling overhead for low-degree vertices. Furthermore, we design a hierarchical bucket structure to optimize performance for graphs with high coreness values. We evaluate our algorithm on a diverse set of real-world and synthetic graphs. Compared to state-of-the-art parallel algorithms, including ParK, PKC, and Julienne, our approach demonstrates superior performance on 23 out of 25 graphs when tested on a 96-core machine. Our algorithm shows speedups of up to 315× over ParK, 33.4× over PKC, and 52.5× over Julienne. 
    Free, publicly-accessible full text available June 17, 2026
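The framework in entry 3 is parallel and work-efficient; as a point of reference only, the sequential peeling algorithm underlying k-core decomposition can be sketched in C++ as below. The adjacency-list representation and function name are illustrative assumptions and are unrelated to the paper's implementation or its sampling and VGC techniques.

```cpp
// Sequential reference for k-core decomposition (coreness of every vertex),
// illustrating the peeling idea behind entry 3. The paper's contribution is a
// work-efficient *parallel* framework; this sketch only shows the underlying
// decomposition on an assumed adjacency-list graph.
#include <algorithm>
#include <functional>
#include <iostream>
#include <queue>
#include <utility>
#include <vector>

// Returns coreness[v] for each vertex of an undirected graph given as
// adjacency lists, peeling vertices in non-decreasing degree order.
std::vector<int> coreDecomposition(const std::vector<std::vector<int>>& adj) {
    int n = static_cast<int>(adj.size());
    std::vector<int> deg(n), core(n, 0);
    std::vector<bool> removed(n, false);
    // Min-priority queue of (current degree, vertex); stale entries are skipped.
    using Entry = std::pair<int, int>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;
    for (int v = 0; v < n; ++v) {
        deg[v] = static_cast<int>(adj[v].size());
        pq.push({deg[v], v});
    }
    int k = 0;
    while (!pq.empty()) {
        auto [d, v] = pq.top();
        pq.pop();
        if (removed[v] || d != deg[v]) continue;  // stale queue entry
        removed[v] = true;
        k = std::max(k, deg[v]);
        core[v] = k;                              // coreness of v
        for (int u : adj[v]) {
            if (!removed[u] && deg[u] > deg[v]) {
                pq.push({--deg[u], u});           // peel: lower u's degree
            }
        }
    }
    return core;
}

int main() {
    // Triangle {0,1,2} plus a pendant vertex 3 attached to 0.
    std::vector<std::vector<int>> adj = {{1, 2, 3}, {0, 2}, {0, 1}, {0}};
    for (int c : coreDecomposition(adj)) std::cout << c << ' ';  // 2 2 2 1
    std::cout << '\n';
}
```

The difficulty the paper targets is that this peeling proceeds round by round, with contention concentrating on high-degree vertices and scheduling overhead dominating for low-degree ones, which is what the abstract's sampling scheme and vertical granularity control address.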
  4. Free, publicly-accessible full text available June 8, 2026
  5. The kd-tree is one of the most widely used data structures for managing multi-dimensional data. Due to ever-growing data volumes, it is imperative to support parallelism in kd-trees. However, we observed challenges in existing parallel kd-tree implementations for both construction and updates. The goal of this paper is to develop efficient in-memory kd-trees that offer high parallelism and cache efficiency. We propose the Pkd-tree (Parallel kd-tree), a parallel kd-tree that is efficient both in theory and in practice. The Pkd-tree supports parallel tree construction, batch updates (insertion and deletion), and various queries, including k-nearest-neighbor search, range query, and range count. We prove that our algorithms have strong theoretical bounds on work (sequential time complexity), span (parallelism), and cache complexity. Our key techniques include 1) an efficient construction algorithm that optimizes work, span, and cache complexity simultaneously, and 2) reconstruction-based update algorithms that guarantee the tree remains weight-balanced. With these new algorithmic insights and careful engineering, we achieve a highly optimized implementation of the Pkd-tree. We tested the Pkd-tree on various synthetic and real-world datasets, including both uniform and highly skewed data, and compared it with state-of-the-art parallel kd-tree implementations. In all tests, the Pkd-tree delivers better or competitive query performance while being consistently much faster in construction and updates than all baselines. We have released our code.
    Free, publicly-accessible full text available February 10, 2026
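Finally, a minimal sequential kd-tree sketch in C++, showing median-split construction and a rectangular range count, two of the operations entry 5 lists. It is only an illustration of the underlying data structure, not the Pkd-tree: the real structure is parallel, cache-efficient, and weight-balanced under batch updates, none of which this sketch attempts, and the type and function names are assumptions.

```cpp
// Minimal sequential 2-d kd-tree: median-split construction plus an
// axis-aligned range count. Illustrative only; not the Pkd-tree.
#include <algorithm>
#include <array>
#include <iostream>
#include <memory>
#include <vector>

using Point = std::array<double, 2>;

struct Node {
    Point pt;
    int axis;                       // splitting dimension (0 = x, 1 = y)
    std::unique_ptr<Node> left, right;
};

// Builds the tree by recursively splitting at the median along alternating axes.
std::unique_ptr<Node> build(std::vector<Point>& pts, int lo, int hi, int axis) {
    if (lo >= hi) return nullptr;
    int mid = (lo + hi) / 2;
    std::nth_element(pts.begin() + lo, pts.begin() + mid, pts.begin() + hi,
                     [axis](const Point& a, const Point& b) { return a[axis] < b[axis]; });
    auto node = std::make_unique<Node>();
    node->pt = pts[mid];
    node->axis = axis;
    node->left = build(pts, lo, mid, 1 - axis);
    node->right = build(pts, mid + 1, hi, 1 - axis);
    return node;
}

// Counts points inside the rectangle [lo, hi], pruning subtrees that lie
// entirely outside the query range along the node's splitting axis.
int rangeCount(const Node* node, const Point& lo, const Point& hi) {
    if (!node) return 0;
    int count = (node->pt[0] >= lo[0] && node->pt[0] <= hi[0] &&
                 node->pt[1] >= lo[1] && node->pt[1] <= hi[1]) ? 1 : 0;
    if (node->pt[node->axis] >= lo[node->axis])
        count += rangeCount(node->left.get(), lo, hi);
    if (node->pt[node->axis] <= hi[node->axis])
        count += rangeCount(node->right.get(), lo, hi);
    return count;
}

int main() {
    std::vector<Point> pts = {{1, 1}, {2, 5}, {4, 3}, {7, 2}, {5, 6}, {8, 8}};
    auto root = build(pts, 0, static_cast<int>(pts.size()), 0);
    std::cout << rangeCount(root.get(), {2, 2}, {7, 7}) << "\n";  // expect 4
}
```

The median split keeps this static tree balanced; maintaining balance under batched insertions and deletions is where entry 5's reconstruction-based update algorithms come in.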