KNN-DBSCAN: a DBSCAN in high dimensions

Chen, Youguang  (ORCID:0000000306336691); Ruys, William  (ORCID:000000015702022X); Biros, George  (ORCID:0000000200333994)

doi:10.1145/3701624

This content will become publicly available on March 31, 2026

Clustering is a fundamental task in machine learning. One of the most successful and broadly used algorithms is DBSCAN, a density-based clustering algorithm. DBSCAN requires ϵ-nearest neighbor graphs of the input dataset, which are computed with range-search algorithms and spatial data structures like KD-trees. Despite many efforts to design scalable implementations for DBSCAN, existing work is limited to low-dimensional datasets, as constructing ϵ-nearest neighbor graphs can be expensive in high dimensions. This article introduces a modified DBSCAN, using k-nearest neighbor (kNN) graphs to improve efficiency. We outline conditions for kNN-DBSCAN to match DBSCAN's results and present a parallel implementation using OpenMP and MPI for shared and distributed memory systems. Testing on datasets of up to 32 dimensions, we achieve remarkable scalability. Our implementation clusters one billion 3D points in under one second on 28K cores at TACC's Frontera system. In a larger run, we cluster 65 billion points in 20 dimensions in under 40 seconds using 114,688 cores. Our method is up to 37× faster than state-of-the-art parallel DBSCAN on a 20-dimensional dataset with 4 million points. Code is available at https://github.com/ut-padas/knndbscan.
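
To make the idea in the abstract concrete, the following is a minimal serial sketch of density-based clustering driven by a kNN graph instead of an ϵ-range search. It is an illustration under stated assumptions, not the authors' algorithm or their OpenMP/MPI implementation: the function names (knn_graph, knn_dbscan_sketch) are hypothetical, the kNN graph is built by brute force, a point is assumed to be a core point when its k-th nearest neighbor (excluding itself) lies within eps, and core points are merged with a simple union-find over kNN edges no longer than eps.

```python
import math


def knn_graph(points, k):
    """Brute-force kNN lists: for each point, its k nearest (distance, index) pairs."""
    graph = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph.append(dists[:k])
    return graph


def knn_dbscan_sketch(points, eps, k):
    """Illustrative kNN-graph clustering; returns one label per point (-1 = noise)."""
    n = len(points)
    graph = knn_graph(points, k)

    # Assumption: a point is "core" if its k-th nearest neighbor is within eps.
    is_core = [len(nb) == k and nb[-1][0] <= eps for nb in graph]

    # Union-find over core points, with path halving.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge core points joined by a kNN edge of length <= eps.
    for i in range(n):
        if not is_core[i]:
            continue
        for d, j in graph[i]:
            if d <= eps and is_core[j]:
                parent[find(i)] = find(j)

    # Number the clusters in order of first appearance.
    labels = [-1] * n
    cluster_ids = {}
    for i in range(n):
        if is_core[i]:
            labels[i] = cluster_ids.setdefault(find(i), len(cluster_ids))

    # Border points: attach to the nearest core neighbor within eps, if any.
    for i in range(n):
        if not is_core[i]:
            for d, j in graph[i]:
                if d <= eps and is_core[j]:
                    labels[i] = labels[j]
                    break
    return labels
```

In this sketch only the k nearest neighbors of each point are ever inspected, so the cost of the graph step is governed by k rather than by how many points fall inside an ϵ-ball. The brute-force neighbor search is quadratic in the number of points and stands in for whatever scalable kNN search a real implementation would use.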

Award ID(s): 2204226

PAR ID: 10642509

Author(s) / Creator(s): Chen, Youguang ; Ruys, William ; Biros, George

Publisher / Repository: Association for Computing Machinery

Date Published: 2025-03-31

Journal Name: ACM Transactions on Parallel Computing

Volume: 12

Issue: 1

ISSN: 2329-4949

Page Range / eLocation ID: 1 to 27

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Journal Article:
https://doi.org/10.1145/3701624
