It’s Hard to HAC Average Linkage!

Bateni, MohammadHossein; Dhulipala, Laxman; Gowda, Kishen N; Hershkowitz, D Ellis; Jayaram, Rajesh; Łącki, Jakub

doi:10.4230/LIPIcs.ICALP.2024.18

Average linkage Hierarchical Agglomerative Clustering (HAC) is an extensively studied and applied method for hierarchical clustering. Recent applications to massive datasets have driven significant interest in near-linear-time and efficient parallel algorithms for average linkage HAC. We provide hardness results that rule out such algorithms. On the sequential side, we establish a runtime lower bound of n^{3/2-ε} on n-node graphs for sequential combinatorial algorithms under standard fine-grained complexity assumptions. This essentially matches the best-known running time for average linkage HAC. On the parallel side, we prove that average linkage HAC likely cannot be parallelized even on simple graphs by showing that it is CC-hard on trees of diameter 4. On the possibility side, we demonstrate that average linkage HAC can be efficiently parallelized (i.e., it is in NC) on paths and can be solved in near-linear time when the height of the output cluster hierarchy is small.

Award ID(s): 2317194

PAR ID: 10609048

Author(s) / Creator(s): Bateni, MohammadHossein; Dhulipala, Laxman; Gowda, Kishen N; Hershkowitz, D Ellis; Jayaram, Rajesh; Łącki, Jakub

Editor(s): Bringmann, Karl; Grohe, Martin; Puppis, Gabriele; Svensson, Ola

Publisher / Repository: Schloss Dagstuhl – Leibniz-Zentrum für Informatik

Date Published: 2024-01-01

Volume: 297

ISSN: 1868-8969

ISBN: 978-3-95977-322-5

Page Range / eLocation ID: 18:1-18:18

Subject(s) / Keyword(s): Clustering; Hierarchical Graph Clustering; HAC; Fine-Grained Complexity; Parallel Algorithms; CC; Theory of computation → Parallel algorithms; Theory of computation → Streaming, sublinear and near linear time algorithms; Theory of computation → Graph algorithms analysis

Format(s): Medium: X; Size: 18 pages, 1997336 bytes; Other: application/pdf

Size(s): 18 pages; 1997336 bytes

Right(s): Creative Commons Attribution 4.0 International license; info:eu-repo/semantics/openAccess

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.4230/LIPIcs.ICALP.2024.18