Title: On Approximability of 𝓁₂² Min-Sum Clustering
The 𝓁₂² min-sum k-clustering problem is to partition an input set into clusters C_1,…,C_k to minimize ∑_{i=1}^k ∑_{p,q ∈ C_i} ‖p-q‖₂². Although 𝓁₂² min-sum k-clustering is known to be NP-hard, prior to this work it was not known whether approximating the objective beyond some factor is also NP-hard. In this paper, we give the first hardness-of-approximation result for the 𝓁₂² min-sum k-clustering problem. We show that it is NP-hard to approximate the objective to a factor better than 1.056, and moreover, assuming a balanced variant of the Johnson Coverage Hypothesis, it is NP-hard to approximate the objective to a factor better than 1.327. We then complement our hardness result by giving a fast PTAS for 𝓁₂² min-sum k-clustering. Specifically, our algorithm runs in time O(n^{1+o(1)} d ⋅ 2^{(k/ε)^{O(1)}}), making it the first nearly linear time algorithm for this problem. We also consider a learning-augmented setting, where the algorithm has access to an oracle that outputs a label i ∈ [k] for each input point, thereby implicitly partitioning the input dataset into k clusters that induce an approximately optimal solution, up to some amount of adversarial error α ∈ [0,1/2). We give a polynomial-time algorithm that outputs a (1+γα)/(1-α)²-approximation to 𝓁₂² min-sum k-clustering, for a fixed constant γ > 0.
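As a concrete reading of the objective, here is a minimal Python sketch; all names are illustrative rather than from the paper. It treats the double sum as ranging over ordered pairs (p, q) (for unordered pairs, halve the result) and uses the standard identity that the pairwise sum within a cluster C equals 2|C| times the sum of squared distances to the centroid of C, which avoids the quadratic loop over pairs.

```python
import numpy as np

def min_sum_cost(points: np.ndarray, labels: np.ndarray, k: int) -> float:
    """l_2^2 min-sum cost: the sum, over clusters, of all pairwise squared
    l_2 distances, with the double sum read over ordered pairs (p, q)."""
    total = 0.0
    for i in range(k):
        C = points[labels == i]
        if len(C) == 0:
            continue
        # sum_{p,q in C} ||p - q||^2 = 2|C| * sum_{p in C} ||p - mu||^2,
        # where mu is the centroid of C.
        mu = C.mean(axis=0)
        total += 2.0 * len(C) * float(((C - mu) ** 2).sum())
    return total

# Tiny check: two tight clusters on a line.
pts = np.array([[0.0], [1.0], [10.0], [11.0]])
print(min_sum_cost(pts, np.array([0, 0, 1, 1]), k=2))  # 4.0
```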
Award ID(s):
2443697
PAR ID:
10646612
Author(s) / Creator(s):
; ; ; ;
Editor(s):
Aichholzer, Oswin; Wang, Haitao
Publisher / Repository:
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Date Published:
Volume:
332
ISSN:
1868-8969
Page Range / eLocation ID:
62:1-62:18
Subject(s) / Keyword(s):
Clustering; hardness of approximation; polynomial-time approximation schemes; learning-augmented algorithms; Theory of computation → Computational geometry; Theory of computation → Facility location and clustering
Format(s):
Medium: X; Size: 18 pages, 1010695 bytes; Other: application/pdf
Sponsoring Org:
National Science Foundation
More Like this
  1. Bansal, Nikhil (Ed.)
    This paper presents universal algorithms for clustering problems, including the widely studied k-median, k-means, and k-center objectives. The input is a metric space containing all potential client locations. The algorithm must select k cluster centers such that they are a good solution for any subset of clients that actually realize. Specifically, we aim for low regret, defined as the maximum over all subsets of the difference between the cost of the algorithm’s solution and that of an optimal solution. A universal algorithm’s solution sol for a clustering problem is said to be an (α, β)-approximation if for all subsets of clients C', it satisfies sol(C') ≤ α ⋅ opt(C') + β ⋅ mr, where opt(C') is the cost of the optimal solution for clients C' and mr is the minimum regret achievable by any solution. Our main results are universal algorithms for the standard clustering objectives of k-median, k-means, and k-center that achieve (O(1), O(1))-approximations. These results are obtained via a novel framework for universal algorithms using linear programming (LP) relaxations. These results generalize to other 𝓁_p-objectives and the setting where some subset of the clients are fixed. We also give hardness results showing that (α, β)-approximation is NP-hard if α or β is at most a certain constant, even for the widely studied special case of Euclidean metric spaces. This shows that in some sense, (O(1), O(1))-approximation is the strongest type of guarantee obtainable for universal clustering.
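    A brute-force Python sketch of these definitions for the k-center objective, assuming centers may only be placed at the client points; all names are illustrative and the enumeration is exponential, so it is only for sanity-checking tiny instances.

    ```python
    from itertools import combinations

    def kcenter_cost(dist, centers, clients):
        """k-center cost: max over realized clients of the distance to the
        nearest chosen center."""
        return max((min(dist[c][s] for s in centers) for c in clients),
                   default=0.0)

    def max_regret(dist, centers, points, k):
        """Regret of a fixed center set: the max over client subsets C' of
        cost(centers, C') - opt(C')."""
        worst = 0.0
        for r in range(1, len(points) + 1):
            for clients in combinations(points, r):
                opt = min(kcenter_cost(dist, S, clients)
                          for S in combinations(points, k))
                worst = max(worst, kcenter_cost(dist, centers, clients) - opt)
        return worst

    def min_regret(dist, points, k):
        """mr: the smallest regret achievable by any center set."""
        return min(max_regret(dist, S, points, k)
                   for S in combinations(points, k))
    ```

    In these terms, a center set S is an (α, β)-approximation exactly when kcenter_cost(dist, S, C') ≤ α ⋅ opt(C') + β ⋅ min_regret(dist, points, k) for every client subset C'.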
  2. Bae, Sang Won; Park, Heejin (Ed.)
    We consider the problem of solving the Min-Sum Submodular Cover problem using local search. The Min-Sum Submodular Cover problem generalizes the NP-complete Min-Sum Set Cover problem, replacing the input set cover instance with a monotone submodular set function. A simple greedy algorithm achieves an approximation factor of 4, which is tight unless P=NP [Streeter and Golovin, NeurIPS, 2008]. We complement the greedy algorithm with an analysis of a local search algorithm. Building on work of Munagala et al. [ICDT, 2005], we show that, using simple initialization, a straightforward local search algorithm achieves a (4+ε)-approximate solution in time O(n³log(n/ε)), provided that the monotone submodular set function is also second-order supermodular. Second-order supermodularity has been shown to hold for a number of submodular functions of practical interest, including functions associated with set cover, matching, and facility location. We present experiments on two special cases of Min-Sum Submodular Cover and find that the local search algorithm can outperform the greedy algorithm on small data sets.
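    For intuition, here is a hedged Python sketch of the greedy algorithm on the Min-Sum Set Cover special case (not the paper's local search, and not the general submodular setting); all names are illustrative.

    ```python
    def greedy_min_sum_set_cover(universe, sets):
        """Greedy for Min-Sum Set Cover: each round, pick the set covering
        the most still-uncovered elements. Every element pays the index of
        the round in which it is first covered; per the abstract above, this
        greedy rule is a 4-approximation for the min-sum objective."""
        uncovered = set(universe)
        order, cost, t = [], 0, 0
        while uncovered:
            t += 1
            best = max(sets, key=lambda S: len(S & uncovered))
            newly = best & uncovered
            if not newly:  # remaining elements appear in no set; give up
                break
            cost += t * len(newly)
            uncovered -= newly
            order.append(best)
        return order, cost

    # Elements 1,2,3 are covered in round 1 and 4,5 in round 2: cost 3 + 4 = 7.
    print(greedy_min_sum_set_cover({1, 2, 3, 4, 5},
                                   [{1, 2, 3}, {3, 4}, {4, 5}])[1])
    ```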
  3. Ahn, Hee-Kap; Sadakane, Kunihiko (Ed.)
    In the standard planar k-center clustering problem, one is given a set P of n points in the plane, and the goal is to select k center points, so as to minimize the maximum distance over points in P to their nearest center. Here we initiate the systematic study of the clustering with neighborhoods problem, which generalizes the k-center problem to allow the covered objects to be a set of general disjoint convex objects C rather than just a point set P. For this problem we first show that there is a PTAS for approximating the number of centers. Specifically, if r_opt is the optimal radius for k centers, then in n^O(1/ε²) time we can produce a set of (1+ε)k centers with radius ≤ r_opt. If instead one considers the standard goal of approximating the optimal clustering radius, while keeping k as a hard constraint, we show that the radius cannot be approximated within any factor in polynomial time unless P = NP, even when C is a set of line segments. When C is a set of unit disks we show the problem is hard to approximate within a factor of (√13 - √3)/(2 - √3) ≈ 6.99. This hardness result complements our main result, where we show that when the objects are disks, of possibly differing radii, there is a (5+2√3) ≈ 8.46 approximation algorithm. Additionally, for unit disks we give an O(n log k)+(k/ε)^O(k) time (1+ε)-approximation to the optimal radius, that is, an FPTAS for constant k whose running time depends only linearly on n. Finally, we show that the one dimensional version of the problem, even when intersections are allowed, can be solved exactly in O(n log n) time.
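    A small Python sketch of the cost being optimized, under the assumption (hedged, since the abstract does not spell it out) that a center serves an object through the object's nearest point, i.e. the covering ball only needs to reach the object rather than contain it; shown for disks, with illustrative names.

    ```python
    import math

    def disk_distance(center, disk):
        """Distance from a center to a disk (o, r): the distance to the
        disk's nearest point, max(0, ||center - o|| - r)."""
        (cx, cy), ((ox, oy), r) = center, disk
        return max(0.0, math.hypot(cx - ox, cy - oy) - r)

    def covering_radius(centers, disks):
        """Clustering-with-neighborhoods cost: the max over objects of the
        distance to the nearest chosen center."""
        return max(min(disk_distance(c, d) for c in centers) for d in disks)

    # Two unit disks and one center between them: radius 4 reaches both.
    print(covering_radius([(0.0, 0.0)],
                          [((5.0, 0.0), 1.0), ((-5.0, 0.0), 1.0)]))  # 4.0
    ```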
  4. Bojańczyk, Mikołaj; Merelli, Emanuela; Woodruff, David P (Ed.)
    Given n points in 𝓁_p^d, we consider the problem of partitioning points into k clusters with associated centers. The cost of a clustering is the sum of p-th powers of distances of points to their cluster centers. For p ∈ [1,2], we design sketches of size poly(log(nd),k,1/ε) such that the cost of the optimal clustering can be estimated to within factor 1+ε, despite the fact that the compressed representation does not contain enough information to recover the cluster centers or the partition into clusters. This leads to a streaming algorithm for estimating the clustering cost with space poly(log(nd),k,1/ε). We also obtain a distributed memory algorithm, where the n points are arbitrarily partitioned amongst m machines, each of which sends information to a central party who then computes an approximation of the clustering cost. Prior to this work, no such streaming or distributed-memory algorithm was known with sublinear dependence on d for p ∈ [1,2). 
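    For reference, a short Python sketch of the exact quantity these sketching and streaming algorithms estimate to within a 1+ε factor; the function name and array layout are illustrative.

    ```python
    import numpy as np

    def lp_clustering_cost(points: np.ndarray, centers: np.ndarray,
                           p: float) -> float:
        """(k, p)-clustering cost in l_p^d: each point pays the p-th power
        of its l_p distance to the nearest center (p = 2 is k-means)."""
        # ||x - c||_p^p for every (point, center) pair, shape (n, k).
        powers = (np.abs(points[:, None, :] - centers[None, :, :]) ** p).sum(axis=2)
        # Each point is charged to its nearest center.
        return float(powers.min(axis=1).sum())
    ```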