Aichholzer, Oswin; Wang, Haitao
(Ed.)
The ℓ₂² min-sum k-clustering problem is to partition an input set into clusters C_1,…,C_k to minimize ∑_{i=1}^k ∑_{p,q ∈ C_i} ‖p−q‖₂². Although ℓ₂² min-sum k-clustering is NP-hard, it is not known whether it is NP-hard to approximate ℓ₂² min-sum k-clustering beyond a certain factor. In this paper, we give the first hardness-of-approximation result for the ℓ₂² min-sum k-clustering problem. We show that it is NP-hard to approximate the objective to a factor better than 1.056 and moreover, assuming a balanced variant of the Johnson Coverage Hypothesis, it is NP-hard to approximate the objective to a factor better than 1.327. We then complement our hardness result by giving a fast PTAS for ℓ₂² min-sum k-clustering. Specifically, our algorithm runs in time O(n^{1+o(1)} d · 2^{(k/ε)^{O(1)}}), which is the first nearly linear time algorithm for this problem. We also consider a learning-augmented setting, where the algorithm has access to an oracle that outputs a label i ∈ [k] for each input point, thereby implicitly partitioning the input dataset into k clusters that induce an approximately optimal solution, up to some amount of adversarial error α ∈ [0, 1/2). We give a polynomial-time algorithm that outputs a (1+γα)/(1−α)²-approximation to ℓ₂² min-sum k-clustering, for a fixed constant γ > 0.
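For readers who want a concrete handle on the objective in the abstract, the following minimal sketch evaluates the ℓ₂² min-sum cost of a given partition. The function name and the toy data are illustrative, not from the paper; the double sum over p, q ∈ C_i is taken over ordered pairs, so each unordered pair is counted twice, matching the formula as written.

```python
import numpy as np

def min_sum_cost(points, labels, k):
    """Evaluate sum_{i=1}^k sum_{p,q in C_i} ||p - q||_2^2 for a
    given partition (labels in [0, k)); ordered pairs, so each
    unordered pair contributes twice."""
    total = 0.0
    for i in range(k):
        C = points[labels == i]
        if len(C) == 0:
            continue
        # All pairwise squared Euclidean distances within cluster i,
        # via broadcasting: (|C| x |C|) matrix of ||p - q||^2 values.
        sq_dists = ((C[:, None, :] - C[None, :, :]) ** 2).sum(axis=-1)
        total += sq_dists.sum()
    return total

# Two well-separated pairs on a line; the natural 2-clustering
# pays 2 * 1^2 per cluster (each pair counted in both orders).
pts = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]])
labels = np.array([0, 0, 1, 1])
print(min_sum_cost(pts, labels, 2))  # 4.0
```

Note that, unlike k-means, this objective charges every within-cluster pair rather than distances to a centroid, which is why large clusters are penalized quadratically in their size.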