Performance of Johnson-Lindenstrauss transform for k-means and k-medians clustering

Makarychev, Konstantin; Makarychev, Yury; Razenshteyn, Ilya

doi:10.1145/3313276.3316350

Citation Details

Performance of Johnson-Lindenstrauss transform for k-means and k-medians clustering

Consider an instance of Euclidean k-means or k-medians clustering. We show that the cost of the optimal solution is preserved up to a factor of (1+ε) under a projection onto a random O(log(k /ε) / ε2)-dimensional subspace. Further, the cost of every clustering is preserved within (1+ε). More generally, our result applies to any dimension reduction map satisfying a mild sub-Gaussian-tail condition. Our bound on the dimension is nearly optimal. Additionally, our result applies to Euclidean k-clustering with the distances raised to the p-th power for any constant p. For k-means, our result resolves an open problem posed by Cohen, Elder, Musco, Musco, and Persu (STOC 2015); for k-medians, it answers a question raised by Kannan. more »

Award ID(s):: 1718820

NSF-PAR ID:: 10107373

Author(s) / Creator(s):: Makarychev, Konstantin; Makarychev, Yury; Razenshteyn, Ilya

Date Published:: 2019-06-01

Journal Name:: Performance of Johnson-Lindenstrauss transform for k-means and k-medians clustering

Page Range / eLocation ID:: 1027-1038

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Conference Paper:
https://doi.org/10.1145/3313276.3316350

More Like this