The k-means problem is a classic objective for modeling clustering in a metric space. Given a set of points in a metric space, the goal is to find k representative points so as to minimize the sum of the squared distances from each point to its closest representative. In this work, we study the approximability of k-means in Euclidean spaces parameterized by the number of clusters, k. In seminal works, de la Vega, Karpinski, Kenyon, and Rabani STOC'03 and Kumar, Sabharwal, and Sen JACM'10 showed how to obtain a (1+ε) -approximation for high-dimensional Euclidean k-means in time 2^ (k/ε) O (1) ⋅ dnO (1). In this work, we introduce a new fine-grained hypothesis called Exponential Time for Expanders Hypothesis (XXH) which roughly asserts that there are no non-trivial exponential time approximation algorithms for the vertex cover problem on near perfect vertex expanders. Assuming XXH, we close the above long line of work on approximating Euclidean k-means by showing that there is no 2^ (k/ε) ^{1-o (1) } ⋅ nO (1) time algorithm achieving a (1+ε) -approximation for k-means in Euclidean space. This lower bound is tight as it matches the algorithm given by Feldman, Monemizadeh, and Sohler SoCG'07 whose runtime is 2O (k/ε) + O (ndk). Furthermore, assuming XXH, we show that the seminal O (n^kd+1) runtime exact algorithm of Inaba, Katoh, and Imai SoCG'94 for k-means is optimal for small values of k.
Cohen-Addad et al. (Thu,) studied this question.