1-mean and 1-medoid 2-clustering problem with arbitrary cluster sizes: complexity and approximation


Artem V Pyatkin




We consider the following 2-clustering problem. Given N points in Euclidean space, partition it into two subsets (clusters) so that the sum of squared distances between the elements of the clusters and their centers would be minimum. The center of the first cluster coincides with its centroid (mean) while the center of the second cluster should be chosen from the set of the initial points (medoid). It is known that this problem is NP-hard if the cardinalities of the clusters are given as a part of the input. In this paper we prove that the problem remains NP-hard in the case of arbitrary clusters sizes and suggest a 2-approximation polynomial-time algorithm for this problem.