Abstract
We study the problem of preclustering a set B of imprecise points in ℝ^d: we wish to cluster the regions specifying the potential locations of the points such that, no matter where the points are located within their regions, the resulting clustering approximates the optimal clustering for those locations. We consider k-center, k-median, and k-means clustering, and obtain the following results.
Let B:={b₁,…,b_n} be a collection of disjoint balls in ℝ^d, where each ball b_i specifies the possible locations of an input point p_i. A partition 𝒞 of B into subsets is called an (f(k),α)-preclustering (with respect to the specific k-clustering variant under consideration) if (i) 𝒞 consists of f(k) preclusters, and (ii) for any realization P of the points p_i inside their respective balls, the cost of the clustering on P induced by 𝒞 is at most α times the cost of an optimal k-clustering on P. We call f(k) the size of the preclustering and we call α its approximation ratio. We prove that, even in ℝ^1, one may need at least 3k−3 preclusters to obtain a bounded approximation ratio (this holds for the k-center, the k-median, and the k-means problem), and we present a (3k,1)-preclustering for the k-center problem in ℝ^1. We also present various preclusterings for balls in ℝ^d with d⩾2, including a (3k,α)-preclustering with α≈13.9 for the k-center and the k-median problem, and α≈254.7 for the k-means problem.
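To make condition (ii) of the definition concrete, the following is a minimal Python sketch for the k-center variant in ℝ^1. It is not the algorithm from the paper. The function names are hypothetical, and the cost model is an assumption: each precluster is treated as one cluster served by its own best center, so its radius in ℝ^1 is half its spread. The sketch evaluates a single realization P only, whereas the definition requires the bound to hold for every realization.

```python
from typing import Sequence


def induced_kcenter_cost(points: Sequence[float],
                         preclusters: Sequence[Sequence[int]]) -> float:
    """Cost of the clustering induced by a partition (assumed cost model).

    Each precluster is a list of point indices. In 1D the best single
    center of a group is its midpoint, so the radius is (max - min) / 2.
    """
    return max(
        (max(points[i] for i in group) - min(points[i] for i in group)) / 2.0
        for group in preclusters
    )


def optimal_kcenter_cost_1d(points: Sequence[float], k: int) -> float:
    """Optimal k-center cost on the line via dynamic programming.

    In 1D an optimal clustering can use groups that are contiguous in
    sorted order, so it suffices to try every split point.
    """
    xs = sorted(points)
    n = len(xs)
    inf = float("inf")
    # best[j][i]: minimal max-radius for the first i sorted points
    # using at most j clusters.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    for j in range(k + 1):
        best[j][0] = 0.0
    for j in range(1, k + 1):
        for i in range(1, n + 1):
            for s in range(i):  # the last cluster is xs[s..i-1]
                radius = (xs[i - 1] - xs[s]) / 2.0
                best[j][i] = min(best[j][i], max(best[j - 1][s], radius))
    return best[k][n]


# One realization P of four imprecise points, and two candidate partitions.
P = [0.0, 1.0, 10.0, 11.0]
good_partition = [[0, 1], [2, 3]]
bad_partition = [[0], [1, 2], [3]]
```

For this realization and k = 2, comparing `induced_kcenter_cost(P, partition)` with `optimal_kcenter_cost_1d(P, 2)` gives the ratio that α must bound; the first partition matches the optimum, while the second does not.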
BibTeX Entry
@InProceedings{abam_et_al:LIPIcs:2020:12250,
author = {Mohammad Ali Abam and Mark de Berg and Sina Farahzad and Mir Omid Haji Mirsadeghi and Morteza Saghafian},
title = {{Preclustering Algorithms for Imprecise Points}},
booktitle = {17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020)},
pages = {3:1--3:12},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-150-4},
ISSN = {1868-8969},
year = {2020},
volume = {162},
editor = {Susanne Albers},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2020/12250},
URN = {urn:nbn:de:0030-drops-122503},
doi = {10.4230/LIPIcs.SWAT.2020.3},
annote = {Keywords: Geometric clustering, k-center, k-means, k-median, imprecise points, approximation algorithms}
}
Keywords: 

Geometric clustering, k-center, k-means, k-median, imprecise points, approximation algorithms
Collection: 

17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020) 
Issue Date: 

2020 
Date of publication: 

12.06.2020 