License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ESA.2023.55
URN: urn:nbn:de:0030-drops-187080
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2023/18708/
Grunau, Christoph; Özüdoğru, Ahmet Alper; Rozhoň, Václav
Noisy k-Means++ Revisited
Abstract
The k-means++ algorithm by Arthur and Vassilvitskii [SODA 2007] is a classical and time-tested algorithm for the k-means problem. While being very practical, the algorithm also has good theoretical guarantees: its solution is O(log k)-approximate, in expectation.
In a recent work, Bhattacharya, Eube, Röglin, and Schmidt [ESA 2020] considered the following question: does the algorithm retain its guarantees if we allow for slight adversarial noise in the sampling probability distributions used by the algorithm? This is motivated, e.g., by the fact that computations with real numbers in k-means++ implementations are inexact. Surprisingly, the analysis under this scenario gets substantially more difficult, and the authors were able to prove only a weaker approximation guarantee of O(log² k). In this paper, we close the gap by providing a tight, O(log k)-approximate guarantee for the k-means++ algorithm with noise.
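To make the setting concrete, the following is a minimal Python sketch of k-means++ seeding with perturbed sampling weights. It is not taken from the paper. The function names (`noisy_kmeanspp_seeding`, `kmeans_cost`), the parameter `eps`, and the noise model are assumptions made for illustration: each D² weight is multiplied by a factor in [1 - eps, 1 + eps] before normalization, and the factors are drawn at random as a stand-in for an adversary. With `eps = 0` the sketch reduces to standard D² sampling.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def noisy_kmeanspp_seeding(points, k, eps=0.0, rng=None):
    """Choose up to k centers by D^2 sampling with perturbed weights.

    Assumed noise model (illustrative only): every D^2 weight is scaled
    by a factor in [1 - eps, 1 + eps], with 0 <= eps < 1. Random factors
    stand in for the adversary here.
    """
    rng = rng or random.Random()
    # First center: chosen uniformly at random, as in standard k-means++.
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        # Perturb the exact D^2 weights; sampling normalizes them implicitly.
        weights = [d * rng.uniform(1.0 - eps, 1.0 + eps) for d in d2]
        if sum(weights) == 0:
            break  # every point already coincides with some center
        idx = rng.choices(range(len(points)), weights=weights, k=1)[0]
        centers.append(points[idx])
        # Update nearest-center distances with the newly added center.
        d2 = [min(old, sq_dist(p, points[idx])) for old, p in zip(d2, points)]
    return centers


def kmeans_cost(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)
```

A call such as `noisy_kmeanspp_seeding(points, k, eps=0.1, rng=random.Random(0))` returns a list of centers drawn from `points`, and `kmeans_cost` evaluates the resulting solution. A random perturbation is generally much weaker than a worst-case adversary, so the sketch illustrates the noise model only and says nothing about the approximation guarantees discussed above.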
BibTeX - Entry
@InProceedings{grunau_et_al:LIPIcs.ESA.2023.55,
author = {Grunau, Christoph and \"{O}z\"{u}do\u{g}ru, Ahmet Alper and Rozho\v{n}, V\'{a}clav},
title = {{Noisy k-Means++ Revisited}},
booktitle = {31st Annual European Symposium on Algorithms (ESA 2023)},
pages = {55:1--55:7},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-295-2},
ISSN = {1868-8969},
year = {2023},
volume = {274},
editor = {G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2023/18708},
URN = {urn:nbn:de:0030-drops-187080},
doi = {10.4230/LIPIcs.ESA.2023.55},
annote = {Keywords: clustering, k-means, k-means++, adversarial noise}
}
Keywords: clustering, k-means, k-means++, adversarial noise
Collection: 31st Annual European Symposium on Algorithms (ESA 2023)
Issue Date: 2023
Date of publication: 30.08.2023