License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ESA.2019.28
URN: urn:nbn:de:0030-drops-111495
Go to the corresponding LIPIcs Volume Portal

Chen, Jiehua ; Hermelin, Danny ; Sorge, Manuel

On Computing Centroids According to the p-Norms of Hamming Distance Vectors

LIPIcs-ESA-2019-28.pdf (0.8 MB)


In this paper we consider the p-Norm Hamming Centroid problem which asks to determine whether some given strings have a centroid with a bound on the p-norm of its Hamming distances to the strings. Specifically, given a set S of strings and a real k, we consider the problem of determining whether there exists a string s^* with (sum_{s in S} d^{p}(s^*,s))^(1/p) <=k, where d(,) denotes the Hamming distance metric. This problem has important applications in data clustering and multi-winner committee elections, and is a generalization of the well-known polynomial-time solvable Consensus String (p=1) problem, as well as the NP-hard Closest String (p=infty) problem.
Our main result shows that the problem is NP-hard for all fixed rational p > 1, closing the gap for all rational values of p between 1 and infty. Under standard complexity assumptions the reduction also implies that the problem has no 2^o(n+m)-time or 2^o(k^(p/(p+1)))-time algorithm, where m denotes the number of input strings and n denotes the length of each string, for any fixed p > 1. The first bound matches a straightforward brute-force algorithm. The second bound is tight in the sense that for each fixed epsilon > 0, we provide a 2^(k^(p/((p+1))+epsilon))-time algorithm. In the last part of the paper, we complement our hardness result by presenting a fixed-parameter algorithm and a factor-2 approximation algorithm for the problem.

BibTeX - Entry

  author =	{Jiehua Chen and Danny Hermelin and Manuel Sorge},
  title =	{{On Computing Centroids According to the p-Norms of Hamming Distance Vectors}},
  booktitle =	{27th Annual European Symposium on Algorithms (ESA 2019)},
  pages =	{28:1--28:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-124-5},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{144},
  editor =	{Michael A. Bender and Ola Svensson and Grzegorz Herman},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{},
  URN =		{urn:nbn:de:0030-drops-111495},
  doi =		{10.4230/LIPIcs.ESA.2019.28},
  annote =	{Keywords: Strings, Clustering, Multiwinner Election, Hamming Distance}

Keywords: Strings, Clustering, Multiwinner Election, Hamming Distance
Collection: 27th Annual European Symposium on Algorithms (ESA 2019)
Issue Date: 2019
Date of publication: 06.09.2019

DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI