License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ESA.2021.35
URN: urn:nbn:de:0030-drops-146167
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2021/14616/


Cygan, Marek ; Kulikov, Alexander S. ; Mihajlin, Ivan ; Nikolaev, Maksim ; Reznikov, Grigory

Minimum Common String Partition: Exact Algorithms



Abstract

In the minimum common string partition problem (MCSP), one is given two strings and asked to find the minimum number of cuts in the first string such that the second string can be obtained by rearranging the resulting pieces. It is a difficult algorithmic problem with applications in computational biology, text processing, and data compression. MCSP has been studied extensively from various algorithmic angles: there are many papers on approximation, heuristic, and parameterized algorithms. At the same time, almost nothing is known about its exact complexity. In this paper, we present new results in this direction. We improve the known 2ⁿ upper bound (where n is the length of the input strings) to ϕⁿ, where ϕ ≈ 1.618 is the golden ratio. The algorithm uses Fibonacci numbers to encode subsets as monomials of a certain implicit polynomial and extracts one of its coefficients using the fast Fourier transform. Then, we show that the case of a constant-size alphabet can be solved in subexponential time 2^{O(n log log n / log n)} by a hybrid strategy: enumerate all long pieces and use dynamic programming over histograms of short pieces. Finally, we prove almost matching lower bounds assuming the Exponential Time Hypothesis.
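
To make the problem statement concrete, the following is a minimal brute-force sketch of the definition only. It is not the ϕⁿ or the subexponential algorithm from the paper; it simply tries every set of cut positions in the first string, in order of increasing size, and checks by backtracking whether the second string is a rearrangement of the resulting pieces. The function names `min_cuts` and `can_tile` are illustrative assumptions, and the running time is exponential, so it is only suitable for very short strings.

```python
from collections import Counter
from itertools import combinations


def can_tile(t, pieces):
    """Return True if t is a concatenation of all pieces (a Counter) in some order."""
    if not t:
        return not pieces  # every piece must have been used
    for p in list(pieces):
        if t.startswith(p):
            # Use one copy of p, recurse on the rest of t, then restore it.
            pieces[p] -= 1
            if pieces[p] == 0:
                del pieces[p]
            ok = can_tile(t[len(p):], pieces)
            pieces[p] += 1
            if ok:
                return True
    return False


def min_cuts(s, t):
    """Minimum number of cuts in s so that t is a rearrangement of the pieces.

    Returns None when no common partition exists (different letter counts).
    Illustrative brute force, not the algorithm from the paper.
    """
    if Counter(s) != Counter(t):
        return None
    if not s:
        return 0
    n = len(s)
    for k in range(n):  # try 0, 1, ..., n-1 cuts
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            pieces = Counter(s[a:b] for a, b in zip(bounds, bounds[1:]))
            if can_tile(t, pieces):
                return k
    return None


if __name__ == "__main__":
    print(min_cuts("abab", "baba"))
    print(min_cuts("abc", "cba"))
```

For example, cutting `abab` once into `a` and `bab` lets the pieces be rearranged into `baba`, whereas turning `abc` into `cba` requires cutting between every pair of adjacent letters.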

BibTeX - Entry

@InProceedings{cygan_et_al:LIPIcs.ESA.2021.35,
  author =	{Cygan, Marek and Kulikov, Alexander S. and Mihajlin, Ivan and Nikolaev, Maksim and Reznikov, Grigory},
  title =	{{Minimum Common String Partition: Exact Algorithms}},
  booktitle =	{29th Annual European Symposium on Algorithms (ESA 2021)},
  pages =	{35:1--35:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-204-4},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{204},
  editor =	{Mutzel, Petra and Pagh, Rasmus and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2021/14616},
  URN =		{urn:nbn:de:0030-drops-146167},
  doi =		{10.4230/LIPIcs.ESA.2021.35},
  annote =	{Keywords: similarity measure, string distance, exact algorithms, upper bounds, lower bounds}
}

Keywords: similarity measure, string distance, exact algorithms, upper bounds, lower bounds
Collection: 29th Annual European Symposium on Algorithms (ESA 2021)
Issue Date: 2021
Date of publication: 31.08.2021

