License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ESA.2018.33
URN: urn:nbn:de:0030-drops-94963
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2018/9496/


Ganczorz, Michal ; Gawrychowski, Pawel ; Jez, Artur ; Kociumaka, Tomasz

Edit Distance with Block Operations

pdf-format:
LIPIcs-ESA-2018-33.pdf (0.4 MB)


Abstract

We consider the problem of edit distance in which block operations are allowed, i.e., we ask for the minimal number of (block) operations needed to transform a string s into a string t. We give O(log n) approximation algorithms, where n is the total length of the input strings, for the variants of the problem that allow the following sets of operations: block move; block move and block delete; block move and block copy; block move, block copy, and block uncopy. The results still hold if we additionally allow any of the following operations: character insert, character delete, block reversal, or block involution (involution is a generalisation of reversal). Previously, algorithms were known only for the first and last variants, and they had approximation ratios O(log n log^* n) and O(log n (log^* n)^2), respectively. The edit distance with block moves is equivalent, up to a constant factor, to the common string partition problem, in which we are given two strings s, t and the goal is to partition s into a minimal number of parts such that they can be permuted to obtain t. Thus we also obtain an O(log n) approximation for this problem (compared to the previous O(log n log^* n)).
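To make the common string partition problem concrete, here is a minimal brute-force sketch for very short strings. It is not the approximation algorithm of the paper: it simply tries every way of cutting s into k blocks, for increasing k, and checks whether the blocks can be reordered to spell t. The function names `can_tile` and `min_common_partition` are hypothetical, and the running time is exponential in the length of s.

```python
from collections import Counter
from itertools import combinations


def can_tile(t, blocks):
    """Return True if t is a concatenation of all blocks (a Counter), each used once."""
    if not t:
        return sum(blocks.values()) == 0
    for b in list(blocks):
        if blocks[b] > 0 and t.startswith(b):
            blocks[b] -= 1                      # use this block at the front of t
            ok = can_tile(t[len(b):], blocks)
            blocks[b] += 1                      # undo the choice (backtracking)
            if ok:
                return True
    return False


def min_common_partition(s, t):
    """Smallest number of blocks of s that can be permuted to give t, or None."""
    if Counter(s) != Counter(t):
        return None                             # different letters: no partition exists
    n = len(s)
    if n == 0:
        return 0
    for k in range(1, n + 1):                   # try partitions with k blocks
        for cuts in combinations(range(1, n), k - 1):
            bounds = (0, *cuts, n)
            blocks = Counter(s[i:j] for i, j in zip(bounds, bounds[1:]))
            if can_tile(t, blocks):
                return k
    return None
```

For example, `min_common_partition("abcab", "ababc")` considers the cut `abc|ab`, whose two blocks can be swapped to spell `ababc`, so the answer for this pair is 2.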
These results rely on a simplification of locally consistent parsing, the previously used technique that groups short substrings of a string into phrases so that similar substrings are guaranteed to be grouped in a similar way. Instead of a sophisticated parsing technique relying on deterministic coin tossing, we use a simple one based on a partition of the alphabet into two subalphabets. In particular, this lowers the running time from O(n log^* n) to O(n). The new algorithms (for block copy or block delete) follow a similar scheme, but their analysis is based on a specially tuned combinatorial function on sets of numbers.
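The abstract does not spell out the parsing itself, so the following is only an illustrative sketch of one way a parsing based on two subalphabets could work. It assumes that a single round replaces every adjacent pair ab, with a in a "left" subalphabet and b in a "right" subalphabet, by one phrase. The function `compress_pairs` and the way the partition is passed in are assumptions made for illustration, not the authors' procedure.

```python
def compress_pairs(seq, left, right):
    """One parsing round (assumed scheme): group each adjacent pair (a, b)
    with a in `left` and b in `right` into a single phrase, here a tuple.

    `left` and `right` must be disjoint, so candidate pairs never overlap
    and the result depends only on each symbol's immediate neighbours.
    """
    assert not (set(left) & set(right)), "subalphabets must be disjoint"
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and seq[i] in left and seq[i + 1] in right:
            out.append((seq[i], seq[i + 1]))    # pair becomes one phrase
            i += 2
        else:
            out.append(seq[i])                  # symbol stays as it is
            i += 1
    return out
```

With `left = {"a"}` and `right = {"b", "c"}`, both `"abcab"` and `"ababc"` have every occurrence of `ab` grouped into the same phrase `("a", "b")`, which illustrates the sense in which equal substrings are parsed alike regardless of where they occur.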

BibTeX - Entry

@InProceedings{ganczorz_et_al:LIPIcs:2018:9496,
  author =	{Michal Ganczorz and Pawel Gawrychowski and Artur Jez and Tomasz Kociumaka},
  title =	{{Edit Distance with Block Operations}},
  booktitle =	{26th Annual European Symposium on Algorithms (ESA 2018)},
  pages =	{33:1--33:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-081-1},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{112},
  editor =	{Yossi Azar and Hannah Bast and Grzegorz Herman},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2018/9496},
  URN =		{urn:nbn:de:0030-drops-94963},
  doi =		{10.4230/LIPIcs.ESA.2018.33},
  annote =	{Keywords: Edit distance, Block operations, Common string partition}
}

Keywords: Edit distance, Block operations, Common string partition
Collection: 26th Annual European Symposium on Algorithms (ESA 2018)
Issue Date: 2018
Date of publication: 14.08.2018

