License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2016.21
URN: urn:nbn:de:0030-drops-60720
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2016/6072/
Go to the corresponding LIPIcs Volume Portal


Starikovskaya, Tatiana

Longest Common Substring with Approximately k Mismatches

pdf-format:
LIPIcs-CPM-2016-21.pdf (0.4 MB)


Abstract

In the longest common substring problem we are given two strings of length n and must find a substring of maximal length that occurs in both strings. It is well-known that the problem can be solved in linear time, but the solution is not robust and can vary greatly when the input strings are changed even by one letter. To circumvent this, Leimester and Morgenstern introduced the problem of the longest common substring with k mismatches. Lately, this problem has received a lot of attention in the literature, and several algorithms have been suggested. The running time of these algorithms is n^{2-o(1)}, and unfortunately, conditional lower bounds have been shown which imply that there is little hope to improve this bound.

In this paper we study a different but closely related problem of the longest common substring with approximately k mismatches and use computational geometry techniques to show that it admits a randomised solution with strongly subquadratic running time.

BibTeX - Entry

@InProceedings{starikovskaya:LIPIcs:2016:6072,
  author =	{Tatiana Starikovskaya},
  title =	{{Longest Common Substring with Approximately k Mismatches}},
  booktitle =	{27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)},
  pages =	{21:1--21:11},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-012-5},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{54},
  editor =	{Roberto Grossi and Moshe Lewenstein},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2016/6072},
  URN =		{urn:nbn:de:0030-drops-60720},
  doi =		{10.4230/LIPIcs.CPM.2016.21},
  annote =	{Keywords: Randomised algorithms, string similarity measures, longest common substring, sketching, locality-sensitive hashing}
}

Keywords: Randomised algorithms, string similarity measures, longest common substring, sketching, locality-sensitive hashing
Collection: 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)
Issue Date: 2016
Date of publication: 27.06.2016


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI