License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2020.14
URN: urn:nbn:de:0030-drops-121393
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2020/12139/
Go to the corresponding LIPIcs Volume Portal


Gawrychowski, Paweł ; Ghazawi, Samah ; Landau, Gad M.

On Indeterminate Strings Matching

pdf-format:
LIPIcs-CPM-2020-14.pdf (1.0 MB)


Abstract

Given two indeterminate equal-length strings p and t with a set of characters per position in both strings, we obtain a determinate string p_w from p and a determinate string t_w from t by choosing one character per position. Then, we say that p and t match when p_w and t_w match for some choice of the characters. While the most standard notion of a match for determinate strings is that they are simply identical, in certain applications it is more appropriate to use other definitions, with the prime examples being parameterized matching, order-preserving matching, and the recently introduced Cartesian tree matching. We provide a systematic study of the complexity of string matching for indeterminate equal-length strings, for different notions of matching. We use n to denote the length of both strings, and r to be an upper-bound on the number of uncertain characters per position. First, we provide the first polynomial time algorithm for the Cartesian tree version that runs in deterministic ?(nlog² n) and expected ?(nlog nlog log n) time using ?(nlog n) space, for constant r. Second, we establish NP-hardness of the order-preserving version for r=2, thus solving a question explicitly stated by Henriques et al. [CPM 2018], who showed hardness for r=3. Third, we establish NP-hardness of the parameterized version for r=2. As both parameterized and order-preserving indeterminate matching reduce to the standard determinate matching for r=1, this provides a complete classification for these three variants.

BibTeX - Entry

@InProceedings{gawrychowski_et_al:LIPIcs:2020:12139,
  author =	{Paweł Gawrychowski and Samah Ghazawi and Gad M. Landau},
  title =	{{On Indeterminate Strings Matching}},
  booktitle =	{31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
  pages =	{14:1--14:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-149-8},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{161},
  editor =	{Inge Li G{\o}rtz and Oren Weimann},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2020/12139},
  URN =		{urn:nbn:de:0030-drops-121393},
  doi =		{10.4230/LIPIcs.CPM.2020.14},
  annote =	{Keywords: string matching, indeterminate strings, Cartesian trees, order-preserving matching, parameterized matching}
}

Keywords: string matching, indeterminate strings, Cartesian trees, order-preserving matching, parameterized matching
Collection: 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)
Issue Date: 2020
Date of publication: 09.06.2020


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI