License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2020.14
URN: urn:nbn:de:0030-drops-121393
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2020/12139/
Gawrychowski, Paweł ;
Ghazawi, Samah ;
Landau, Gad M.
On Indeterminate Strings Matching
Abstract
Given two indeterminate equal-length strings p and t with a set of characters per position in both strings, we obtain a determinate string p_w from p and a determinate string t_w from t by choosing one character per position. Then, we say that p and t match when p_w and t_w match for some choice of the characters. While the most standard notion of a match for determinate strings is that they are simply identical, in certain applications it is more appropriate to use other definitions, with the prime examples being parameterized matching, order-preserving matching, and the recently introduced Cartesian tree matching. We provide a systematic study of the complexity of string matching for indeterminate equal-length strings, for different notions of matching. We use n to denote the length of both strings, and r to be an upper-bound on the number of uncertain characters per position. First, we provide the first polynomial time algorithm for the Cartesian tree version that runs in deterministic ?(nlog² n) and expected ?(nlog nlog log n) time using ?(nlog n) space, for constant r. Second, we establish NP-hardness of the order-preserving version for r=2, thus solving a question explicitly stated by Henriques et al. [CPM 2018], who showed hardness for r=3. Third, we establish NP-hardness of the parameterized version for r=2. As both parameterized and order-preserving indeterminate matching reduce to the standard determinate matching for r=1, this provides a complete classification for these three variants.
BibTeX - Entry
@InProceedings{gawrychowski_et_al:LIPIcs:2020:12139,
author = {Paweł Gawrychowski and Samah Ghazawi and Gad M. Landau},
title = {{On Indeterminate Strings Matching}},
booktitle = {31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
pages = {14:1--14:14},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-149-8},
ISSN = {1868-8969},
year = {2020},
volume = {161},
editor = {Inge Li G{\o}rtz and Oren Weimann},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2020/12139},
URN = {urn:nbn:de:0030-drops-121393},
doi = {10.4230/LIPIcs.CPM.2020.14},
annote = {Keywords: string matching, indeterminate strings, Cartesian trees, order-preserving matching, parameterized matching}
}
Keywords: |
|
string matching, indeterminate strings, Cartesian trees, order-preserving matching, parameterized matching |
Collection: |
|
31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020) |
Issue Date: |
|
2020 |
Date of publication: |
|
09.06.2020 |