Abstract
Given two indeterminate equallength strings p and t with a set of characters per position in both strings, we obtain a determinate string p_w from p and a determinate string t_w from t by choosing one character per position. Then, we say that p and t match when p_w and t_w match for some choice of the characters. While the most standard notion of a match for determinate strings is that they are simply identical, in certain applications it is more appropriate to use other definitions, with the prime examples being parameterized matching, orderpreserving matching, and the recently introduced Cartesian tree matching. We provide a systematic study of the complexity of string matching for indeterminate equallength strings, for different notions of matching. We use n to denote the length of both strings, and r to be an upperbound on the number of uncertain characters per position. First, we provide the first polynomial time algorithm for the Cartesian tree version that runs in deterministic ?(nlog² n) and expected ?(nlog nlog log n) time using ?(nlog n) space, for constant r. Second, we establish NPhardness of the orderpreserving version for r=2, thus solving a question explicitly stated by Henriques et al. [CPM 2018], who showed hardness for r=3. Third, we establish NPhardness of the parameterized version for r=2. As both parameterized and orderpreserving indeterminate matching reduce to the standard determinate matching for r=1, this provides a complete classification for these three variants.
BibTeX  Entry
@InProceedings{gawrychowski_et_al:LIPIcs:2020:12139,
author = {Paweł Gawrychowski and Samah Ghazawi and Gad M. Landau},
title = {{On Indeterminate Strings Matching}},
booktitle = {31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
pages = {14:114:14},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959771498},
ISSN = {18688969},
year = {2020},
volume = {161},
editor = {Inge Li G{\o}rtz and Oren Weimann},
publisher = {Schloss DagstuhlLeibnizZentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2020/12139},
URN = {urn:nbn:de:0030drops121393},
doi = {10.4230/LIPIcs.CPM.2020.14},
annote = {Keywords: string matching, indeterminate strings, Cartesian trees, orderpreserving matching, parameterized matching}
}
Keywords: 

string matching, indeterminate strings, Cartesian trees, orderpreserving matching, parameterized matching 
Collection: 

31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020) 
Issue Date: 

2020 
Date of publication: 

09.06.2020 