License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2022.14
URN: urn:nbn:de:0030-drops-161414
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2022/16141/
Go to the corresponding LIPIcs Volume Portal


Oizumi, Tsubasa ; Kai, Takeshi ; Mieno, Takuya ; Inenaga, Shunsuke ; Arimura, Hiroki

Cartesian Tree Subsequence Matching

pdf-format:
LIPIcs-CPM-2022-14.pdf (1 MB)


Abstract

Park et al. [TCS 2020] observed that the similarity between two (numerical) strings can be captured by the Cartesian trees: The Cartesian tree of a string is a binary tree recursively constructed by picking up the smallest value of the string as the root of the tree. Two strings of equal length are said to Cartesian-tree match if their Cartesian trees are isomorphic. Park et al. [TCS 2020] introduced the following Cartesian tree substring matching (CTMStr) problem: Given a text string T of length n and a pattern string of length m, find every consecutive substring S = T[i..j] of a text string T such that S and P Cartesian-tree match. They showed how to solve this problem in Õ(n+m) time. In this paper, we introduce the Cartesian tree subsequence matching (CTMSeq) problem, that asks to find every minimal substring S = T[i..j] of T such that S contains a subsequence S' which Cartesian-tree matches P. We prove that the CTMSeq problem can be solved efficiently, in O(m n p(n)) time, where p(n) denotes the update/query time for dynamic predecessor queries. By using a suitable dynamic predecessor data structure, we obtain O(mn log log n)-time and O(n log m)-space solution for CTMSeq. This contrasts CTMSeq with closely related order-preserving subsequence matching (OPMSeq) which was shown to be NP-hard by Bose et al. [IPL 1998].

BibTeX - Entry

@InProceedings{oizumi_et_al:LIPIcs.CPM.2022.14,
  author =	{Oizumi, Tsubasa and Kai, Takeshi and Mieno, Takuya and Inenaga, Shunsuke and Arimura, Hiroki},
  title =	{{Cartesian Tree Subsequence Matching}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{14:1--14:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2022/16141},
  URN =		{urn:nbn:de:0030-drops-161414},
  doi =		{10.4230/LIPIcs.CPM.2022.14},
  annote =	{Keywords: string algorithms, pattern matching, Cartesian tree subsequence matching, order preserving matching, episode matching}
}

Keywords: string algorithms, pattern matching, Cartesian tree subsequence matching, order preserving matching, episode matching
Collection: 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)
Issue Date: 2022
Date of publication: 22.06.2022


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI