License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2021.20
URN: urn:nbn:de:0030-drops-139717
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2021/13971/
Go to the corresponding LIPIcs Volume Portal


Nellore, Abhinav ; Nguyen, Austin ; Thompson, Reid F.

An Invertible Transform for Efficient String Matching in Labeled Digraphs

pdf-format:
LIPIcs-CPM-2021-20.pdf (2 MB)


Abstract

Let G = (V, E) be a digraph where each vertex is unlabeled, each edge is labeled by a character in some alphabet Ω, and any two edges with both the same head and the same tail have different labels. The powerset construction gives a transform of G into a weakly connected digraph G' = (V', E') that enables solving the decision problem of whether there is a walk in G matching an arbitrarily long query string q in time linear in |q| and independent of |E| and |V|. We show G is uniquely determined by G' when for every v_? ∈ V, there is some distinct string s_? on Ω such that v_? is the origin of a closed walk in G matching s_?, and no other walk in G matches s_? unless it starts and ends at v_?. We then exploit this invertibility condition to strategically alter any G so its transform G' enables retrieval of all t terminal vertices of walks in the unaltered G matching q in O(|q| + t log |V|) time. We conclude by proposing two defining properties of a class of transforms that includes the Burrows-Wheeler transform and the transform presented here.

BibTeX - Entry

@InProceedings{nellore_et_al:LIPIcs.CPM.2021.20,
  author =	{Nellore, Abhinav and Nguyen, Austin and Thompson, Reid F.},
  title =	{{An Invertible Transform for Efficient String Matching in Labeled Digraphs}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{20:1--20:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2021/13971},
  URN =		{urn:nbn:de:0030-drops-139717},
  doi =		{10.4230/LIPIcs.CPM.2021.20},
  annote =	{Keywords: pattern matching, string matching, Burrows-Wheeler transform, labeled graphs}
}

Keywords: pattern matching, string matching, Burrows-Wheeler transform, labeled graphs
Collection: 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)
Issue Date: 2021
Date of publication: 30.06.2021


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI