License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2022.3
URN: urn:nbn:de:0030-drops-161300
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2022/16130/
Go to the corresponding LIPIcs Volume Portal


Thankachan, Sharma V.

Compact Text Indexing for Advanced Pattern Matching Problems: Parameterized, Order-Isomorphic, 2D, etc. (Invited Talk)

pdf-format:
LIPIcs-CPM-2022-3.pdf (0.3 MB)


Abstract

In the past two decades, we have witnessed the design of various compact data structures for pattern matching over an indexed text [Navarro, 2016]. Popular indexes like the FM-index [Paolo Ferragina and Giovanni Manzini, 2005], compressed suffix arrays/trees [Roberto Grossi and Jeffrey Scott Vitter, 2005; Kunihiko Sadakane, 2007], the recent r-index [Travis Gagie et al., 2020; Takaaki Nishimoto and Yasuo Tabei, 2021], etc., capture the key functionalities of classic suffix arrays/trees [Udi Manber and Eugene W. Myers, 1993; Peter Weiner, 1973] in compact space. Mostly, they rely on the Burrows-Wheeler Transform (BWT) and its associated operations [Burrows and Wheeler, 1994]. However, compactly encoding some advanced suffix tree (ST) variants, like parameterized ST [Brenda S. Baker, 1993; S. Rao Kosaraju, 1995; Juan Mendivelso et al., 2020], order-isomorphic/preserving ST [Maxime Crochemore et al., 2016], two-dimensional ST [Raffaele Giancarlo, 1995; Dong Kyue Kim et al., 1998], etc. [Sung Gwan Park et al., 2019; Tetsuo Shibuya, 2000]- collectively known as suffix trees with missing suffix links [Richard Cole and Ramesh Hariharan, 2003], has been challenging. The previous techniques are not easily extendable because these variants do not hold some structural properties of the standard ST that enable compression. However, some limited progress has been made in these directions recently [Arnab Ganguly et al., 2017; Travis Gagie et al., 2017; Gianni Decaroli et al., 2017; Dhrumil Patel and Rahul Shah, 2021; Arnab Ganguly et al., 2021; Sung{-}Hwan Kim and Hwan{-}Gue Cho, 2021; Sung{-}Hwan Kim and Hwan{-}Gue Cho, 2021; Arnab Ganguly et al., 2017; Arnab Ganguly et al., 2022; Arnab Ganguly et al., 2021]. This talk will briefly survey them and highlight some interesting open problems.

BibTeX - Entry

@InProceedings{thankachan:LIPIcs.CPM.2022.3,
  author =	{Thankachan, Sharma V.},
  title =	{{Compact Text Indexing for Advanced Pattern Matching Problems: Parameterized, Order-Isomorphic, 2D, etc.}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{3:1--3:3},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2022/16130},
  URN =		{urn:nbn:de:0030-drops-161300},
  doi =		{10.4230/LIPIcs.CPM.2022.3},
  annote =	{Keywords: Text Indexing, Suffix Trees, String Matching}
}

Keywords: Text Indexing, Suffix Trees, String Matching
Collection: 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)
Issue Date: 2022
Date of publication: 22.06.2022


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI