License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2016.2
URN: urn:nbn:de:0030-drops-60736
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2016/6073/
Go to the corresponding LIPIcs Volume Portal


Ganguly, Arnab ; Hon, Wing-Kai ; Sadakane, Kunihiko ; Shah, Rahul ; Thankachan, Sharma V. ; Yang, Yilin

Space-Efficient Dictionaries for Parameterized and Order-Preserving Pattern Matching

pdf-format:
LIPIcs-CPM-2016-2.pdf (0.5 MB)


Abstract

Let S and S' be two strings of the same length.We consider the following two variants of string matching.


* Parameterized Matching: The characters of S and S' are partitioned into static characters and parameterized characters.
The strings are parameterized match iff the static characters match exactly and there exists a one-to-one function which renames the parameterized characters in S to those in S'.

* Order-Preserving Matching: The strings are order-preserving match iff for any two integers i,j in [1,|S|], S[i] <= S[j] iff S'[i] <= S'[j].

Let P be a collection of d patterns {P_1, P_2, ..., P_d} of total length n characters, which are chosen from an alphabet Sigma.
Given a text T, also over Sigma, we consider the dictionary indexing problem under the above definitions of string matching.
Specifically, the task is to index P, such that we can report all positions j where at least one of the patterns P_i in P is a parameterized-match (resp. order-preserving match) with the same-length substring of $T$ starting at j. Previous best-known indexes occupy O(n * log(n)) bits and can report all occ positions in O(|T| * log(|Sigma|) + occ) time. We present space-efficient indexes that occupy O(n * log(|Sigma|+d) * log(n)) bits and reports all occ positions in O(|T| * (log(|Sigma|) + log_{|Sigma|}(n)) + occ) time for parameterized matching and in O(|T| * log(n) + occ) time for order-preserving matching.

BibTeX - Entry

@InProceedings{ganguly_et_al:LIPIcs:2016:6073,
  author =	{Arnab Ganguly and Wing-Kai Hon and Kunihiko Sadakane and Rahul Shah and Sharma V. Thankachan and Yilin Yang},
  title =	{{Space-Efficient Dictionaries for Parameterized and Order-Preserving Pattern Matching}},
  booktitle =	{27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)},
  pages =	{2:1--2:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-012-5},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{54},
  editor =	{Roberto Grossi and Moshe Lewenstein},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2016/6073},
  URN =		{urn:nbn:de:0030-drops-60736},
  doi =		{10.4230/LIPIcs.CPM.2016.2},
  annote =	{Keywords: Parameterized Matching, Order-preserving Matching, Dictionary Indexing, Aho-Corasick Automaton, Sparsification}
}

Keywords: Parameterized Matching, Order-preserving Matching, Dictionary Indexing, Aho-Corasick Automaton, Sparsification
Collection: 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)
Issue Date: 2016
Date of publication: 27.06.2016


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI