License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2020.26
URN: urn:nbn:de:0030-drops-121512
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2020/12151/
Nakashima, Katsuhito;
Fujisato, Noriki;
Hendrian, Diptarama;
Nakashima, Yuto;
Yoshinaka, Ryo;
Inenaga, Shunsuke;
Bannai, Hideo;
Shinohara, Ayumi;
Takeda, Masayuki
DAWGs for Parameterized Matching: Online Construction and Related Indexing Structures
Abstract
Two strings x and y over Σ ∪ Π of equal length are said to parameterized match (p-match) if there is a renaming bijection f: Σ ∪ Π → Σ ∪ Π that is the identity on Σ and transforms x to y (or vice versa). The p-matching problem is to look for substrings in a text that p-match a given pattern. In this paper, we propose parameterized suffix automata (p-suffix automata) and parameterized directed acyclic word graphs (PDAWGs), which are the p-matching versions of suffix automata and DAWGs. While suffix automata and DAWGs are equivalent for standard strings, we show that p-suffix automata can have Θ(n²) nodes and edges, whereas PDAWGs have only O(n) nodes and edges, where n is the length of the input string. We also give an O(n |Π| log (|Π| + |Σ|))-time, O(n)-space algorithm that builds the PDAWG in a left-to-right online manner. As a byproduct, it is shown that the parameterized suffix tree for the reversed string can also be built in the same time and space, in a right-to-left online manner.
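To make the definition of p-matching concrete, the following is a minimal Python sketch that checks the definition directly and then scans a text naively. It is not the PDAWG construction from the paper, and it makes no attempt at the stated time bound. The function names `p_match` and `p_match_positions` and the argument `params` (the parameter alphabet Π, with every other symbol treated as belonging to Σ) are assumptions introduced for illustration only.

```python
def p_match(x, y, params):
    """Return True if x and y p-match, where `params` is the parameter alphabet.

    Symbols outside `params` are treated as static and must agree exactly.
    Parameter symbols must be related by a one-to-one renaming, which can
    always be extended to a bijection on a finite parameter alphabet.
    """
    if len(x) != len(y):
        return False
    forward = {}   # parameter symbol in x -> parameter symbol in y
    backward = {}  # parameter symbol in y -> parameter symbol in x
    for a, b in zip(x, y):
        a_is_param = a in params
        b_is_param = b in params
        if a_is_param != b_is_param:
            return False  # a static symbol cannot be renamed to a parameter
        if not a_is_param:
            if a != b:
                return False  # the renaming is the identity on static symbols
            continue
        # Both are parameters: the renaming must be consistent in both directions.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


def p_match_positions(text, pattern, params):
    """Naively list start positions of substrings of text that p-match pattern."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if p_match(text[i:i + m], pattern, params)
    ]
```

For instance, taking Π = {x, y, z} and treating a and b as static symbols, "xayx" and "yazy" p-match under the renaming x → y, y → z, while "xay" and "yay" do not, because two distinct parameters would both have to be renamed to y.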
BibTeX - Entry
@InProceedings{nakashima_et_al:LIPIcs:2020:12151,
author = {Katsuhito Nakashima and Noriki Fujisato and Diptarama Hendrian and Yuto Nakashima and Ryo Yoshinaka and Shunsuke Inenaga and Hideo Bannai and Ayumi Shinohara and Masayuki Takeda},
title = {{DAWGs for Parameterized Matching: Online Construction and Related Indexing Structures}},
booktitle = {31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
pages = {26:1--26:14},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-149-8},
ISSN = {1868-8969},
year = {2020},
volume = {161},
editor = {Inge Li G{\o}rtz and Oren Weimann},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2020/12151},
URN = {urn:nbn:de:0030-drops-121512},
doi = {10.4230/LIPIcs.CPM.2020.26},
annote = {Keywords: parameterized matching, suffix trees, DAWGs, suffix automata}
}
Keywords: parameterized matching, suffix trees, DAWGs, suffix automata
Collection: 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)
Issue Date: 2020
Date of publication: 09.06.2020