License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2017.7
URN: urn:nbn:de:0030-drops-73402
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2017/7340/
Go to the corresponding LIPIcs Volume Portal


Belazzougui, Djamal ; Cunial, Fabio

Representing the Suffix Tree with the CDAWG

pdf-format:
LIPIcs-CPM-2017-7.pdf (0.7 MB)


Abstract

Given a string T, it is known that its suffix tree can be represented using the compact directed acyclic word graph (CDAWG) with e_T arcs, taking overall O(e_T+e_REV(T)) words of space, where REV(T) is the reverse of T, and supporting some key operations in time between O(1) and O(log(log(n))) in the worst case. This representation is especially appealing for highly repetitive strings, like collections of similar genomes or of version-controlled documents, in which e_T grows sublinearly in the length of T in practice. In this paper we augment such representation, supporting a number of additional queries in worst-case time between O(1) and O(log(n)) in the RAM model, without increasing space complexity asymptotically. Our technique, based on a heavy path decomposition of the suffix tree, enables also a representation of the suffix array, of the inverse suffix array, and of T itself, that takes O(e_T) words of space, and that supports random access in O(log(n)) time. Furthermore, we establish a connection between the reversed CDAWG of T and a context-free grammar that produces T and only T, which might have independent interest.

BibTeX - Entry

@InProceedings{belazzougui_et_al:LIPIcs:2017:7340,
  author =	{Djamal Belazzougui and Fabio Cunial},
  title =	{{Representing the Suffix Tree with the CDAWG}},
  booktitle =	{28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)},
  pages =	{7:1--7:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-039-2},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{78},
  editor =	{Juha K{\"a}rkk{\"a}inen and Jakub Radoszewski and Wojciech Rytter},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2017/7340},
  URN =		{urn:nbn:de:0030-drops-73402},
  doi =		{10.4230/LIPIcs.CPM.2017.7},
  annote =	{Keywords: CDAWG, suffix tree, heavy path decomposition, maximal repeat, context-free grammar}
}

Keywords: CDAWG, suffix tree, heavy path decomposition, maximal repeat, context-free grammar
Collection: 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)
Issue Date: 2017
Date of publication: 30.06.2017


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI