License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2017.24
URN: urn:nbn:de:0030-drops-73460
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2017/7346/
Go to the corresponding LIPIcs Volume Portal


Mieno, Takuya ; Inenaga, Shunsuke ; Bannai, Hideo ; Takeda, Masayuki

Tight Bounds on the Maximum Number of Shortest Unique Substrings

pdf-format:
LIPIcs-CPM-2017-24.pdf (0.6 MB)


Abstract

A substring Q of a string S is called a shortest unique substring (SUS) for interval [s,t] in S, if Q occurs exactly once in S, this occurrence of Q contains interval [s,t], and every substring of S which contains interval [s,t] and is shorter than Q occurs at least twice in S. The SUS problem is, given a string S, to preprocess S so that for any subsequent query interval [s,t] all the SUSs for interval [s,t] can be answered quickly. When s = t, we call the SUSs for [s, t] as point SUSs, and when s <= t, we call the SUSs for [s, t] as interval SUSs. There exist optimal O(n)-time preprocessing scheme which answers queries in optimal O(k) time for both point and interval SUSs, where n is the length of S and k is the number of outputs for a given query. In this paper, we reveal structural, combinatorial properties underlying the SUS problem: Namely, we show that the number of intervals in S that correspond to point SUSs for all query positions in S is less than 1.5n, and show that this is a matching upper and lower bound. Also, we consider the maximum number of intervals in S that correspond to interval SUSs for all query intervals in S.

BibTeX - Entry

@InProceedings{mieno_et_al:LIPIcs:2017:7346,
  author =	{Takuya Mieno and Shunsuke Inenaga and Hideo Bannai and Masayuki Takeda},
  title =	{{Tight Bounds on the Maximum Number of Shortest Unique Substrings}},
  booktitle =	{28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)},
  pages =	{24:1--24:11},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-039-2},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{78},
  editor =	{Juha K{\"a}rkk{\"a}inen and Jakub Radoszewski and Wojciech Rytter},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2017/7346},
  URN =		{urn:nbn:de:0030-drops-73460},
  doi =		{10.4230/LIPIcs.CPM.2017.24},
  annote =	{Keywords: shortest unique substrings, maximal unique substrings}
}

Keywords: shortest unique substrings, maximal unique substrings
Collection: 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)
Issue Date: 2017
Date of publication: 30.06.2017


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI