License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2017.24
URN: urn:nbn:de:0030-drops-73460
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2017/7346/
Mieno, Takuya ;
Inenaga, Shunsuke ;
Bannai, Hideo ;
Takeda, Masayuki
Tight Bounds on the Maximum Number of Shortest Unique Substrings
Abstract
A substring Q of a string S is called a shortest unique substring (SUS) for interval [s,t] in S, if Q occurs exactly once in S, this occurrence of Q contains interval [s,t], and every substring of S which contains interval [s,t] and is shorter than Q occurs at least twice in S. The SUS problem is, given a string S, to preprocess S so that for any subsequent query interval [s,t] all the SUSs for interval [s,t] can be answered quickly. When s = t, we call the SUSs for [s, t] as point SUSs, and when s <= t, we call the SUSs for [s, t] as interval SUSs. There exist optimal O(n)-time preprocessing scheme which answers queries in optimal O(k) time for both point and interval SUSs, where n is the length of S and k is the number of outputs for a given query. In this paper, we reveal structural, combinatorial properties underlying the SUS problem: Namely, we show that the number of intervals in S that correspond to point SUSs for all query positions in S is less than 1.5n, and show that this is a matching upper and lower bound. Also, we consider the maximum number of intervals in S that correspond to interval SUSs for all query intervals in S.
BibTeX - Entry
@InProceedings{mieno_et_al:LIPIcs:2017:7346,
author = {Takuya Mieno and Shunsuke Inenaga and Hideo Bannai and Masayuki Takeda},
title = {{Tight Bounds on the Maximum Number of Shortest Unique Substrings}},
booktitle = {28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)},
pages = {24:1--24:11},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-039-2},
ISSN = {1868-8969},
year = {2017},
volume = {78},
editor = {Juha K{\"a}rkk{\"a}inen and Jakub Radoszewski and Wojciech Rytter},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2017/7346},
URN = {urn:nbn:de:0030-drops-73460},
doi = {10.4230/LIPIcs.CPM.2017.24},
annote = {Keywords: shortest unique substrings, maximal unique substrings}
}
Keywords: |
|
shortest unique substrings, maximal unique substrings |
Collection: |
|
28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017) |
Issue Date: |
|
2017 |
Date of publication: |
|
30.06.2017 |