License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ICALP.2021.101
URN: urn:nbn:de:0030-drops-141702
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2021/14170/
Nishimoto, Takaaki; Tabei, Yasuo
Optimal-Time Queries on BWT-Runs Compressed Indexes
Abstract
Indexing highly repetitive strings (i.e., strings with many repetitions) for fast queries has become a central research topic in string processing because it has a wide variety of applications in bioinformatics and natural language processing. Although a substantial number of indexes for highly repetitive strings have been proposed thus far, developing compressed indexes that support various queries remains a challenge. The run-length Burrows-Wheeler transform (RLBWT) is a lossless data compression method that applies a reversible permutation to an input string and then run-length encodes the result, and it has received interest for indexing highly repetitive strings. LF and ϕ^{-1} are two key functions for building indexes on RLBWT, and the best previous result computes LF and ϕ^{-1} in O(log log n) time with O(r) words of space, where n is the string length and r is the number of runs in RLBWT. In this paper, we improve LF and ϕ^{-1} so that they can be computed in constant time with O(r) words of space. Subsequently, we present OptBWTR (optimal-time queries on BWT-runs compressed indexes), the first string index that supports various queries, including locate, count, and extract queries, in optimal time and O(r) words of space.
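To make the terms in the abstract concrete, the following is a minimal Python sketch of the BWT, its run-length encoding, and the LF function on a toy string. It is only a textbook illustration under stated assumptions: the helper names (`bwt`, `run_length_encode`, `lf_mapping`, `invert_bwt`) are hypothetical, the sentinel `$` is assumed to be unique and smaller than every other character, and LF is stored as a plain array in O(n) space. It is not the O(r)-space, constant-time data structure of the paper.

```python
def bwt(text):
    """Return the BWT of `text`, assumed to end with a unique smallest '$'."""
    n = len(text)
    # Sort all rotations naively (quadratic; for illustration only).
    rotations = sorted(text[i:] + text[:i] for i in range(n))
    return "".join(rot[-1] for rot in rotations)


def run_length_encode(s):
    """Collapse maximal runs of equal characters into (char, length) pairs."""
    runs = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1][1] += 1
        else:
            runs.append([ch, 1])
    return [(c, k) for c, k in runs]


def lf_mapping(L):
    """LF[i] is the row whose first character is the occurrence L[i]."""
    counts = {}
    for ch in L:
        counts[ch] = counts.get(ch, 0) + 1
    # C[c] = number of characters in L strictly smaller than c.
    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]
    seen, lf = {}, []
    for ch in L:
        lf.append(C[ch] + seen.get(ch, 0))
        seen[ch] = seen.get(ch, 0) + 1
    return lf


def invert_bwt(L):
    """Recover the text by following LF from row 0 (the row starting with '$')."""
    lf = lf_mapping(L)
    i, out = 0, []
    for _ in range(len(L) - 1):
        out.append(L[i])  # characters come out right to left
        i = lf[i]
    return "".join(reversed(out)) + "$"


L = bwt("banana$")
runs = run_length_encode(L)
print(L, runs, len(runs))
print(invert_bwt(L))
```

On this toy input the BWT has r = 5 runs for n = 7 characters; on highly repetitive inputs r is typically much smaller than n, which is why O(r)-word indexes are attractive.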
BibTeX - Entry
@InProceedings{nishimoto_et_al:LIPIcs.ICALP.2021.101,
author = {Nishimoto, Takaaki and Tabei, Yasuo},
title = {{Optimal-Time Queries on BWT-Runs Compressed Indexes}},
booktitle = {48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
pages = {101:1--101:15},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-195-5},
ISSN = {1868-8969},
year = {2021},
volume = {198},
editor = {Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2021/14170},
URN = {urn:nbn:de:0030-drops-141702},
doi = {10.4230/LIPIcs.ICALP.2021.101},
annote = {Keywords: Compressed text indexes, Burrows-Wheeler transform, highly repetitive text collections}
}
Keywords: Compressed text indexes, Burrows-Wheeler transform, highly repetitive text collections
Collection: 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)
Issue Date: 2021
Date of publication: 02.07.2021