License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2018.3
URN: urn:nbn:de:0030-drops-87049
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2018/8704/
Baier, Uwe
On Undetected Redundancy in the Burrows-Wheeler Transform
Abstract
The Burrows-Wheeler-Transform (BWT) is an invertible permutation of a text known to be highly compressible but also useful for sequence analysis, what makes the BWT highly attractive for lossless data compression. In this paper, we present a new technique to reduce the size of a BWT using its combinatorial properties, while keeping it invertible. The technique can be applied to any BWT-based compressor, and, as experiments show, is able to reduce the encoding size by 8-16 % on average and up to 33-57 % in the best cases (depending on the BWT-compressor used), making BWT-based compressors competitive or even superior to today's best lossless compressors.
BibTeX - Entry
@InProceedings{baier:LIPIcs:2018:8704,
author = {Uwe Baier},
title = {{On Undetected Redundancy in the Burrows-Wheeler Transform}},
booktitle = {Annual Symposium on Combinatorial Pattern Matching (CPM 2018)},
pages = {3:1--3:15},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-074-3},
ISSN = {1868-8969},
year = {2018},
volume = {105},
editor = {Gonzalo Navarro and David Sankoff and Binhai Zhu},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2018/8704},
URN = {urn:nbn:de:0030-drops-87049},
doi = {10.4230/LIPIcs.CPM.2018.3},
annote = {Keywords: Lossless data compression, BWT, Tunneling}
}
Keywords: |
|
Lossless data compression, BWT, Tunneling |
Collection: |
|
Annual Symposium on Combinatorial Pattern Matching (CPM 2018) |
Issue Date: |
|
2018 |
Date of publication: |
|
18.05.2018 |
Supplementary Material: |
|
Implementation available at https://github.com/waYne1337/tbwt |