License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CPM.2023.25
URN: urn:nbn:de:0030-drops-179792
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2023/17979/
Navarro, Gonzalo; Urbina, Cristian
L-Systems for Measuring Repetitiveness
Abstract
In order to use them for compression, we extend L-systems (without ε-rules) with two parameters d and n, and also a coding τ, which determines unambiguously a string w = τ(φ^d(s))[1:n], where φ is the morphism of the system, and s is its axiom. The length of the shortest description of an L-system generating w is known as ℓ, and it is arguably a relevant measure of repetitiveness that builds on the self-similarities that arise in the sequence.
In this paper, we deepen the study of the measure ℓ and its relation with a better-established measure called δ, which builds on substring complexity. Our results show that ℓ and δ are largely orthogonal, in the sense that one can be much larger than the other, depending on the case. This suggests that both mechanisms capture different kinds of regularities related to repetitiveness.
We then show that the recently introduced NU-systems, which combine the capabilities of L-systems with bidirectional macro schemes, can be asymptotically strictly smaller than both mechanisms for the same fixed string family, which makes the size ν of the smallest NU-system the unique smallest reachable repetitiveness measure to date. We conclude that in order to achieve better compression, we should combine morphism substitution with copy-paste mechanisms.
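As an illustration of the definition w = τ(φ^d(s))[1:n] used in the abstract, the following is a minimal Python sketch, not the authors' implementation. The function names `generate` and `delta`, the dictionary encoding of φ and τ, and the Fibonacci morphism used as input are assumptions chosen for the example. The sketch expands the whole word before truncating it, which is simple but can take time exponential in d. The helper `delta` follows the standard substring-complexity definition, the maximum over k of the number of distinct length-k substrings divided by k.

```python
from fractions import Fraction


def generate(rules, coding, axiom, d, n):
    """Return the first n symbols of tau(phi^d(axiom)).

    rules  -- dict mapping each symbol to its non-empty image under phi
    coding -- dict mapping each symbol to a single output symbol (tau)
    """
    word = axiom
    for _ in range(d):
        # Apply the morphism phi to every symbol of the current word.
        word = "".join(rules[c] for c in word)
    # Apply the letter-to-letter coding tau, then keep the length-n prefix.
    return "".join(coding[c] for c in word)[:n]


def delta(w):
    """Substring-complexity measure: max over k of d_k(w) / k,
    where d_k(w) counts the distinct substrings of length k."""
    best = Fraction(0)
    for k in range(1, len(w) + 1):
        distinct = {w[i:i + k] for i in range(len(w) - k + 1)}
        best = max(best, Fraction(len(distinct), k))
    return best


# Hypothetical input: Fibonacci morphism a -> ab, b -> a, identity coding.
rules = {"a": "ab", "b": "a"}
coding = {"a": "a", "b": "b"}
w = generate(rules, coding, "a", 5, 8)
print(w)         # abaababa
print(delta(w))  # 2
```

In this toy case the description consists of two rules, a coding, an axiom, and the two integers d and n, while the generated word can be made arbitrarily long by increasing d and n.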
BibTeX - Entry
@InProceedings{navarro_et_al:LIPIcs.CPM.2023.25,
author = {Navarro, Gonzalo and Urbina, Cristian},
title = {{L-Systems for Measuring Repetitiveness}},
booktitle = {34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)},
pages = {25:1--25:17},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-276-1},
ISSN = {1868-8969},
year = {2023},
volume = {259},
editor = {Bulteau, Laurent and Lipt\'{a}k, Zsuzsanna},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2023/17979},
URN = {urn:nbn:de:0030-drops-179792},
doi = {10.4230/LIPIcs.CPM.2023.25},
annote = {Keywords: L-systems, String morphisms, Repetitiveness measures, Text compression}
}
Keywords: L-systems, String morphisms, Repetitiveness measures, Text compression
Collection: 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)
Issue Date: 2023
Date of publication: 21.06.2023