License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.MFCS.2023.71
URN: urn:nbn:de:0030-drops-186055
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2023/18605/


Nogami, Taisei ; Terauchi, Tachio

On the Expressive Power of Regular Expressions with Backreferences


Abstract

A rewb is a regular expression extended with backreferences. Backreferences are widely recognized as a practical extension of regular expressions and are supported by most modern regular expression engines, such as those in the standard libraries of Java, Python, and others. Indexed languages, meanwhile, are the languages generated by indexed grammars, a class of formal grammars proposed by A. V. Aho. We show that the expressive powers of these two models are related as follows: every language described by a rewb is an indexed language. Since the smallest formal grammar class previously known to contain the languages of rewbs was the class of context-sensitive languages, our result strictly improves the known upper bound. Moreover, we prove two further claims: (1) there exists a rewb whose language does not belong to the class of stack languages, which is a proper subclass of the indexed languages; and (2) the language described by a rewb without a captured reference is in the class of nonerasing stack languages, which is a proper subclass of the stack languages. Finally, we show that the hierarchy investigated in a prior study, which separates the expressive power of rewbs by the notion of nested levels, lies within the class of nonerasing stack languages.
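
As a small illustration of why backreferences add expressive power, the sketch below uses Python's standard `re` module to recognize the copy language { ww : w in {a, b}* }, a standard example of a language that is not context-free. The pattern and the helper name `is_copy_word` are assumptions made for this example and are not taken from the paper; the matching semantics of practical engines may also differ in detail from the formal rewb semantics used there.

```python
import re

# Illustrative pattern (an assumption, not from the paper):
# group 1 captures some w over {a, b}, and \1 must then repeat
# exactly the same string, so the whole match has the form ww.
COPY_PATTERN = re.compile(r"([ab]*)\1")


def is_copy_word(s: str) -> bool:
    """Return True iff s has the form ww for some w over {a, b}."""
    return COPY_PATTERN.fullmatch(s) is not None


if __name__ == "__main__":
    for word in ["", "aa", "abab", "aba", "abba"]:
        print(repr(word), is_copy_word(word))
```

The backreference `\1` forces the second half of the input to equal whatever the group captured, which is the kind of equality check that ordinary regular expressions cannot express.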

BibTeX - Entry

@InProceedings{nogami_et_al:LIPIcs.MFCS.2023.71,
  author =	{Nogami, Taisei and Terauchi, Tachio},
  title =	{{On the Expressive Power of Regular Expressions with Backreferences}},
  booktitle =	{48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023)},
  pages =	{71:1--71:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-292-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{272},
  editor =	{Leroux, J\'{e}r\^{o}me and Lombardy, Sylvain and Peleg, David},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2023/18605},
  URN =		{urn:nbn:de:0030-drops-186055},
  doi =		{10.4230/LIPIcs.MFCS.2023.71},
  annote =	{Keywords: Regular expressions, Backreferences, Expressive power}
}

Keywords: Regular expressions, Backreferences, Expressive power
Collection: 48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023)
Issue Date: 2023
Date of publication: 21.08.2023

