License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.STACS.2021.34
URN: urn:nbn:de:0030-drops-136796
Go to the corresponding LIPIcs Volume Portal

Gawrychowski, Paweł ; Kosche, Maria ; Koß, Tore ; Manea, Florin ; Siemer, Stefan

Efficiently Testing Simon’s Congruence

LIPIcs-STACS-2021-34.pdf (1.0 MB)


Simon’s congruence ∼_k is a relation on words defined by Imre Simon in the 1970s and intensely studied since then. This congruence was initially used in connection to piecewise testable languages, but also found many applications in, e.g., learning theory, databases theory, or linguistics. The ∼_k-relation is defined as follows: two words are ∼_k-congruent if they have the same set of subsequences of length at most k. A long standing open problem, stated already by Simon in his initial works on this topic, was to design an algorithm which computes, given two words s and t, the largest k for which s∼_k t. We propose the first algorithm solving this problem in linear time O(|s|+|t|) when the input words are over the integer alphabet {1,…,|s|+|t|} (or other alphabets which can be sorted in linear time). Our approach can be extended to an optimal algorithm in the case of general alphabets as well.
To achieve these results, we introduce a novel data-structure, called Simon-Tree, which allows us to construct a natural representation of the equivalence classes induced by ∼_k on the set of suffixes of a word, for all k ≥ 1. We show that such a tree can be constructed for an input word in linear time. Then, when working with two words s and t, we compute their respective Simon-Trees and efficiently build a correspondence between the nodes of these trees. This correspondence, which can also be constructed in linear time O(|s|+|t|), allows us to retrieve the largest k for which s∼_k t.

BibTeX - Entry

  author =	{Gawrychowski, Pawe{\l} and Kosche, Maria and Ko{\ss}, Tore and Manea, Florin and Siemer, Stefan},
  title =	{{Efficiently Testing Simon’s Congruence}},
  booktitle =	{38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)},
  pages =	{34:1--34:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-180-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{187},
  editor =	{Bl\"{a}ser, Markus and Monmege, Benjamin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{},
  URN =		{urn:nbn:de:0030-drops-136796},
  doi =		{10.4230/LIPIcs.STACS.2021.34},
  annote =	{Keywords: Simon’s congruence, Subsequence, Scattered factor, Efficient algorithms}

Keywords: Simon’s congruence, Subsequence, Scattered factor, Efficient algorithms
Collection: 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)
Issue Date: 2021
Date of publication: 10.03.2021

DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI