License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.SEA.2020.12
URN: urn:nbn:de:0030-drops-120862
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2020/12086/


Lipták, Zsuzsanna ; Puglisi, Simon J. ; Rossi, Massimiliano

Pattern Discovery in Colored Strings


Abstract

We consider the problem of identifying patterns of interest in colored strings. A colored string is a string in which each position is colored with one of a finite set of colors. Our task is to find substrings that always occur followed by the same color at the same distance. The problem is motivated by applications in embedded systems verification, in particular, assertion mining. The goal there is to automatically infer properties of the embedded system from the analysis of its simulation traces. We show that the number of interesting patterns is upper-bounded by O(n²) where n is the length of the string. We introduce a baseline algorithm with O(n²) running time which identifies all interesting patterns for all colors in the string satisfying certain minimality conditions. When one is interested in patterns related to only one color, we provide an algorithm that identifies patterns in O(n² log n) time, but is faster than the first algorithm in practice, both on simulated and on real-world patterns.
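
To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the paper's algorithm and does not achieve the running times stated above. It also ignores the minimality conditions mentioned in the abstract. The function name, the parameters max_len and max_dist, the convention that the distance is counted from the last character of an occurrence, and the choice to discard a (pattern, distance) pair when any occurrence points past the end of the string are all assumptions made for illustration.

```python
from collections import defaultdict


def find_color_consistent_patterns(text, colors, max_len, max_dist):
    """Return {(pattern, distance): color} for every substring of `text`
    (length <= max_len) whose occurrences are all followed by the same
    color at the same distance (1 <= distance <= max_dist).

    Assumptions (not taken from the paper):
      * colors[i] is the color of position i of text;
      * distance is measured from the last character of an occurrence;
      * a pair is discarded if any occurrence points past the string end.
    """
    if len(text) != len(colors):
        raise ValueError("text and colors must have the same length")

    n = len(text)

    # Collect the start positions of every substring of length <= max_len.
    occurrences = defaultdict(list)
    for start in range(n):
        for length in range(1, min(max_len, n - start) + 1):
            occurrences[text[start:start + length]].append(start)

    result = {}
    for pattern, starts in occurrences.items():
        for dist in range(1, max_dist + 1):
            # Position whose color is inspected for each occurrence.
            targets = [s + len(pattern) - 1 + dist for s in starts]
            if any(t >= n for t in targets):
                continue  # assumption: out-of-range target disqualifies
            seen = {colors[t] for t in targets}
            if len(seen) == 1:
                result[(pattern, dist)] = seen.pop()
    return result


if __name__ == "__main__":
    # Each character of `colors` is the color of the same position in `text`.
    found = find_color_consistent_patterns("abcabd", "rrgrrb", 2, 2)
    for (pattern, dist), color in sorted(found.items()):
        print(pattern, dist, color)
```

In this toy input, every occurrence of "a" is followed by color "r" at distance 1, so ("a", 1) is reported. At distance 2 the two occurrences of "a" are followed by different colors, so ("a", 2) is not.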

BibTeX - Entry

@InProceedings{liptk_et_al:LIPIcs:2020:12086,
  author =	{Zsuzsanna Lipt{\'a}k and Simon J. Puglisi and Massimiliano Rossi},
  title =	{{Pattern Discovery in Colored Strings}},
  booktitle =	{18th International Symposium on Experimental Algorithms (SEA 2020)},
  pages =	{12:1--12:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-148-1},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{160},
  editor =	{Simone Faro and Domenico Cantone},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2020/12086},
  URN =		{urn:nbn:de:0030-drops-120862},
  doi =		{10.4230/LIPIcs.SEA.2020.12},
  annote =	{Keywords: property testing, suffix tree, pattern mining}
}

Keywords: property testing, suffix tree, pattern mining
Collection: 18th International Symposium on Experimental Algorithms (SEA 2020)
Issue Date: 2020
Date of publication: 12.06.2020
Supplementary Material: An implementation of the algorithms is available online at https://github.com/maxrossi91/colored-strings-miner.

