License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ICDT.2022.18
URN: urn:nbn:de:0030-drops-158922
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2022/15892/
Go to the corresponding LIPIcs Volume Portal


Kleest-Meißner, Sarah ; Sattler, Rebecca ; Schmid, Markus L. ; Schweikardt, Nicole ; Weidlich, Matthias

Discovering Event Queries from Traces: Laying Foundations for Subsequence-Queries with Wildcards and Gap-Size Constraints

pdf-format:
LIPIcs-ICDT-2022-18.pdf (0.9 MB)


Abstract

We introduce subsequence-queries with wildcards and gap-size constraints (swg-queries, for short) as a tool for querying event traces. An swg-query q is given by a string s over an alphabet of variables and types, a global window size w, and a tuple c = ((c^-_1, c^+_1), (c^-_2, c^+_2), …, (c^-_{|s|-1}, c^+_{|s|-1})) of local gap-size constraints over ℕ × (ℕ ∪ {∞}). The query q matches in a trace t (i. e., a sequence of types) if the variables can uniformly be substituted by types such that the resulting string occurs in t as a subsequence that spans an area of length at most w, and the i^{th} gap of the subsequence (i. e., the distance between the i^{th} and (i+1)^{th} position of the subsequence) has length at least c^-_i and at most c^+_i. We formalise and investigate the task of discovering an swg-query that describes best the traces from a given sample S of traces, and we present an algorithm solving this task. As a central component, our algorithm repeatedly solves the matching problem (i. e., deciding whether a given query q matches in a given trace t), which is an NP-complete problem (in combined complexity). Hence, the matching problem is of special interest in the context of query discovery, and we therefore subject it to a detailed (parameterised) complexity analysis to identify tractable subclasses, which lead to tractable subclasses of the discovery problem as well. We complement this by a reduction proving that any query discovery algorithm also yields an algorithm for the matching problem. Hence, lower bounds on the complexity of the matching problem directly translate into according lower bounds of the query discovery problem. As a proof of concept, we also implemented a prototype of our algorithm and tested it on real-world data.

BibTeX - Entry

@InProceedings{kleestmeiner_et_al:LIPIcs.ICDT.2022.18,
  author =	{Kleest-Mei{\ss}ner, Sarah and Sattler, Rebecca and Schmid, Markus L. and Schweikardt, Nicole and Weidlich, Matthias},
  title =	{{Discovering Event Queries from Traces: Laying Foundations for Subsequence-Queries with Wildcards and Gap-Size Constraints}},
  booktitle =	{25th International Conference on Database Theory (ICDT 2022)},
  pages =	{18:1--18:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-223-5},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{220},
  editor =	{Olteanu, Dan and Vortmeier, Nils},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2022/15892},
  URN =		{urn:nbn:de:0030-drops-158922},
  doi =		{10.4230/LIPIcs.ICDT.2022.18},
  annote =	{Keywords: event queries on traces, pattern queries on strings, learning descriptive queries, complexity of query evaluation and query learning}
}

Keywords: event queries on traces, pattern queries on strings, learning descriptive queries, complexity of query evaluation and query learning
Collection: 25th International Conference on Database Theory (ICDT 2022)
Issue Date: 2022
Date of publication: 19.03.2022
Supplementary Material: Audiovisual (Video of the Presentation): https://doi.org/10.5446/57490


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI