License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CONCUR.2023.35
URN: urn:nbn:de:0030-drops-190290
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2023/19029/
Go to the corresponding LIPIcs Volume Portal


Schewe, Sven ; Tang, Qiyi ; Zhanabekova, Tansholpan

Deciding What Is Good-For-MDPs

pdf-format:
LIPIcs-CONCUR-2023-35.pdf (1 MB)


Abstract

Nondeterministic good-for-MDPs (GFM) automata are for MDP model checking and reinforcement learning what good-for-games automata are for reactive synthesis: a more compact alternative to deterministic automata that displays nondeterminism, but only so much that it can be resolved locally, such that a syntactic product can be analysed. GFM has recently been introduced as a property for reinforcement learning, where the simpler Büchi acceptance conditions it allows to use is key. However, while there are classic and novel techniques to obtain automata that are GFM, there has not been a decision procedure for checking whether or not an automaton is GFM. We show that GFM-ness is decidable and provide an EXPTIME decision procedure as well as a PSPACE-hardness proof.

BibTeX - Entry

@InProceedings{schewe_et_al:LIPIcs.CONCUR.2023.35,
  author =	{Schewe, Sven and Tang, Qiyi and Zhanabekova, Tansholpan},
  title =	{{Deciding What Is Good-For-MDPs}},
  booktitle =	{34th International Conference on Concurrency Theory (CONCUR 2023)},
  pages =	{35:1--35:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-299-0},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{279},
  editor =	{P\'{e}rez, Guillermo A. and Raskin, Jean-Fran\c{c}ois},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2023/19029},
  URN =		{urn:nbn:de:0030-drops-190290},
  doi =		{10.4230/LIPIcs.CONCUR.2023.35},
  annote =	{Keywords: B\"{u}chi automata, Markov Decision Processes, Omega-regular objectives, Reinforcement learning}
}

Keywords: Büchi automata, Markov Decision Processes, Omega-regular objectives, Reinforcement learning
Collection: 34th International Conference on Concurrency Theory (CONCUR 2023)
Issue Date: 2023
Date of publication: 07.09.2023


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI