License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.OPODIS.2021.26
URN: urn:nbn:de:0030-drops-158016
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2022/15801/


Schultz, William ; Zhou, Siyuan ; Dardik, Ian ; Tripakis, Stavros

Design and Analysis of a Logless Dynamic Reconfiguration Protocol



Abstract

Distributed replication systems based on the replicated state machine model have become ubiquitous as the foundation of modern database systems. To ensure availability in the presence of faults, these systems must be able to dynamically replace failed nodes with healthy ones via dynamic reconfiguration. MongoDB is a document-oriented database with a distributed replication mechanism derived from the Raft protocol. In this paper, we present MongoRaftReconfig, a novel dynamic reconfiguration protocol for the MongoDB replication system. MongoRaftReconfig utilizes a logless approach to managing configuration state and decouples the processing of configuration changes from the main database operation log. The protocol’s design was influenced by engineering constraints faced when attempting to redesign an unsafe, legacy reconfiguration mechanism that existed previously in MongoDB. We provide a safety proof of MongoRaftReconfig, along with a formal specification in TLA+. To our knowledge, this is the first published safety proof and formal specification of a reconfiguration protocol for a Raft-based system. We also present results from model checking the safety properties of MongoRaftReconfig on finite protocol instances. Finally, we discuss the conceptual novelties of MongoRaftReconfig, explain how it can be understood as an optimized and generalized version of the single-server reconfiguration algorithm of Raft, and present an experimental evaluation of how its optimizations can provide performance benefits for reconfigurations.

BibTeX - Entry

@InProceedings{schultz_et_al:LIPIcs.OPODIS.2021.26,
  author =	{Schultz, William and Zhou, Siyuan and Dardik, Ian and Tripakis, Stavros},
  title =	{{Design and Analysis of a Logless Dynamic Reconfiguration Protocol}},
  booktitle =	{25th International Conference on Principles of Distributed Systems (OPODIS 2021)},
  pages =	{26:1--26:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-219-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{217},
  editor =	{Bramas, Quentin and Gramoli, Vincent and Milani, Alessia},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2022/15801},
  URN =		{urn:nbn:de:0030-drops-158016},
  doi =		{10.4230/LIPIcs.OPODIS.2021.26},
  annote =	{Keywords: Fault Tolerance, Dynamic Reconfiguration, State Machine Replication}
}

Keywords: Fault Tolerance, Dynamic Reconfiguration, State Machine Replication
Collection: 25th International Conference on Principles of Distributed Systems (OPODIS 2021)
Issue Date: 2022
Date of publication: 28.02.2022
Supplementary Material: Software (TLA+ specifications): https://doi.org/10.5281/zenodo.5715510

