License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ECRTS.2018.23
URN: urn:nbn:de:0030-drops-89808

Bhat, Anand ; Samii, Soheil ; Rajkumar, Ragunathan (Raj)

Recovery Time Considerations in Real-Time Systems Employing Software Fault Tolerance



Safety-critical real-time systems, such as modern automobiles with advanced driving-assist features, must employ redundancy for crucial software tasks to tolerate permanent crash faults. This redundancy can be achieved with techniques such as active replication or the primary-backup approach. In such systems, the recovery time, which is the time a redundant task takes to take over execution after a primary task fails, becomes a very important design parameter. The recovery time for a given task depends on factors such as task allocation, primary and redundant task priorities, system load, and the scheduling policy. Each task can also have a different recovery time requirement (RTR). For example, in automobiles with automated driving features, safety-critical tasks such as perception and steering control have strict RTRs, whereas tasks such as heating control and mission planning have more relaxed requirements. In this paper, we analyze the recovery time for software tasks in a real-time system employing Rate-Monotonic Scheduling (RMS). We derive bounds on the recovery times for different redundant-task options and propose techniques to determine the redundant-task type that lets a task satisfy its RTR. We also address the fault-tolerant task allocation problem, with the additional constraint of satisfying the RTR of each task in the system. Given that assigning tasks to processors is a well-known NP-hard bin-packing problem, we propose computationally efficient heuristics to find a feasible allocation of tasks and their redundant copies. We also apply the simulated annealing method to the fault-tolerant task allocation problem with RTR constraints and compare the results against our heuristics.
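
To make the allocation problem in the abstract concrete, the following is a minimal sketch of a generic first-fit-decreasing bin-packing heuristic under RMS. It is not the heuristic proposed in the paper. It rests on several assumptions: every task has exactly one actively replicated copy with the same utilization as its primary, a processor is accepted if it passes the classical Liu and Layland utilization bound (a sufficient test only), and RTR constraints are not modeled at all. The names `Task`, `rms_bound`, `fits`, and `allocate` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    wcet: float    # worst-case execution time
    period: float  # period, assumed equal to the deadline

    @property
    def utilization(self) -> float:
        return self.wcet / self.period


def rms_bound(n: int) -> float:
    """Liu and Layland sufficient utilization bound for n tasks under RMS."""
    return n * (2 ** (1.0 / n) - 1)


def fits(tasks_on_cpu, task) -> bool:
    """Sufficient (not necessary) RMS schedulability check after adding task."""
    n = len(tasks_on_cpu) + 1
    total = sum(t.utilization for t in tasks_on_cpu) + task.utilization
    return total <= rms_bound(n)


def allocate(tasks, num_cpus):
    """First-fit decreasing by utilization.

    Assumption: each task gets one replica with the same utilization,
    placed on a different processor than its primary. Returns a dict
    mapping (task name, role) to a processor index, or None if this
    heuristic finds no feasible placement.
    """
    cpus = [[] for _ in range(num_cpus)]
    placement = {}
    for task in sorted(tasks, key=lambda t: t.utilization, reverse=True):
        used = []  # processors already holding a copy of this task
        for role in ("primary", "replica"):
            for idx, cpu in enumerate(cpus):
                if idx not in used and fits(cpu, task):
                    cpu.append(task)
                    used.append(idx)
                    placement[(task.name, role)] = idx
                    break
            else:
                return None  # no processor can host this copy
    return placement
```

Calling `allocate(tasks, num_cpus)` returns `None` whenever a copy cannot be placed, for example when only one processor is available, since a primary and its replica may not share a processor. A recovery-time-aware allocator in the spirit of the abstract would additionally have to check each task's RTR against a recovery-time bound before accepting a placement.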

BibTeX - Entry

  author =	{Anand Bhat and Soheil Samii and Ragunathan (Raj) Rajkumar},
  title =	{{Recovery Time Considerations in Real-Time Systems Employing Software Fault Tolerance}},
  booktitle =	{30th Euromicro Conference on Real-Time Systems (ECRTS 2018)},
  pages =	{23:1--23:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-075-0},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{106},
  editor =	{Sebastian Altmeyer},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{},
  URN =		{urn:nbn:de:0030-drops-89808},
  doi =		{10.4230/LIPIcs.ECRTS.2018.23},
  annote =	{Keywords: fault tolerance, real-time embedded systems, recovery time, real-time schedulability}

Keywords: fault tolerance, real-time embedded systems, recovery time, real-time schedulability
Collection: 30th Euromicro Conference on Real-Time Systems (ECRTS 2018)
Issue Date: 2018
Date of publication: 22.06.2018
