License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ECRTS.2023.15
URN: urn:nbn:de:0030-drops-180445
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2023/18044/


Nikiema, Pegdwende Romaric ; Kritikakou, Angeliki ; Traiola, Marcello ; Sentieys, Olivier

Impact of Transient Faults on Timing Behavior and Mitigation with Near-Zero WCET Overhead



Abstract

Because time-critical systems require timing guarantees, Worst-Case Execution Times (WCET) have to be employed. However, WCET estimation methods usually assume fault-free hardware. If proper actions are not taken, such fault-free WCET approaches become unsafe when faults affect the hardware during execution. Most approaches that deal with hardware faults address the impact of faults on the functional behavior of an application, i.e., denial of service and binary correctness. Few approaches address the impact of faults on the application's timing behavior, i.e., the time to finish the application, and those that do target faults occurring in memories. However, as transistor sizes in modern technologies have shrunk significantly, faults in cores can no longer be considered negligible. This work shows that faults affect not only the functional behavior but can also have a significant impact on the timing behavior of applications. To expose the overall impact of faults, we enhance vulnerability analysis to include not only functional but also timing correctness, and show that faults impact WCET estimations. As common techniques to deal with faults, such as watchdog timers and re-execution, have a large timing overhead for error detection and correction, we propose a mechanism with near-zero and bounded timing overhead. A RISC-V core is used as a case study. The obtained results show that, without protection, faults can increase the maximum observed execution time by almost 700% compared with fault-free execution, affecting the WCET estimations. In contrast, the proposed mechanism is able to restore fault-free WCET estimations with a bounded overhead of 2 execution cycles.
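
As a rough illustration of what it can mean to extend vulnerability analysis from functional to timing correctness, the sketch below labels each fault-injection run on both axes and computes the relative increase in maximum observed execution time. It is a minimal sketch under stated assumptions, not the authors' tooling: the `Run` record, the category names, and both helper functions are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One fault-injection run (hypothetical record layout)."""
    cycles: int            # observed execution time in cycles
    output_ok: bool        # True if the output matches the fault-free output
    finished: bool = True  # False if the run crashed or hung


def classify(run, fault_free_max):
    """Label a run on a functional axis and a timing axis.

    The category names are illustrative assumptions, not the paper's taxonomy.
    """
    if not run.finished:
        functional = "crash_or_hang"
    elif run.output_ok:
        functional = "correct"
    else:
        functional = "wrong_output"

    # A run that never finishes, or that runs longer than the maximum
    # fault-free execution time, violates the timing bound.
    if run.finished and run.cycles <= fault_free_max:
        timing = "within_bound"
    else:
        timing = "exceeds_bound"
    return functional, timing


def max_increase_percent(fault_free_cycles, faulty_cycles):
    """Relative increase of the maximum observed execution time, in percent."""
    base = max(fault_free_cycles)
    worst = max(faulty_cycles)
    return 100.0 * (worst - base) / base
```

Under this labeling, a run that produces the correct output but exceeds the fault-free maximum counts as functionally correct yet timing-incorrect, which is the kind of case a purely functional analysis would not flag.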

BibTeX - Entry

@InProceedings{nikiema_et_al:LIPIcs.ECRTS.2023.15,
  author =	{Nikiema, Pegdwende Romaric and Kritikakou, Angeliki and Traiola, Marcello and Sentieys, Olivier},
  title =	{{Impact of Transient Faults on Timing Behavior and Mitigation with Near-Zero WCET Overhead}},
  booktitle =	{35th Euromicro Conference on Real-Time Systems (ECRTS 2023)},
  pages =	{15:1--15:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-280-8},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{262},
  editor =	{Papadopoulos, Alessandro V.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2023/18044},
  URN =		{urn:nbn:de:0030-drops-180445},
  doi =		{10.4230/LIPIcs.ECRTS.2023.15},
  annote =	{Keywords: Transient faults, Timing impact, Near-zero WCET error detection and correction, Vulnerability analysis}
}

Keywords: Transient faults, Timing impact, Near-zero WCET error detection and correction, Vulnerability analysis
Collection: 35th Euromicro Conference on Real-Time Systems (ECRTS 2023)
Issue Date: 2023
Date of publication: 03.07.2023
Supplementary Material: Software (Source Code): https://gitlab.inria.fr/srokicki/Comet/-/tree/FSR_comet?ref_type=heads

