License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/DagRep.5.7.1
URN: urn:nbn:de:0030-drops-56705
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2016/5670/


Casanova, Henri ; Deelman, Ewa ; Robert, Yves ; Schwiegelshohn, Uwe
Further contributors (eds. etc.): Henri Casanova, Ewa Deelman, Yves Robert, and Uwe Schwiegelshohn

Algorithms and Scheduling Techniques to Manage Resilience and Power Consumption in Distributed Systems (Dagstuhl Seminar 15281)

pdf-format:
dagrep_v005_i007_p001_s15281.pdf (0.8 MB)


Abstract

Large-scale systems face two main challenges: failure management and energy management. Failure management, the goal of which is to achieve resilience, is necessary because a large number of hardware resources implies a large number of failures during the execution of an application. Energy management, the goal of which is to optimize power consumption and to handle thermal issues, is also necessary for both monetary and environmental reasons, since typical applications executed in HPC and/or cloud environments lead to large power consumption and heat dissipation due to intensive computation and communication workloads.

The main objective of this Dagstuhl seminar was to gather two communities: (i) system-oriented researchers, who study high-level resource-provisioning policies, pragmatic resource allocation and scheduling heuristics, novel approaches for designing and deploying systems software infrastructures, and tools for monitoring/measuring the state of the system; and (ii) algorithm-oriented researchers, who investigate formal models and algorithmic solutions for resilience and energy efficiency problems. Both communities focused on workflow applications during the seminar and discussed various issues related to the efficient, resilient, and energy-efficient execution of workflows on distributed platforms.

This report provides a brief executive summary of the seminar and lists all the presented material.

BibTeX - Entry

@Article{casanova_et_al:DR:2016:5670,
  author =	{Henri Casanova and Ewa Deelman and Yves Robert and Uwe Schwiegelshohn},
  title =	{{Algorithms and Scheduling Techniques to Manage Resilience and Power Consumption in Distributed Systems (Dagstuhl Seminar 15281)}},
  pages =	{1--21},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2016},
  volume =	{5},
  number =	{7},
  editor =	{Henri Casanova and Ewa Deelman and Yves Robert and Uwe Schwiegelshohn},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2016/5670},
  URN =		{urn:nbn:de:0030-drops-56705},
  doi =		{10.4230/DagRep.5.7.1},
  annote =	{Keywords: Fault tolerance, Resilience, Energy efficiency, Distributed and high performance computing, Scheduling, Workflows}
}

Keywords: Fault tolerance, Resilience, Energy efficiency, Distributed and high performance computing, Scheduling, Workflows
Collection: Dagstuhl Reports, Volume 5, Issue 7
Issue Date: 2016
Date of publication: 13.01.2016


Published by LZI