License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CP.2021.26
URN: urn:nbn:de:0030-drops-153171
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2021/15317/


Galleguillos, Cristian ; Kiziltan, Zeynep ; Soto, Ricardo

A Job Dispatcher for Large and Heterogeneous HPC Systems Running Modern Applications



Abstract

High-performance computing (HPC) systems have become essential instruments in our modern society. As they get closer to exascale performance, HPC systems become larger in size and more heterogeneous in their computing resources. With recent advances in AI, HPC systems are also increasingly being used for applications that employ many short jobs with strict timing requirements. HPC job dispatchers therefore need to adopt techniques that go beyond the capabilities of those developed for small or homogeneous systems, or for traditional compute-intensive applications. In this paper, we present a job dispatcher suitable for today’s large and heterogeneous systems running modern applications. Unlike its predecessors, our dispatcher solves the entire dispatching problem using Constraint Programming (CP) with a model size independent of the system size. Experimental results based on a simulation study show that our approach can bring about significant performance gains over the existing CP-based dispatchers in a large or heterogeneous system.

BibTeX - Entry

@InProceedings{galleguillos_et_al:LIPIcs.CP.2021.26,
  author =	{Galleguillos, Cristian and Kiziltan, Zeynep and Soto, Ricardo},
  title =	{{A Job Dispatcher for Large and Heterogeneous HPC Systems Running Modern Applications}},
  booktitle =	{27th International Conference on Principles and Practice of Constraint Programming (CP 2021)},
  pages =	{26:1--26:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-211-2},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{210},
  editor =	{Michel, Laurent D.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2021/15317},
  URN =		{urn:nbn:de:0030-drops-153171},
  doi =		{10.4230/LIPIcs.CP.2021.26},
  annote =	{Keywords: Constraint programming, HPC systems, heterogeneous systems, large systems, on-line job dispatching, resource allocation}
}

Keywords: Constraint programming, HPC systems, heterogeneous systems, large systems, on-line job dispatching, resource allocation
Collection: 27th International Conference on Principles and Practice of Constraint Programming (CP 2021)
Issue Date: 2021
Date of publication: 15.10.2021
Supplementary Material: Software (Source Code): https://git.io/fjia1 archived at: https://archive.softwareheritage.org/swh:1:dir:763b27390dd764a27ae8e7dff5dc0d724cbabe88

