License: Creative Commons Attribution 3.0 Germany license (CC BY 3.0 DE)
When quoting this document, please refer to the following
DOI: 10.4230/DARTS.5.1.4
URN: urn:nbn:de:0030-drops-107322

Cavicchioli, Roberto ; Capodieci, Nicola ; Solieri, Marco ; Bertogna, Marko

API Comparison of CPU-To-GPU Command Offloading Latency on Embedded Platforms (Artifact)

High-performance heterogeneous embedded platforms allow parallel workloads to be offloaded to an integrated accelerator, such as a General-Purpose Graphics Processing Unit (GP-GPU). Real-time applications require a time-predictable characterization of task submission. We provide a profiler of the time the CPU spends submitting a stereotypical GP-GPU workload, shaped as a Deep Neural Network of parameterized complexity. Submission is performed using the latest APIs available: NVIDIA CUDA, including its various techniques, and Vulkan. Scripts that install the software dependencies, run the experiments, and collect the results in a PDF report fully automate the test on Jetson Xavier.
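The measurement idea described above, timing only the CPU-side cost of handing commands to an asynchronous API, can be illustrated with a minimal sketch. This is not the artifact's profiler: `profile_submission`, `fake_submit`, and `pending` are hypothetical names, and `fake_submit` merely stands in for a real CUDA or Vulkan submission call that returns as soon as the command is enqueued. The `num_layers` parameter loosely mirrors the parameterized network complexity.

```python
import time
from statistics import mean


def profile_submission(submit, num_layers, repetitions):
    """Return, per repetition, the CPU time in ns spent submitting num_layers commands."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter_ns()
        for layer in range(num_layers):
            # Assumed to return once the command is enqueued, not once it has executed.
            submit(layer)
        samples.append(time.perf_counter_ns() - start)
    return samples


# Hypothetical stand-in for an asynchronous GPU API: it only records the request.
pending = []


def fake_submit(layer):
    pending.append(layer)


samples = profile_submission(fake_submit, num_layers=8, repetitions=100)
print(f"min={min(samples)} ns  mean={mean(samples):.0f} ns  max={max(samples)} ns")
```

Reporting the minimum, mean, and maximum rather than a single average reflects the real-time focus of the abstract, where the worst observed submission time matters as much as the typical one.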

BibTeX - Entry

  author =	{Roberto Cavicchioli and Nicola Capodieci and Marco Solieri and Marko Bertogna},
  title =	{{API Comparison of CPU-To-GPU Command Offloading Latency on Embedded Platforms (Artifact)}},
  pages =	{4:1--4:3},
  journal =	{Dagstuhl Artifacts Series},
  ISSN =	{2509-8195},
  year =	{2019},
  volume =	{5},
  number =	{1},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{},
  URN =		{urn:nbn:de:0030-drops-107322},
  doi =		{10.4230/DARTS.5.1.4},
  annote =	{Keywords: GPU, Applications, Heterogeneous systems}

Keywords: GPU, Applications, Heterogeneous systems
Collection: Special Issue of the 31st Euromicro Conference on Real-Time Systems (ECRTS 2019)
Related Scholarly Article:
Issue Date: 2019
Date of publication: 08.07.2019
