License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.GIScience.2023.11
URN: urn:nbn:de:0030-drops-189064
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2023/18906/


Wiedemann, Nina ; Martin, Henry ; Westerholt, René

Benchmarking Regression Models Under Spatial Heterogeneity



Abstract

Machine learning methods have recently been widely applied to spatial data, for example in weather forecasting, traffic prediction, and soil analysis. At the same time, methods from spatial statistics were developed over the past decades to explicitly account for spatial structuring in analytical and inference tasks. Given that both types of methods are available, we explore the following question: Under what circumstances are local, spatially-explicit models preferable to machine learning models that do not incorporate spatial structure explicitly in their specification? Local models are typically used to capture spatial non-stationarity. Thus, we study the effect of the strength and type of spatial heterogeneity, which may originate from non-stationarity of the process itself or from heterogeneous noise, on the performance of different linear and non-linear, local and global machine learning and regression models. The results suggest that it is necessary to assess the performance of linear local models on an independent hold-out dataset, since models may overfit under certain conditions. We further show that local models are advantageous in settings with small sample sizes and high degrees of spatial heterogeneity. Our findings allow deriving model selection criteria, which are validated in benchmarking experiments on five well-known spatial datasets.
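
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' experimental setup or code: the data-generating process, the Gaussian kernel, the bandwidth, and all function names (`simulate`, `fit_predict_global`, `fit_predict_local`, `holdout_rmse`) are assumptions chosen for illustration. The sketch simulates a regression coefficient that drifts across space, then scores a global least-squares model and a kernel-weighted local model (in the spirit of Geographically Weighted Regression) on an independent hold-out set.

```python
import numpy as np


def simulate(n, heterogeneity, rng):
    """Hypothetical process: the slope of x drifts linearly with the first coordinate."""
    coords = rng.uniform(0.0, 1.0, size=(n, 2))
    x = rng.normal(size=n)
    beta = 1.0 + heterogeneity * (coords[:, 0] - 0.5)  # spatially varying coefficient
    y = beta * x + rng.normal(scale=0.1, size=n)
    return coords, x, y


def fit_predict_global(x_tr, y_tr, x_te):
    """Ordinary least squares with an intercept; location is ignored."""
    design = np.column_stack([np.ones_like(x_tr), x_tr])
    coef, *_ = np.linalg.lstsq(design, y_tr, rcond=None)
    return coef[0] + coef[1] * x_te


def fit_predict_local(c_tr, x_tr, y_tr, c_te, x_te, bandwidth):
    """Weighted least squares refitted at each test location (Gaussian kernel, assumed)."""
    design = np.column_stack([np.ones_like(x_tr), x_tr])
    preds = np.empty(len(x_te))
    for i, (c, x_val) in enumerate(zip(c_te, x_te)):
        sq_dist = np.sum((c_tr - c) ** 2, axis=1)
        weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
        root_w = np.sqrt(weights)  # scaling rows by sqrt(w) turns WLS into plain lstsq
        coef, *_ = np.linalg.lstsq(design * root_w[:, None], y_tr * root_w, rcond=None)
        preds[i] = coef[0] + coef[1] * x_val
    return preds


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def holdout_rmse(n=400, heterogeneity=4.0, bandwidth=0.15, seed=0):
    """Return (global RMSE, local RMSE), both measured on an independent hold-out half."""
    rng = np.random.default_rng(seed)
    coords, x, y = simulate(n, heterogeneity, rng)
    half = n // 2
    c_tr, x_tr, y_tr = coords[:half], x[:half], y[:half]
    c_te, x_te, y_te = coords[half:], x[half:], y[half:]
    pred_global = fit_predict_global(x_tr, y_tr, x_te)
    pred_local = fit_predict_local(c_tr, x_tr, y_tr, c_te, x_te, bandwidth)
    return rmse(y_te, pred_global), rmse(y_te, pred_local)


if __name__ == "__main__":
    for h in (0.0, 4.0):
        rmse_global, rmse_local = holdout_rmse(heterogeneity=h)
        print(f"heterogeneity={h}: global={rmse_global:.3f}, local={rmse_local:.3f}")
```

Varying `heterogeneity`, `n`, and `bandwidth` in this toy setting gives a rough feel for the trade-off the abstract describes. Scoring on the hold-out half rather than on the training points matters because a local model with a small bandwidth can fit its training data closely without generalizing.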

BibTeX - Entry

@InProceedings{wiedemann_et_al:LIPIcs.GIScience.2023.11,
  author =	{Wiedemann, Nina and Martin, Henry and Westerholt, Ren\'{e}},
  title =	{{Benchmarking Regression Models Under Spatial Heterogeneity}},
  booktitle =	{12th International Conference on Geographic Information Science (GIScience 2023)},
  pages =	{11:1--11:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-288-4},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{277},
  editor =	{Beecham, Roger and Long, Jed A. and Smith, Dianna and Zhao, Qunshan and Wise, Sarah},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2023/18906},
  URN =		{urn:nbn:de:0030-drops-189064},
  doi =		{10.4230/LIPIcs.GIScience.2023.11},
  annote =	{Keywords: spatial machine learning, spatial non-stationarity, Geographically Weighted Regression, local models, geostatistics}
}

Keywords: spatial machine learning, spatial non-stationarity, Geographically Weighted Regression, local models, geostatistics
Collection: 12th International Conference on Geographic Information Science (GIScience 2023)
Issue Date: 2023
Date of publication: 07.09.2023
Supplementary Material: Software (Source Code): https://github.com/mie-lab/spatial_rf_python archived at: https://archive.softwareheritage.org/swh:1:dir:f7fc9b237a4e80f882b1785718ab884c2770ac66

