Cross-validation for Geospatial Data: Estimating Generalization Performance in Geostatistical Problems

Wang, Jing; Hopkins, Laurel; Hallman, Tyler A; Robinson, W Douglas; Hutchinson, Rebecca A

Geostatistical learning problems are frequently characterized by spatial autocorrelation in the input features and/or the potential for covariate shift at test time. These realities violate the classical assumption of independent, identically distributed data, upon which most cross-validation algorithms rely in order to estimate the generalization performance of a model. In this paper, we present a theoretical criterion for unbiased cross-validation estimators in the geospatial setting. We also introduce a new cross-validation algorithm to evaluate models, inspired by the challenges of geospatial problems. We apply a framework for categorizing problems into different types of geospatial scenarios to help practitioners select an appropriate cross-validation strategy. Our empirical analyses compare cross-validation algorithms on both simulated and several real datasets to develop recommendations for a variety of geospatial settings. This paper aims to draw attention to some challenges that arise in model evaluation for geospatial problems and to provide guidance for users.

Award ID(s): 2046678

PAR ID: 10514920

Author(s) / Creator(s): Wang, Jing; Hopkins, Laurel; Hallman, Tyler A; Robinson, W Douglas; Hutchinson, Rebecca A

Publisher / Repository: Transactions on Machine Learning Research

Date Published: 2023-10-01

Journal Name: Transactions on Machine Learning Research

ISSN: 2835-8856

Sponsoring Org: National Science Foundation

