Title: Land-surface parameters for spatial predictive mapping and modeling
Land-surface parameters derived from digital land surface models (DLSMs) (for example, slope, surface curvature, topographic position, topographic roughness, aspect, heat load index, and topographic moisture index) can serve as key predictor variables in a wide variety of mapping and modeling tasks relating to geomorphic processes, landform delineation, ecological and habitat characterization, and geohazard, soil, wetland, and general thematic mapping and modeling. However, selecting features from the large number of potential derivatives that may be predictive for a specific feature or process can be complicated, and existing literature may offer contradictory or incomplete guidance. The availability of multiple data sources and the need to define moving window shapes, sizes, and cell weightings further complicate selecting and optimizing the feature space. This review focuses on the calculation and use of DLSM parameters for empirical spatial predictive modeling applications, which rely on training data and explanatory variables to make predictions of landscape features and processes over a defined geographic extent. The target audience for this review is researchers and analysts undertaking predictive modeling tasks that make use of the most widely used terrain variables. To outline best practices and highlight future research needs, we review a range of land-surface parameters relating to steepness, local relief, rugosity, slope orientation, solar insolation, and moisture and characterize their relationship to geomorphic processes. We then discuss important considerations when selecting such parameters for predictive mapping and modeling tasks to assist analysts in answering two critical questions: What landscape conditions or processes does a given measure characterize? How might a particular metric relate to the phenomenon or features being mapped, modeled, or studied? 
We recommend the use of landscape- and problem-specific pilot studies to answer, to the extent possible, these questions for potential features of interest in a mapping or modeling task. We describe existing techniques to reduce the size of the feature space using feature selection and feature reduction methods, assess the importance or contribution of specific metrics, and parameterize moving windows or characterize the landscape at varying scales using alternative methods while highlighting strengths, drawbacks, and knowledge gaps for specific techniques. Recent developments, such as explainable machine learning and convolutional neural network (CNN)-based deep learning, may guide and/or minimize the need for feature space engineering and ease the use of DLSMs in predictive modeling tasks.
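As a minimal illustration of how a land-surface parameter such as slope is derived from gridded elevations, the sketch below estimates slope in degrees with central finite differences over a 3 x 3 neighborhood. It is only one simple choice among many and is not necessarily the formulation discussed in the review; the function name `slope_degrees`, the `cell_size` parameter, and the tiny DEM are hypothetical.

```python
import math

def slope_degrees(dem, cell_size):
    """Slope in degrees for interior cells, using central finite differences.

    dem is a list of equal-length rows of elevations (same units as cell_size).
    Edge cells lack a full neighborhood and are left as None.
    """
    rows, cols = len(dem), len(dem[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Rate of elevation change in the column (x) and row (y) directions.
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            # Slope is the arctangent of the gradient magnitude.
            out[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return out

# Hypothetical DEM: a plane rising 1 m per 1 m cell toward the east.
dem = [[0.0, 1.0, 2.0],
       [0.0, 1.0, 2.0],
       [0.0, 1.0, 2.0]]
slope = slope_degrees(dem, cell_size=1.0)
```

Changing `cell_size` or the neighborhood used for the differences changes the scale at which steepness is characterized, which is the moving-window consideration the abstract raises.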
Journal Name:
Earth-Science Reviews
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Many studies of Earth surface processes and landscape evolution rely on having accurate and extensive data sets of surficial geologic units and landforms. Automated extraction of geomorphic features using deep learning provides an objective way to consistently map landforms over large spatial extents. However, there is no consensus on the optimal input feature space for such analyses. We explore the impact of input feature space for extracting geomorphic features from land surface parameters (LSPs) derived from digital terrain models (DTMs) using convolutional neural network (CNN)‐based semantic segmentation deep learning. We compare four input feature space configurations: (a) a three‐layer composite consisting of a topographic position index (TPI) calculated using a 50 m radius circular window, square root of topographic slope, and TPI calculated using an annulus with a 2 m inner radius and 10 m outer radius, (b) a single illuminating position hillshade, (c) a multidirectional hillshade, and (d) a slopeshade. We test each feature space input using three deep learning algorithms and four use cases: two with natural features and two with anthropogenic features. The three‐layer composite generally provided lower overall losses for the training samples, a higher F1‐score for the withheld validation data, and better performance for generalizing to withheld testing data from a new geographic extent. Results suggest that CNN‐based deep learning for mapping geomorphic features or landforms from LSPs is sensitive to input feature space. Given the large number of LSPs that can be derived from DTM data and the variety of geomorphic mapping tasks that can be undertaken using CNN‐based methods, we argue that additional research focused on feature space considerations is needed and suggest future research directions. 
We also suggest that the three‐layer composite implemented here can offer better performance in comparison to using hillshades or other common terrain visualization surfaces and is, thus, worth considering for different mapping and feature extraction tasks.
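The topographic position index (TPI) layers mentioned above compare each cell's elevation with the mean elevation of its neighbors inside a circular or annular window. A minimal sketch follows, assuming that the center cell is excluded from the mean and that windows are truncated at the grid edge; both are assumptions, as are the function name `tpi`, its parameters, and the toy grid. The study itself reports a 50 m radius circle and a 2 m to 10 m annulus; the radii here are much smaller purely for illustration.

```python
import math

def tpi(dem, cell_size, outer_radius, inner_radius=0.0):
    """TPI = center elevation minus mean elevation of neighbors whose
    distance d from the center satisfies inner_radius <= d <= outer_radius.

    inner_radius = 0 gives a circular window; inner_radius > 0 gives an annulus.
    """
    rows, cols = len(dem), len(dem[0])
    reach = int(outer_radius // cell_size)
    # Precompute the (row, col) offsets that fall inside the window.
    offsets = []
    for dr in range(-reach, reach + 1):
        for dc in range(-reach, reach + 1):
            d = math.hypot(dr, dc) * cell_size
            if (dr, dc) != (0, 0) and inner_radius <= d <= outer_radius:
                offsets.append((dr, dc))
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [dem[r + dr][c + dc] for dr, dc in offsets
                    if 0 <= r + dr < rows and 0 <= c + dc < cols]
            if vals:
                out[r][c] = dem[r][c] - sum(vals) / len(vals)
    return out

# Hypothetical 5 x 5 flat surface at 10 m with a 4 m bump in the middle.
dem = [[10.0] * 5 for _ in range(5)]
dem[2][2] = 14.0
tpi_circle = tpi(dem, cell_size=1.0, outer_radius=1.0)
tpi_annulus = tpi(dem, cell_size=1.0, outer_radius=2.0, inner_radius=2.0)
```

Positive values mark cells higher than their surroundings (the bump), and negative values mark cells lower than their surroundings (the cells next to it).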

  2. Machine learning (ML) methods, such as artificial neural networks (ANN), k-nearest neighbors (kNN), random forests (RF), support vector machines (SVM), and boosted decision trees (DTs), may offer stronger predictive performance than more traditional, parametric methods, such as linear regression, multiple linear regression, and logistic regression (LR), for specific mapping and modeling tasks. However, this increased performance is often accompanied by increased model complexity and decreased interpretability, resulting in critiques of their “black box” nature, which highlights the need for algorithms that can offer both strong predictive performance and interpretability. This is especially true when the global model and predictions for specific data points need to be explainable in order for the model to be of use. Explainable boosting machines (EBMs), an augmentation and refinement of generalized additive models (GAMs), have been proposed as an empirical modeling method that offers both interpretable results and strong predictive performance. The trained model can be graphically summarized as a set of functions relating each predictor variable to the dependent variable along with heat maps representing interactions between selected pairs of predictor variables. In this study, we assess EBMs for predicting the likelihood or probability of slope failure occurrence based on digital terrain characteristics in four separate Major Land Resource Areas (MLRAs) in the state of West Virginia, USA and compare the results to those obtained with LR, kNN, RF, and SVM. EBM provided predictive accuracies comparable to RF and SVM and better than LR and kNN. 
Several outputs add interpretability: the functions and visualizations generated for each predictor variable and for the included pairwise interactions, the estimates of variable importance based on average mean absolute scores, and the per-variable scores provided for each new prediction. However, additional work is needed to quantify how these outputs may be impacted by variable correlation, inclusion of interaction terms, and large feature spaces. Further exploration of EBM is merited for geohazard mapping and modeling in particular and spatial predictive mapping and modeling in general, especially when the value or use of the resulting predictions would be greatly enhanced by improved interpretability globally and availability of prediction explanations at each cell or aggregating unit within the mapped or modeled extent. 
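The additive structure described in this abstract, with one function per predictor plus selected pairwise interactions, is commonly written in the following general form. The notation is an assumption for illustration and is not taken from the study: $g$ is a link function (the logit for a binary outcome such as slope failure occurrence), $f_j$ is the learned function for predictor $x_j$, and $S$ is the set of included predictor pairs.

```latex
g\bigl(\mathbb{E}[y]\bigr) \;=\; \beta_0 \;+\; \sum_{j} f_j(x_j) \;+\; \sum_{(i,j) \in S} f_{ij}(x_i, x_j)
```

Because the prediction is a sum of terms, each $f_j$ can be plotted on its own and each $f_{ij}$ shown as a heat map, and the value of each term for a given cell serves as that predictor's contribution to the prediction.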
  3. Abstract. Many scientists have begun to refer to the earth surface environment from the upper canopy to the depths of bedrock as the critical zone (CZ). Identification of the CZ as an integral object worthy of study implicitly posits that the study of the whole earth surface will provide benefits that do not arise when studying the individual parts. To study the CZ, however, requires prioritizing among the measurements that can be made – and we do not generally agree on the priorities. Currently, the Susquehanna Shale Hills Critical Zone Observatory (SSHCZO) is expanding from a small original focus area (0.08 km², Shale Hills catchment), to a larger watershed (164 km², Shavers Creek watershed) and is grappling with the prioritization. This effort is an expansion from a monolithologic first-order forested catchment to a watershed that encompasses several lithologies (shale, sandstone, limestone) and land use types (forest, agriculture). The goal of the project remains the same: to understand water, energy, gas, solute, and sediment (WEGSS) fluxes that are occurring today in the context of the record of those fluxes over geologic time as recorded in soil profiles, the sedimentary record, and landscape morphology.

    Given the small size of the Shale Hills catchment, the original design incorporated measurement of as many parameters as possible at high temporal and spatial density. In the larger Shavers Creek watershed, however, we must focus the measurements. We describe a strategy of data collection and modeling based on a geomorphological and land use framework that builds on the hillslope as the basic unit. Interpolation and extrapolation beyond specific sites relies on geophysical surveying, remote sensing, geomorphic analysis, the study of natural integrators such as streams, groundwaters or air, and application of a suite of CZ models. We hypothesize that measurements of a few important variables at strategic locations within a geomorphological framework will allow development of predictive models of CZ behavior. In turn, the measurements and models will reveal how the larger watershed will respond to perturbations both now and into the future.

  4. Contemporary climate change in Alaska has resulted in amplified rates of press and pulse disturbances that drive ecosystem change with significant consequences for socio‐environmental systems. Despite the vulnerability of Arctic and boreal landscapes to change, little has been done to characterize landscape change and associated drivers across northern high‐latitude ecosystems. Here we characterize the historical sensitivity of Alaska's ecosystems to environmental change and anthropogenic disturbances using expert knowledge, remote sensing data, and spatiotemporal analyses and modeling. Time‐series analysis of moderate‐ and high‐resolution imagery was used to characterize land‐ and water‐surface dynamics across Alaska. Some 430,000 interpretations of ecological and geomorphological change were made using historical air photos and satellite imagery, and corroborate land‐surface greening, browning, and wetness/moisture trend parameters derived from peak‐growing season Landsat imagery acquired from 1984 to 2015. The time series of change metrics, together with climatic data and maps of landscape characteristics, were incorporated into a modeling framework for mapping and understanding of drivers of change throughout Alaska. According to our analysis, approximately 13% (~174,000 ± 8700 km²) of Alaska has experienced directional change in the last 32 years (±95% confidence intervals). At the ecoregion level, substantial increases in remotely sensed vegetation productivity were most pronounced in western and northern foothills of Alaska, which is explained by vegetation growth associated with increasing air temperatures. Significant browning trends were largely the result of recent wildfires in interior Alaska, but browning trends are also driven by increases in evaporative demand and surface‐water gains that have predominately occurred over warming permafrost landscapes. 
Increased rates of photosynthetic activity are associated with stabilization and recovery processes following wildfire, timber harvesting, insect damage, thermokarst, glacial retreat, and lake infilling and drainage events. Our results fill a critical gap in the understanding of historical and potential future trajectories of change in northern high‐latitude regions. 
  5.
    Abstract. Land models are essential tools for understanding and predicting terrestrial processes and climate–carbon feedbacks in the Earth system, but uncertainties in their future projections are poorly understood. Improvements in physical process realism and the representation of human influence arguably make models more comparable to reality but also increase the degrees of freedom in model configuration, leading to increased parametric uncertainty in projections. In this work we design and implement a machine learning approach to globally calibrate a subset of the parameters of the Community Land Model, version 5 (CLM5) to observations of carbon and water fluxes. We focus on parameters controlling biophysical features such as surface energy balance, hydrology, and carbon uptake. We first use parameter sensitivity simulations and a combination of objective metrics including ranked global mean sensitivity to multiple output variables and non-overlapping spatial pattern responses between parameters to narrow the parameter space and determine a subset of important CLM5 biophysical parameters for further analysis. Using a perturbed parameter ensemble, we then train a series of artificial feed-forward neural networks to emulate CLM5 output given parameter values as input. We use annual mean globally aggregated spatial variability in carbon and water fluxes as our emulation and calibration targets. Validation and out-of-sample tests are used to assess the predictive skill of the networks, and we utilize permutation feature importance and partial dependence methods to better interpret the results. The trained networks are then used to estimate global optimal parameter values with greater computational efficiency than achieved by hand tuning efforts and increased spatial scale relative to previous studies optimizing at a single site. 
By developing this methodology, our framework can help quantify the contribution of parameter uncertainty to overall uncertainty in land model projections. 
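Permutation feature importance, one of the interpretation methods named in this abstract, can be sketched as follows: shuffle one input column at a time and record how much the model's error increases. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `mse`, `permutation_importance`, and `toy_model` are hypothetical, and the toy model is a stand-in for a trained emulator that uses only its first input.

```python
import random

def mse(model, X, y):
    """Mean squared error of model predictions over rows of X."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=5, seed=0):
    """Average increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving the other columns intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical stand-in for a trained model: depends on feature 0 only.
def toy_model(row):
    return 2.0 * row[0]

X = [[float(i), float(i % 3)] for i in range(8)]
y = [toy_model(row) for row in X]
importances = permutation_importance(toy_model, X, y)
```

In this toy setup, shuffling the unused second feature leaves the error unchanged, so its importance is zero, while shuffling the first feature raises the error.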