Title: Exploring the Influence of Input Feature Space on CNN‐Based Geomorphic Feature Extraction From Digital Terrain Data
Abstract: Many studies of Earth surface processes and landscape evolution rely on having accurate and extensive data sets of surficial geologic units and landforms. Automated extraction of geomorphic features using deep learning provides an objective way to consistently map landforms over large spatial extents. However, there is no consensus on the optimal input feature space for such analyses. We explore the impact of input feature space for extracting geomorphic features from land surface parameters (LSPs) derived from digital terrain models (DTMs) using convolutional neural network (CNN)‐based semantic segmentation deep learning. We compare four input feature space configurations: (a) a three‐layer composite consisting of a topographic position index (TPI) calculated using a 50 m radius circular window, square root of topographic slope, and TPI calculated using an annulus with a 2 m inner radius and 10 m outer radius, (b) a single illuminating position hillshade, (c) a multidirectional hillshade, and (d) a slopeshade. We test each feature space input using three deep learning algorithms and four use cases: two with natural features and two with anthropogenic features. The three‐layer composite generally provided lower overall losses for the training samples, a higher F1‐score for the withheld validation data, and better performance for generalizing to withheld testing data from a new geographic extent. Results suggest that CNN‐based deep learning for mapping geomorphic features or landforms from LSPs is sensitive to input feature space. Given the large number of LSPs that can be derived from DTM data and the variety of geomorphic mapping tasks that can be undertaken using CNN‐based methods, we argue that additional research focused on feature space considerations is needed and suggest future research directions. We also suggest that the three‐layer composite implemented here can offer better performance in comparison to using hillshades or other common terrain visualization surfaces and is, thus, worth considering for different mapping and feature extraction tasks.
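As a rough illustration of the three-layer composite described in the abstract, the sketch below builds the stack from a DTM array with NumPy/SciPy: a TPI using a 50 m circular window, the square root of slope, and a TPI using a 2-10 m annulus. The kernel construction, slope units (degrees), and the min-max rescaling are assumptions made here for illustration, and the function names are hypothetical; the authors' exact preprocessing is described in the paper itself.

```python
import numpy as np
from scipy import ndimage

def ring_kernel(inner_m, outer_m, cell_size):
    """Normalized averaging kernel over an annulus (a filled circle when inner_m == 0)."""
    r = int(np.ceil(outer_m / cell_size))
    y, x = np.ogrid[-r:r + 1, -r:r + 1]
    dist = np.hypot(x, y) * cell_size
    mask = (dist >= inner_m) & (dist <= outer_m)
    return mask.astype(float) / mask.sum()

def tpi(dem, inner_m, outer_m, cell_size):
    """Topographic position index: elevation minus mean elevation of the neighborhood."""
    return dem - ndimage.convolve(dem, ring_kernel(inner_m, outer_m, cell_size), mode="nearest")

def rescale(band):
    """Min-max rescale a band to [0, 1] (the rescaling the authors used is not stated here)."""
    return (band - band.min()) / (band.max() - band.min() + 1e-12)

def three_layer_composite(dem, cell_size):
    """Stack the three LSP bands described in the abstract into a (3, rows, cols) array."""
    dzdy, dzdx = np.gradient(dem, cell_size)
    slope_deg = np.degrees(np.arctan(np.hypot(dzdx, dzdy)))
    bands = [
        tpi(dem, 0.0, 50.0, cell_size),  # TPI, 50 m radius circular window
        np.sqrt(slope_deg),              # square root of topographic slope (degrees assumed)
        tpi(dem, 2.0, 10.0, cell_size),  # TPI, 2 m inner / 10 m outer annulus
    ]
    return np.stack([rescale(b) for b in bands], axis=0)

# Example with a synthetic 1 m DTM
dem = np.random.default_rng(0).normal(size=(200, 200)).cumsum(axis=0)
composite = three_layer_composite(dem, cell_size=1.0)
print(composite.shape)  # (3, 200, 200)
```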
Award ID(s):
2046059
PAR ID:
10420320
Author(s) / Creator(s):
Publisher / Repository:
DOI PREFIX: 10.1029
Date Published:
Journal Name:
Earth and Space Science
Volume:
10
Issue:
5
ISSN:
2333-5084
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Semantic segmentation algorithms, such as UNet, that rely on convolutional neural network (CNN)-based architectures, due to their ability to capture local textures and spatial context, have shown promise for anthropogenic geomorphic feature extraction when using land surface parameters (LSPs) derived from digital terrain models (DTMs) as input predictor variables. However, the operationalization of these supervised classification methods is limited by a lack of large volumes of quality training data. This study explores the use of transfer learning, where information learned from another, and often much larger, dataset is used to potentially reduce the need for a large, problem-specific training dataset. Two anthropogenic geomorphic feature extraction problems are explored: the extraction of agricultural terraces and the mapping of surface coal mine reclamation-related valley fill faces. Light detection and ranging (LiDAR)-derived DTMs were used to generate LSPs. We developed custom transfer parameters by attempting to predict geomorphon-based landforms using a large dataset of digital terrain data provided by the United States Geological Survey’s 3D Elevation Program (3DEP). We also explored the use of pre-trained ImageNet parameters and initializing models using parameters learned from the other mapping task investigated. The geomorphon-based transfer learning resulted in the poorest performance while the ImageNet-based parameters generally improved performance in comparison to a random parameter initialization, even when the encoder was frozen or not trained. Transfer learning between the different geomorphic datasets offered minimal benefits. We suggest that pre-trained models developed using large, image-based datasets may be of value for anthropogenic geomorphic feature extraction from LSPs even given the data and task disparities. More specifically, ImageNet-based parameters should be considered as an initialization state for the encoder component of semantic segmentation architectures applied to anthropogenic geomorphic feature extraction even when using non-RGB image-based predictor variables, such as LSPs. The value of transfer learning between the different geomorphic mapping tasks may have been limited due to smaller sample sizes, which highlights the need for continued research in using unsupervised and semi-supervised learning methods, especially given the large volume of digital terrain data available, despite the lack of associated labels. 
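As a minimal sketch of the encoder-initialization idea in the record above, the snippet below builds a toy segmentation model whose encoder is a torchvision ResNet-34 loaded with ImageNet weights and optionally frozen. This is not the UNet architecture used in that study: the decoder here is just a 1x1 convolution plus bilinear upsampling, and the class name, channel count, and chip size are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet34, ResNet34_Weights

class LSPSegmenter(nn.Module):
    """Toy encoder-decoder: ImageNet-pretrained ResNet-34 encoder + a 1x1 conv head."""
    def __init__(self, n_classes=2, freeze_encoder=True):
        super().__init__()
        backbone = resnet34(weights=ResNet34_Weights.IMAGENET1K_V1)
        # Keep everything up to the last residual stage (drops avgpool and the fc classifier).
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        if freeze_encoder:  # "frozen or not trained" encoder, as discussed in the abstract
            for p in self.encoder.parameters():
                p.requires_grad = False
        self.head = nn.Conv2d(512, n_classes, kernel_size=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = self.encoder(x)   # (N, 512, h/32, w/32)
        logits = self.head(feats)
        return F.interpolate(logits, size=(h, w), mode="bilinear", align_corners=False)

model = LSPSegmenter()
chips = torch.rand(4, 3, 256, 256)  # three-band LSP composite scaled to [0, 1]
print(model(chips).shape)           # torch.Size([4, 2, 256, 256])
```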
  2. Land-surface parameters derived from digital land surface models (DLSMs) (for example, slope, surface curvature, topographic position, topographic roughness, aspect, heat load index, and topographic moisture index) can serve as key predictor variables in a wide variety of mapping and modeling tasks relating to geomorphic processes, landform delineation, ecological and habitat characterization, and geohazard, soil, wetland, and general thematic mapping and modeling. However, selecting features from the large number of potential derivatives that may be predictive for a specific feature or process can be complicated, and existing literature may offer contradictory or incomplete guidance. The availability of multiple data sources and the need to define moving window shapes, sizes, and cell weightings further complicate selecting and optimizing the feature space. This review focuses on the calculation and use of DLSM parameters for empirical spatial predictive modeling applications, which rely on training data and explanatory variables to make predictions of landscape features and processes over a defined geographic extent. The target audience for this review is researchers and analysts undertaking predictive modeling tasks that make use of the most widely used terrain variables. To outline best practices and highlight future research needs, we review a range of land-surface parameters relating to steepness, local relief, rugosity, slope orientation, solar insolation, and moisture and characterize their relationship to geomorphic processes. We then discuss important considerations when selecting such parameters for predictive mapping and modeling tasks to assist analysts in answering two critical questions: What landscape conditions or processes does a given measure characterize? How might a particular metric relate to the phenomenon or features being mapped, modeled, or studied? We recommend the use of landscape- and problem-specific pilot studies to answer, to the extent possible, these questions for potential features of interest in a mapping or modeling task. We describe existing techniques to reduce the size of the feature space using feature selection and feature reduction methods, assess the importance or contribution of specific metrics, and parameterize moving windows or characterize the landscape at varying scales using alternative methods while highlighting strengths, drawbacks, and knowledge gaps for specific techniques. Recent developments, such as explainable machine learning and convolutional neural network (CNN)-based deep learning, may guide and/or minimize the need for feature space engineering and ease the use of DLSMs in predictive modeling tasks. 
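To make the land-surface parameters discussed in the record above concrete, the sketch below derives a few of them (slope, a Laplacian curvature proxy, and square-window TPI and roughness at two scales) from a DEM array with NumPy/SciPy. The window sizes, the curvature approximation, and the synthetic DEM are illustrative assumptions; production workflows typically use GIS or terrain-analysis packages with more careful window shapes and edge handling.

```python
import numpy as np
from scipy import ndimage

def land_surface_parameters(dem, cell_size=1.0, windows=(5, 21)):
    """A few common LSPs from a DEM array; window sizes are in cells and purely illustrative."""
    dzdy, dzdx = np.gradient(dem, cell_size)
    lsp = {"slope_deg": np.degrees(np.arctan(np.hypot(dzdx, dzdy)))}
    # Laplacian as a simple (sign-convention-dependent) proxy for surface curvature.
    lsp["curvature"] = ndimage.laplace(dem) / cell_size**2
    for w in windows:  # same parameter, different moving-window scales
        mean = ndimage.uniform_filter(dem, size=w)
        sq_mean = ndimage.uniform_filter(dem**2, size=w)
        lsp[f"tpi_{w}"] = dem - mean                                         # topographic position
        lsp[f"roughness_{w}"] = np.sqrt(np.maximum(sq_mean - mean**2, 0.0))  # local std. dev.
    return lsp

dem = np.random.default_rng(1).normal(size=(300, 300)).cumsum(axis=1)
params = land_surface_parameters(dem, cell_size=1.0)
print(sorted(params))  # ['curvature', 'roughness_21', 'roughness_5', 'slope_deg', 'tpi_21', 'tpi_5']
```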
  3. Accurate maps of regional surface water features are integral for advancing ecologic, atmospheric and land development studies. The only comprehensive surface water feature map of Alaska is the National Hydrography Dataset (NHD). NHD features are often digitized representations of historic topographic map blue lines and may be outdated. Here we test deep learning methods to automatically extract surface water features from airborne interferometric synthetic aperture radar (IfSAR) data to update and validate Alaska hydrographic databases. U-net artificial neural networks (ANN) and high-performance computing (HPC) are used for supervised hydrographic feature extraction within a study area comprised of 50 contiguous watersheds in Alaska. Surface water features derived from elevation through automated flow-routing and manual editing are used as training data. Model extensibility is tested with a series of 16 U-net models trained with increasing percentages of the study area, from about 3 to 35 percent. Hydrography is predicted by each of the models for all watersheds not used in training. Input raster layers are derived from digital terrain models, digital surface models, and intensity images from the IfSAR data. Results indicate about 15 percent of the study area is required to optimally train the ANN to extract hydrography when F1-scores for tested watersheds average between 66 and 68. Little benefit is gained by training beyond 15 percent of the study area. Fully connected hydrographic networks are generated for the U-net predictions using a novel approach that constrains a D-8 flow-routing approach to follow U-net predictions. This work demonstrates the ability of deep learning to derive surface water feature maps from complex terrain over a broad area.
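The record above constrains D-8 flow routing to follow U-net predictions; that constraint is specific to the paper and is not reproduced here. The snippet below is only an unoptimized sketch of the standard D-8 step, assigning each interior cell the index of its steepest-descent neighbor.

```python
import numpy as np

# D-8 neighbor offsets (row, col) and their distances in cell units.
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
DISTS = [1.0, 2**0.5, 1.0, 2**0.5, 1.0, 2**0.5, 1.0, 2**0.5]

def d8_flow_direction(dem):
    """Index (0-7) of the steepest-descent neighbor per interior cell; -1 where no neighbor is lower."""
    direction = np.full(dem.shape, -1, dtype=int)
    rows, cols = dem.shape
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            best_drop, best_k = 0.0, -1
            for k, ((dr, dc), dist) in enumerate(zip(OFFSETS, DISTS)):
                drop = (dem[r, c] - dem[r + dr, c + dc]) / dist  # elevation drop per unit distance
                if drop > best_drop:
                    best_drop, best_k = drop, k
            direction[r, c] = best_k
    return direction

dem = np.add.outer(np.arange(6), np.arange(6)).astype(float)  # tilted plane rising to the southeast
print(d8_flow_direction(dem)[1:-1, 1:-1])  # interior cells all drain toward the northwest (index 7)
```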
  4. An efficient feature selection method can significantly boost results in classification problems. Despite ongoing improvement, hand-designed methods often fail to extract features that capture high- and mid-level representations effectively. Recent developments in machine learning (deep learning) have improved upon these hand-designed methods by extracting features automatically. Specifically, Convolutional Neural Networks (CNNs) are a highly successful technique for image classification that automatically extracts and learns features as part of classification. The purpose of this study is to detect hydraulic structures (i.e., bridges and culverts) that are important to overland flow modeling and environmental applications. The dataset used in this work is relatively small and is derived from 1-m LiDAR-derived Digital Elevation Models (DEMs) and National Agriculture Imagery Program (NAIP) aerial imagery. The classes for the experiment consist of two groups: samples with a bridge/culvert present are labeled "True", and those without are labeled "False". In this paper, we use advanced CNN techniques, including Siamese Neural Networks (SNNs), Capsule Networks (CapsNets), and Graph Convolutional Networks (GCNs), to classify samples with similar topographic and spectral characteristics, an objective that is challenging for traditional machine learning techniques such as Support Vector Machines (SVMs), Gaussian Classifiers (GCs), and Gaussian Mixture Models (GMMs). The advanced CNN-based approaches, combined with data pre-processing techniques (e.g., data augmentation), produced superior results. These approaches provide efficient, cost-effective, and innovative solutions for identifying hydraulic structures.
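The exact Siamese, capsule, and graph-convolutional architectures used in the record above are not given here; as a hedged sketch of the Siamese idea only, the snippet below defines a small shared CNN branch and a contrastive loss in PyTorch. The channel count (e.g., stacked DEM-derived and NAIP bands), chip size, and embedding dimension are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseBranch(nn.Module):
    """Small shared CNN that maps an image chip to an embedding vector."""
    def __init__(self, in_channels=4, embed_dim=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, embed_dim)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

def contrastive_loss(z1, z2, same_label, margin=1.0):
    """Pull same-class pairs together, push different-class pairs apart by at least the margin."""
    d = F.pairwise_distance(z1, z2)
    return torch.mean(same_label * d**2 + (1 - same_label) * F.relu(margin - d)**2)

branch = SiameseBranch()
a, b = torch.rand(8, 4, 64, 64), torch.rand(8, 4, 64, 64)   # placeholder chip pairs
loss = contrastive_loss(branch(a), branch(b), torch.randint(0, 2, (8,)).float())
print(loss.item())
```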
  5. Most deep learning (DL) methods that are not end-to-end use several multi-scale and multi-type hand-crafted features that make the network challenging to train, more computationally intensive, and vulnerable to overfitting. Furthermore, reliance on empirically based feature dimensionality reduction may lead to misclassification. In contrast, efficient feature management can reduce storage and computational complexity, build better classifiers, and improve overall performance. Principal Component Analysis (PCA) is a well-known dimension reduction technique that has been used for feature extraction. This paper presents a two-step PCA-based feature extraction algorithm that employs a variant of feature-based PointNet (Qi et al., 2017a) for point cloud classification. The paper extends the PointNet framework for use on large-scale aerial LiDAR data and contributes by (i) developing a new feature extraction algorithm, (ii) exploring the impact of dimensionality reduction in feature extraction, and (iii) introducing a non-end-to-end PointNet variant for per-point classification in point clouds. This is demonstrated on aerial laser scanning (ALS) point clouds. The algorithm successfully reduces the dimension of the feature space without sacrificing performance, as benchmarked against the original PointNet algorithm. When tested on the well-known Vaihingen data set, the proposed algorithm achieves an Overall Accuracy (OA) of 74.64% using 9 input vectors and 14 shape features, whereas with the same 9 input vectors and only 5 PCs (principal components built from the 14 shape features) it achieves a higher OA of 75.36%, which demonstrates the effect of efficient dimensionality reduction.
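A minimal sketch of the dimensionality-reduction step described in the record above, using scikit-learn PCA to compress 14 per-point shape features to 5 principal components. The arrays are random stand-ins, and concatenating the PCs with the 9 input vectors is only an assumed way of assembling the per-point feature vector; the actual features and their integration with the PointNet variant are defined in the paper.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_points = 50_000
shape_features = rng.normal(size=(n_points, 14))  # stand-in for 14 per-point shape features
input_vectors = rng.normal(size=(n_points, 9))    # stand-in for the 9 per-point input vectors

pca = PCA(n_components=5)
pcs = pca.fit_transform(shape_features)           # (n_points, 5) principal components

# Assumed assembly for illustration: 9 input vectors + 5 PCs per point.
point_features = np.hstack([input_vectors, pcs])
print(point_features.shape, round(pca.explained_variance_ratio_.sum(), 3))
```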