Title: Spatial Ensemble Learning for Heterogeneous Geographic Data with Class Ambiguity: A Summary of Results
Class ambiguity refers to the phenomenon whereby samples with similar features belong to different classes at different locations. Given heterogeneous geographic data with class ambiguity, the spatial ensemble learning (SEL) problem aims to find a decomposition of the geographic area into disjoint zones such that class ambiguity is minimized and a local classifier can be learned in each zone. The SEL problem is important for applications such as land cover mapping from heterogeneous earth observation data with spectral confusion. However, the problem is challenging due to its high computational cost (finding an optimal zone partition is NP-hard). Related work in ensemble learning either assumes an identical sample distribution (e.g., bagging, boosting, random forest) or decomposes multi-modular input data in the feature vector space (e.g., mixture of experts, multimodal ensemble), and thus cannot effectively minimize class ambiguity. In contrast, our spatial ensemble framework explicitly partitions input data in geographic space. Our approach first preprocesses data into homogeneous spatial patches and then uses a greedy heuristic to allocate pairs of patches with high class ambiguity into different zones. Both theoretical analysis and experimental evaluations on two real-world wetland mapping datasets show the feasibility of the proposed approach.
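The greedy allocation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the ambiguity score, the restriction to two zones, the omission of any spatial contiguity constraint, and all names (`pair_ambiguity`, `greedy_two_zones`, the patch dictionary layout) are assumptions made purely for illustration. Each patch is summarized here by a mean feature vector and a majority class label, and a pair of patches counts as ambiguous when the features are close but the labels differ.

```python
from itertools import combinations
import math


def pair_ambiguity(p, q):
    """Hypothetical score: high when features are similar but classes differ."""
    if p["label"] == q["label"]:
        return 0.0
    dist = math.dist(p["features"], q["features"])
    return 1.0 / (1.0 + dist)


def greedy_two_zones(patches):
    """Assign each patch to zone 0 or 1, separating the most ambiguous pairs first.

    Illustrative only: ignores spatial adjacency and never revisits a decision.
    """
    n = len(patches)
    # Score every pair once and visit pairs from most to least ambiguous.
    pairs = sorted(
        ((pair_ambiguity(patches[i], patches[j]), i, j)
         for i, j in combinations(range(n), 2)),
        reverse=True,
    )
    zone = {}
    for score, i, j in pairs:
        if score == 0.0:
            break  # remaining pairs share a label, so nothing to separate
        if i not in zone and j not in zone:
            zone[i], zone[j] = 0, 1
        elif i in zone and j not in zone:
            zone[j] = 1 - zone[i]
        elif j in zone and i not in zone:
            zone[i] = 1 - zone[j]
        # If both are already assigned, the greedy pass leaves them unchanged.
    # Patches never involved in an ambiguous pair default to zone 0.
    return [zone.get(i, 0) for i in range(n)]


# Toy input (made-up values): two spectrally similar pairs with conflicting labels.
patches = [
    {"features": (0.20, 0.50), "label": "wetland"},
    {"features": (0.21, 0.50), "label": "dryland"},
    {"features": (0.90, 0.10), "label": "dryland"},
    {"features": (0.88, 0.12), "label": "wetland"},
]
zones = greedy_two_zones(patches)
```

After the allocation, a separate local classifier would be trained on the patches of each zone. The sketch stops at the partition step because the abstract does not specify the classifier.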
Award ID(s):
1737633
PAR ID:
10072946
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
23rd International Conference on Advances in GIS, ACM SIGSPATIAL
Volume:
23
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Flood inundation mapping from Earth imagery plays a vital role in rapid disaster response and national water forecasting. However, the problem is non-trivial due to significant imagery noise and obstacles, complex spatial dependency on 3D terrains, spatial non-stationarity, and high computational cost. Existing machine learning approaches are mostly terrain-unaware and are prone to producing spurious results due to imagery noise and obstacles, requiring significant post-processing effort. Recently, several terrain-aware methods were proposed that incorporate complex spatial dependency (e.g., water flow directions on 3D terrains), but they assume that the inferred flood surface level is spatially stationary, making them insufficient for a large heterogeneous geographic area. To address these limitations, this paper proposes a novel spatial learning framework called hidden Markov forest, which decomposes a large heterogeneous area into local stationary zones, represents spatial dependency on 3D terrains via zonal trees (forest), and jointly infers the class map in different zonal trees with spatial regularization. We design efficient inference algorithms based on dynamic programming and multi-resolution filtering. Evaluations on real-world datasets show that our method outperforms baselines and our proposed computational refinement significantly reduces the time cost.
  2. Modeling fluid flow and transport in heterogeneous systems is often challenged by unknown parameters that vary in space. In inverse modeling, measurement data are used to estimate these parameters. Due to the spatial variability of these unknown parameters in heterogeneous systems (e.g., permeability or diffusivity), the inverse problem is ill-posed and infinitely many solutions are possible. Physics-informed neural networks (PINN) have become a popular approach for solving inverse problems. However, in inverse problems in heterogeneous systems, PINN can be sensitive to hyperparameters and can produce unrealistic patterns. Motivated by the concept of ensemble learning and variance reduction in machine learning, we propose an ensemble PINN (ePINN) approach where an ensemble of parallel neural networks is used and each sub-network is initialized with a meaningful pattern of the unknown parameter. Subsequently, these parallel networks provide a basis that is fed into a main neural network that is trained using PINN. It is shown that an appropriately selected set of patterns can guide PINN in producing more realistic results that are relevant to the problem of interest. To assess the accuracy of this approach, inverse transport problems involving unknown heat conductivity, porous media permeability, and velocity vector fields were studied. The proposed ePINN approach was shown to increase the accuracy in inverse problems and mitigate the challenges associated with non-uniqueness.
  3. Spatial variability is a prominent feature of various geographic phenomena such as climatic zones, USDA plant hardiness zones, and terrestrial habitat types (e.g., forest, grasslands, wetlands, and deserts). However, current deep learning methods follow a spatial-one-size-fits-all (OSFA) approach to train single deep neural network models that do not account for spatial variability. Quantification of spatial variability can be challenging due to the influence of many geophysical factors. In preliminary work, we proposed a spatial variability aware neural network (SVANN-I, formerly called SVANN) approach where weights are a function of location but the neural network architecture is location independent. In this work, we explore a more flexible SVANN-E approach where the neural network architecture varies across geographic locations. In addition, we provide a taxonomy of SVANN types and a physics-inspired interpretation model. Experiments with aerial-imagery-based wetland mapping show that SVANN-I outperforms OSFA and that SVANN-E performs best overall.
  4. Many studies of Earth surface processes and landscape evolution rely on having accurate and extensive data sets of surficial geologic units and landforms. Automated extraction of geomorphic features using deep learning provides an objective way to consistently map landforms over large spatial extents. However, there is no consensus on the optimal input feature space for such analyses. We explore the impact of input feature space for extracting geomorphic features from land surface parameters (LSPs) derived from digital terrain models (DTMs) using convolutional neural network (CNN)‐based semantic segmentation deep learning. We compare four input feature space configurations: (a) a three‐layer composite consisting of a topographic position index (TPI) calculated using a 50 m radius circular window, square root of topographic slope, and TPI calculated using an annulus with a 2 m inner radius and 10 m outer radius, (b) a single illuminating position hillshade, (c) a multidirectional hillshade, and (d) a slopeshade. We test each feature space input using three deep learning algorithms and four use cases: two with natural features and two with anthropogenic features. The three‐layer composite generally provided lower overall losses for the training samples, a higher F1‐score for the withheld validation data, and better performance for generalizing to withheld testing data from a new geographic extent. Results suggest that CNN‐based deep learning for mapping geomorphic features or landforms from LSPs is sensitive to input feature space. Given the large number of LSPs that can be derived from DTM data and the variety of geomorphic mapping tasks that can be undertaken using CNN‐based methods, we argue that additional research focused on feature space considerations is needed and suggest future research directions. We also suggest that the three‐layer composite implemented here can offer better performance in comparison to using hillshades or other common terrain visualization surfaces and is, thus, worth considering for different mapping and feature extraction tasks.
  5. Given earth imagery with spectral features on a terrain surface, this paper studies surface segmentation based on both explanatory features and surface topology. The problem is important in many spatial and spatiotemporal applications such as flood extent mapping in hydrology. The problem is uniquely challenging for several reasons: first, the size of earth imagery on a terrain surface is often much larger than the input of popular deep convolutional neural networks; second, there exists topological structure dependency between pixel classes on the surface, and such dependency can follow an unknown and non-linear distribution; third, there are often limited training labels. Existing methods for earth imagery segmentation often divide the imagery into patches and consider the elevation as an additional feature channel. These methods do not fully incorporate the spatial topological structural constraint within and across surface patches and thus often show poor results, especially when training labels are limited. Existing methods on semi-supervised and unsupervised learning for earth imagery often focus on learning representation without explicitly incorporating surface topology. In contrast, we propose a novel framework that explicitly models the topological skeleton of a terrain surface with a contour tree from computational topology, which is guided by the physical constraint (e.g., water flow direction on terrains). Our framework consists of two neural networks: a convolutional neural network (CNN) to learn spatial contextual features on a 2D image grid, and a graph neural network (GNN) to learn the statistical distribution of physics-guided spatial topological dependency on the contour tree. The two models are co-trained via variational EM. Evaluations on the real-world flood mapping datasets show that the proposed models outperform baseline methods in classification accuracy, especially when training labels are limited. 