Title: Deep Spatial Prediction via Heterogeneous Multi-Source Self-Supervision
Spatial prediction estimates the values of a target variable, such as PM2.5 concentration and temperature, at arbitrary locations based on collected geospatial data. It underpins key research topics in geoscience by supplying heterogeneous spatial information (e.g., soil conditions, precipitation rates, wheat yields) for geographic modeling and decision-making at local, regional, and global scales. In-situ data, collected by ground-level sensors, and remote sensing data, collected by satellites or aircraft, are two important data sources for this task. In-situ data are relatively accurate but sparse and unevenly distributed. Remote sensing data cover large spatial areas but are coarse, with low spatiotemporal resolution, and prone to interference. How to synergize the complementary strengths of these two data types remains a grand challenge. Second, it is difficult to model the unknown spatial predictive mapping while handling the trade-off between spatial autocorrelation and heterogeneity. Third, representing spatial relations without substantial information loss is also a critical issue. To address these challenges, we propose a novel Heterogeneous Self-supervised Spatial Prediction (HSSP) framework that synergizes multi-source data by minimizing the inconsistency between in-situ and remote sensing observations. We propose a new deep geometric spatial interpolation model as the prediction backbone that automatically interpolates the values of the target variable at unknown locations from existing observations, taking into account both distance and orientation information. The proposed interpolator is proven both to be a general form of popular interpolation methods and to preserve spatial information. The spatial prediction is further enhanced by a novel error-compensation framework that captures the prediction inconsistency due to spatial heterogeneity. Extensive experiments on real-world datasets demonstrate our model's superiority over state-of-the-art models.
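A minimal sketch of the kind of distance- and orientation-aware interpolation the abstract describes, assuming an inverse-distance weighting term combined with a hypothetical angular-redundancy penalty; this is an illustration only, not the paper's HSSP interpolator, and the function name, parameters, and weighting scheme are assumptions.

```python
import numpy as np

def interpolate(query_xy, obs_xy, obs_val, p=2.0, orientation_weight=0.5, eps=1e-9):
    """Hypothetical distance- and orientation-aware interpolation (IDW-style sketch).

    query_xy : (2,) query location
    obs_xy   : (N, 2) observation locations
    obs_val  : (N,) observed values of the target variable (e.g., PM2.5)
    """
    offsets = obs_xy - query_xy                   # vectors from query to observations
    dist = np.linalg.norm(offsets, axis=1) + eps  # Euclidean distances
    w_dist = 1.0 / dist**p                        # classic inverse-distance term

    # Orientation term (assumed): down-weight observations that are angularly
    # redundant, i.e. clustered in the same direction from the query point.
    angles = np.arctan2(offsets[:, 1], offsets[:, 0])
    ang_sep = np.abs(angles[:, None] - angles[None, :])
    ang_sep = np.minimum(ang_sep, 2 * np.pi - ang_sep)       # wrap to [0, pi]
    redundancy = (np.cos(ang_sep).sum(axis=1) - 1.0) / max(len(obs_val) - 1, 1)
    w_orient = 1.0 - orientation_weight * np.clip(redundancy, 0.0, 1.0)

    w = w_dist * w_orient
    return float(np.sum(w * obs_val) / np.sum(w))

# Toy usage: four in-situ stations around an unobserved site at the origin.
stations = np.array([[0.0, 1.0], [1.0, 0.0], [0.0, -1.0], [-2.0, 0.0]])
pm25 = np.array([12.0, 15.0, 11.0, 20.0])
print(interpolate(np.array([0.0, 0.0]), stations, pm25))
```

Setting orientation_weight to 0 recovers plain inverse-distance weighting, which is one way an orientation-aware interpolator can be seen as generalizing distance-only methods.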
Award ID(s):
2113350 2318831 2103592 1907805 1942594
NSF-PAR ID:
10434550
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
ACM Transactions on Spatial Algorithms and Systems
ISSN:
2374-0353
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The movement of animals is strongly influenced by external factors in their surrounding environment such as weather, habitat types, and human land use. With advances in positioning and sensor technologies, it is now possible to capture animal locations at high spatial and temporal granularities. Likewise, modern space-based remote sensing technology provides us with increasing access to large volumes of environmental data, some of which change on an hourly basis. Environmental data are heterogeneous in source and format, and are usually obtained at different scales and granularities than movement data. Indeed, there remain scientific and technical challenges in developing linkages between the growing collections of animal movement data and the large repositories of heterogeneous remote sensing observations, as well as in the development of new statistical and computational methods for the analysis of movement in its environmental context. These challenges include retrieval, indexing, efficient storage, data integration, and analytic techniques. We have developed a new system, the Environmental-Data Automated Track Annotation (Env-DATA), that automates annotation of movement trajectories with remote-sensing environmental information, including high-resolution topography, weather from global and regional reanalysis datasets, climatology, human geography, ocean currents and productivity, land use, vegetation and land surface variables, precipitation, fire, and other global datasets. The system automates the acquisition of data from open web resources of remote sensing and weather data and provides several interpolation methods from the native grid resolution and structure to a global regular grid linked with the movement tracks in space and time. Env-DATA provides an easy-to-use platform for end users that eliminates the technical difficulties of the annotation process, including data acquisition, data transformation and integration, resampling, interpolation, and interpretation. The new Env-DATA system enhances Movebank (www.movebank.org), an open portal of animal tracking data. The aim is to facilitate new understanding and predictive capabilities of spatiotemporal patterns of animal movement in response to dynamic and changing environments from local to global scales. The system is already in use by scientists worldwide and by several conservation managers, such as the consortium of federal and private institutions that manages the endangered California Condor population.
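    As a rough illustration of the kind of grid-to-track annotation such a system performs (not Env-DATA's actual implementation), the sketch below uses SciPy's RegularGridInterpolator to interpolate a hypothetical gridded temperature layer onto animal track fixes in space and time; the grid, variable names, and units are assumptions.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Hypothetical environmental layer: temperature on a regular time/lat/lon grid.
times = np.arange(0, 24, 6, dtype=float)     # hours since some epoch
lats = np.linspace(-90.0, 90.0, 181)
lons = np.linspace(-180.0, 180.0, 361)
temp = np.random.default_rng(0).normal(15.0, 5.0, size=(len(times), len(lats), len(lons)))

interp = RegularGridInterpolator((times, lats, lons), temp,
                                 bounds_error=False, fill_value=np.nan)

# Animal track fixes as (time, lat, lon) triples to be annotated.
track = np.array([[3.0, 52.4, 13.1],
                  [9.5, 52.6, 13.4],
                  [20.0, 53.0, 14.0]])
annotated = interp(track)   # trilinear interpolation from the native grid to the track
print(annotated)
```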
  2. With the popularity of smartphones, large-scale road sensing data is being collected to perform traffic prediction, which is an important task in modern society. Due to the nature of the roving sensors on smartphones, the collected traffic data, which is in the form of multivariate time series, is often temporally sparse and unevenly distributed across regions. Moreover, different regions can have different traffic patterns, which makes it challenging to adapt models learned from regions with sufficient training data to target regions. Given that many regions may have very sparse data, it is also impossible to build individual models for each region separately. In this paper, we propose a meta-learning based framework named MetaTP to overcome these challenges. MetaTP has two key parts, i.e., a basic traffic prediction network (base model) and meta-knowledge transfer. In the base model, a two-layer interpolation network is employed to map the original time series onto uniformly-spaced reference time points, so that temporal prediction can be performed effectively in the reference space. The meta-learning framework is employed to transfer knowledge from source regions with a large amount of data to target regions with only a few data examples via fast adaptation, in order to improve model generalizability on target regions. Moreover, we use two memory networks to capture the global patterns of spatial and temporal information across regions. We evaluate the proposed framework on two real-world datasets, and experimental results show its effectiveness.
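    The base model's first step, mapping irregular observations onto uniformly-spaced reference time points, can be illustrated with a simple kernel-interpolation sketch (a softmax Gaussian kernel over time differences); this is an assumption-level stand-in, not MetaTP's actual two-layer interpolation network, and the function name, kernel, and bandwidth are illustrative.

```python
import numpy as np

def interpolate_to_reference(t_obs, x_obs, t_ref, bandwidth=1.0):
    """Map an irregularly-sampled series (t_obs, x_obs) onto reference times t_ref
    using softmax-normalized Gaussian kernel weights (illustrative sketch)."""
    d2 = (t_ref[:, None] - t_obs[None, :]) ** 2    # squared time differences, shape (M, N)
    logits = -d2 / (2.0 * bandwidth ** 2)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)              # each reference point's weights sum to 1
    return w @ x_obs

# Toy usage: sparse, uneven speed readings mapped to an hourly reference grid.
t_obs = np.array([0.2, 0.9, 3.1, 3.4, 7.8])
x_obs = np.array([42.0, 40.5, 55.0, 54.0, 30.0])
t_ref = np.arange(0.0, 8.0, 1.0)
print(interpolate_to_reference(t_obs, x_obs, t_ref))
```

    Once every region's series lives on the same reference grid, a shared predictor can be meta-trained across regions and fast-adapted to a sparse target region.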
  3. Abstract Particle filters avoid parametric estimates for Bayesian posterior densities, which alleviates Gaussian assumptions in nonlinear regimes. These methods, however, are more sensitive to sampling errors than Gaussian-based techniques such as ensemble Kalman filters. A recent study by the authors introduced an iterative strategy for particle filters that match posterior moments, in which iterations improve the filter's ability to draw samples from non-Gaussian posterior densities. The iterations follow from a factorization of particle weights, providing a natural framework for combining particle filters with alternative filters to mitigate the impact of sampling errors. The current study introduces a novel approach to forming an adaptive hybrid data assimilation methodology, exploiting the theoretical strengths of nonparametric and parametric filters. At each data assimilation cycle, the iterative particle filter performs a sequence of updates while the prior sample distribution is non-Gaussian, and an ensemble Kalman filter provides the final adjustment when Gaussian distributions for marginal quantities are detected. The method employs the Shapiro–Wilk test, which has outstanding power for detecting departures from normality, to determine when to make the transition between filter algorithms. Experiments using low-dimensional models demonstrate that the approach has significant value, especially for nonhomogeneous observation networks and unknown model process errors. Moreover, hybrid factors are extended to consider marginals of more than one collocated variable using a test for multivariate normality. Findings from this study motivate the use of the proposed method for geophysical problems characterized by diverse observation networks and various dynamic instabilities, such as numerical weather prediction models. Significance Statement: Data assimilation statistically processes observation errors and model forecast errors to provide optimal initial conditions for the forecast, playing a critical role in numerical weather forecasting. The ensemble Kalman filter, which has been widely adopted and developed in many operational centers, assumes Gaussianity of the prior distribution and solves a linear system of equations, leading to bias in strongly nonlinear regimes. On the other hand, particle filters avoid many of those assumptions but are sensitive to sampling errors and are computationally expensive. We propose an adaptive hybrid strategy that combines their advantages and minimizes the disadvantages of the two methods. The hybrid particle filter–ensemble Kalman filter is achieved with the Shapiro–Wilk test, which detects the Gaussianity of the ensemble members and determines the timing of the transition between these filter updates. Demonstrations in this study show that the proposed method is advantageous when observations are heterogeneous and when the model has an unknown bias. Furthermore, by extending the statistical hypothesis test to a test for multivariate normality, we consider marginals of more than one collocated variable. These results encourage further testing on real geophysical problems characterized by various dynamic instabilities, such as real numerical weather prediction models.
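    A minimal sketch of the gating idea, assuming a scalar state and a single observation: apply the Shapiro–Wilk test to the prior ensemble and choose between a stochastic ensemble Kalman update and a bootstrap particle-filter reweighting. This is not the paper's iterative hybrid scheme; the function, the simplified updates, and the 0.05 threshold are assumptions for illustration.

```python
import numpy as np
from scipy.stats import shapiro

def hybrid_update(ensemble, y_obs, obs_var, alpha=0.05, rng=None):
    """Toy hybrid update for a directly observed scalar state (illustrative sketch)."""
    rng = np.random.default_rng(0) if rng is None else rng
    _, p_value = shapiro(ensemble)          # test the prior marginal for Gaussianity
    if p_value > alpha:
        # Gaussian enough: stochastic ensemble Kalman filter update.
        prior_var = ensemble.var(ddof=1)
        gain = prior_var / (prior_var + obs_var)
        perturbed_obs = y_obs + rng.normal(0.0, np.sqrt(obs_var), ensemble.size)
        return ensemble + gain * (perturbed_obs - ensemble), "EnKF"
    # Non-Gaussian: importance weighting and resampling (bootstrap particle filter).
    w = np.exp(-0.5 * (y_obs - ensemble) ** 2 / obs_var)
    w /= w.sum()
    idx = rng.choice(ensemble.size, size=ensemble.size, p=w)
    return ensemble[idx], "particle filter"

# Toy usage: a skewed (non-Gaussian) prior ensemble should trigger the particle branch.
prior = np.random.default_rng(1).gamma(2.0, 1.0, size=200)
posterior, branch = hybrid_update(prior, y_obs=3.0, obs_var=0.5)
print(branch, posterior.mean())
```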
  4. Abstract

    The 2015 spring flood of the Sagavanirktok River inundated large swaths of tundra as well as infrastructure near Prudhoe Bay, Alaska. Its lasting impact on permafrost, vegetation, and hydrology is unknown but compels attention in light of changing Arctic flood regimes. We combined InSAR and optical satellite observations to quantify subdecadal permafrost terrain changes and identify their controls. While the flood locally induced quasi‐instantaneous ice‐wedge melt, much larger areas were characterized by subtle, spatially variable post‐flood changes. Surface deformation from 2015 to 2019 estimated from ALOS‐2 and Sentinel‐1 InSAR varied substantially within and across terrain units, with greater subsidence on average in flooded locations. Subsidence exceeding 5 cm was locally observed in inundated ice‐rich units and also in inactive floodplains. Overall, subsidence increased with deposit age and thus ground ice content, but many flooded ice‐rich units remained stable, indicating variable drivers of deformation. On average, subsiding ice‐rich locations showed increases in observed greenness and wetness. Conversely, many ice‐poor floodplains greened without deforming. Ice wedge degradation in flooded locations with elevated subsidence was mostly of limited intensity, and the observed subsidence largely stopped within 2 years. Based on remote sensing and limited field observations, we propose that the disparate subdecadal changes were influenced by spatially variable drivers (e.g., sediment deposition, organic layer), controls (ground ice and its degree of protection), and feedback processes. Remote sensing helps quantify the heterogeneous interactions between permafrost, vegetation, and hydrology across permafrost‐affected fluvial landscapes. Interdisciplinary monitoring is needed to improve predictions of landscape dynamics and to constrain sediment, nutrient, and carbon budgets.

     
  5. Abstract

    Observed ecological responses to climate change are highly individualistic across species and locations, and understanding the drivers of this variability is essential for management and conservation efforts. While it is clear that differences in exposure, sensitivity, and adaptive capacity all contribute to heterogeneity in climate change vulnerability, predicting these features at macroecological scales remains a critical challenge. We explore multiple drivers of heterogeneous vulnerability across the distributions of 96 vegetation types of the ecologically diverse western US, using data on observed climate trends from 1948 to 2014 to highlight emerging patterns of change. We ask three novel questions about factors potentially shaping vulnerability across the region: (a) How does sensitivity to different climate variables vary geographically and across vegetation classes? (b) How do multivariate climate exposure patterns interact with these sensitivities to shape vulnerability patterns? (c) How different are these vulnerability patterns according to three widely implemented vulnerability paradigms—niche novelty (decline in modeled suitability), temporal novelty (standardized anomaly), and spatial novelty (inbound climate velocity)—each of which uses a distinct frame of reference to quantify climate departure? We propose that considering these three novelty paradigms in combination could help improve our understanding and prediction of heterogeneous climate change responses, and we discuss the distinct climate adaptation strategies connected with different combinations of high and low novelty across the three metrics. Our results reveal a diverse mosaic of climate change vulnerability signatures across the region's plant communities. Each of the above factors contributes strongly to this heterogeneity: climate variable sensitivity exhibits clear patterns across vegetation types, multivariate climate change data reveal highly diverse exposure signatures across locations, and the three novelty paradigms diverge widely in their climate change vulnerability predictions. Together, these results shed light on potential drivers of individualistic climate change responses and may help to inform effective management strategies.
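    As a small, self-contained illustration of one of the three novelty paradigms, the sketch below computes temporal novelty as a standardized anomaly: the recent mean of a climate variable minus its historical baseline mean, divided by the baseline interannual standard deviation. The period split and toy data are assumptions, and the niche-novelty and climate-velocity metrics are not reproduced here because they require fitted suitability models and spatial climate fields.

```python
import numpy as np

def standardized_anomaly(baseline, recent):
    """Temporal novelty of `recent` relative to `baseline`, in units of the
    baseline's interannual standard deviation (illustrative sketch)."""
    return (recent.mean() - baseline.mean()) / baseline.std(ddof=1)

# Toy usage: annual mean temperature over a vegetation type's distribution.
rng = np.random.default_rng(0)
baseline = rng.normal(10.0, 0.6, size=50)   # e.g., a 1948-1997 baseline period
recent = rng.normal(11.0, 0.6, size=17)     # e.g., 1998-2014
print(standardized_anomaly(baseline, recent))   # anomaly in standard deviations
```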

     