Title: Using publicly available satellite imagery and deep learning to understand economic well-being in Africa
Abstract

Accurate and comprehensive measurements of economic well-being are fundamental inputs into both research and policy, but such measures are unavailable at a local level in many parts of the world. Here we train deep learning models to predict survey-based estimates of asset wealth across ~20,000 African villages from publicly available multispectral satellite imagery. Models can explain 70% of the variation in ground-measured village wealth in countries where the model was not trained, outperforming previous benchmarks from high-resolution imagery, and comparison with independent wealth measurements from censuses suggests that errors in satellite estimates are comparable to errors in existing ground data. Satellite-based estimates can also explain up to 50% of the variation in district-aggregated changes in wealth over time, with daytime imagery particularly useful in this task. We demonstrate the utility of satellite-based estimates for research and policy, and demonstrate their scalability by creating a wealth map for Africa’s most populous country.
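
As a rough illustration of what "explaining 70% of the variation" in held-out countries could mean in practice, the sketch below computes the squared Pearson correlation between observed and predicted village wealth, grouped by country. This is only one common way to quantify variation explained; the abstract does not say which statistic was used. The function names and the `(country, observed, predicted)` row layout are assumptions made for this example, not part of the published method.

```python
from collections import defaultdict
from statistics import mean


def r_squared(observed, predicted):
    """Squared Pearson correlation between two equal-length sequences."""
    mean_obs, mean_pred = mean(observed), mean(predicted)
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        raise ValueError("r^2 is undefined when either series is constant")
    return cov * cov / (var_obs * var_pred)


def r_squared_by_country(rows):
    """Group rows by country and score each group separately.

    rows: iterable of (country, observed_wealth, predicted_wealth) tuples
    (a hypothetical layout). Predictions for a country are assumed to come
    from a model that never saw that country during training.
    """
    grouped = defaultdict(lambda: ([], []))
    for country, obs, pred in rows:
        grouped[country][0].append(obs)
        grouped[country][1].append(pred)
    return {c: r_squared(obs, pred) for c, (obs, pred) in grouped.items()}
```

Scoring each country separately keeps a single large country from dominating the headline figure; averaging the per-country values is one possible way to summarize them.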

 
NSF-PAR ID:
10155622
Author(s) / Creator(s):
; ; ; ; ; ; ;
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Nature Communications
Volume:
11
Issue:
1
ISSN:
2041-1723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Vegetation indices calculated from remotely sensed satellite imagery are commonly used within empirically derived models to estimate leaf area index in loblolly pine plantations in the southeastern United States. The data used to parameterize the models typically come with observation errors, resulting in biased parameters. The objective of this study was to quantify and reduce the effects of observation errors on a leaf area index (LAI) estimation model using imagery from Landsat 5 TM and 7 ETM+ and over 1500 multitemporal measurements from a Li-Cor 2000 Plant Canopy Analyzer. Study data come from a 16 quarter 1 ha plot with 1667 trees per hectare (2 m × 3 m spacing) fertilization and irrigation research site with re-measurements taken between 1992 and 2004. Using error-in-variable methods, we evaluated multiple vegetation indices, calculated errors associated with their observations, and corrected for them in the modeling process. We found that the normalized difference moisture index provided the best correlation with below-canopy LAI measurements (76.4%). A nonlinear model that accounts for the nutritional status of the stand was found to provide the best estimates of LAI, with a root mean square error of 0.418. The analysis in this research provides a more extensive evaluation of common vegetation indices used to estimate LAI in loblolly pine plantations and a modeling framework that extends beyond the typical linear model. The proposed model provides a simple-to-use form allowing forest practitioners to evaluate LAI development and its uncertainty in historic pine plantations in a spatial and temporal context.
  2. Abstract

    In this research, we present data‐driven forecasting of ionospheric total electron content (TEC) using the Long Short‐Term Memory (LSTM) deep recurrent neural network method. The random forest machine learning method was used to perform a regression analysis and estimate the variable importance of the input parameters. The input data are obtained from satellite and ground‐based measurements characterizing the solar‐terrestrial environment. We estimate the relative importance of 34 different parameters, including the solar flux, solar wind density and speed, the three components of the interplanetary magnetic field, Lyman‐alpha, and the Kp, Dst, and Polar Cap (PC) indices. The TEC measurements are taken with 15‐s cadence from an equatorial GPS station located at Bogotá, Colombia (4.7110° N, 74.0721° W). The 2008–2017 data set, including the top five parameters estimated using the random forest, is used for training the machine learning models, and the 2018 data set is used for independent testing of the LSTM forecasting. The LSTM method was applied to forecast the TEC up to 5 h ahead, with 30‐min cadence. The results indicate that very good forecasts with low root mean square (RMS) error (high correlation) can be made for the near future, and that the RMS errors increase as we forecast further into the future. The data sources are satellite and ground‐based measurements characterizing the solar‐terrestrial environment.

     
  3. Abstract

    Neural networks (NN) have become an important tool for prediction tasks—both regression and classification—in environmental science. Since many environmental-science problems involve life-or-death decisions and policy making, it is crucial to provide not only predictions but also an estimate of the uncertainty in the predictions. Until recently, very few tools were available to provide uncertainty quantification (UQ) for NN predictions. However, in recent years the computer-science field has developed numerous UQ approaches, and several research groups are exploring how to apply these approaches in environmental science. We provide an accessible introduction to six of these UQ approaches, then focus on tools for the next step, namely, to answer the question: Once we obtain an uncertainty estimate (using any approach), how do we know whether it is good or bad? To answer this question, we highlight four evaluation graphics and eight evaluation scores that are well suited for evaluating and comparing uncertainty estimates (NN based or otherwise) for environmental-science applications. We demonstrate the UQ approaches and UQ-evaluation methods for two real-world problems: 1) estimating vertical profiles of atmospheric dewpoint (a regression task) and 2) predicting convection over Taiwan based on Himawari-8 satellite imagery (a classification task). We also provide Jupyter notebooks with Python code for implementing the UQ approaches and UQ-evaluation methods discussed herein. This article provides the environmental-science community with the knowledge and tools to start incorporating the large number of emerging UQ methods into their research.

    Significance Statement

    Neural networks are used for many environmental-science applications, some involving life-or-death decision-making. In recent years new methods have been developed to provide much-needed uncertainty estimates for NN predictions. We seek to accelerate the adoption of these methods in the environmental-science community with an accessible introduction to 1) methods for computing uncertainty estimates in NN predictions and 2) methods for evaluating such estimates.

     
  4. ArcticDEM provides the public with an unprecedented opportunity to access very high spatial resolution digital elevation models (DEMs) covering the pan-Arctic surfaces. As it is generated from stereo-pairs of optical satellite imagery, ArcticDEM represents a mixture of a digital surface model (DSM) over non-ground areas and a digital terrain model (DTM) over bare ground. Reconstructing DTM from ArcticDEM is thus needed in studies requiring bare ground elevation, such as modeling hydrological processes, tracking surface change dynamics, and estimating vegetation canopy height and associated forest attributes. Here we propose an automated approach for estimating DTM from ArcticDEM in two steps: (1) identifying ground pixels from WorldView-2 imagery using a Gaussian mixture model (GMM) with local refinement by morphological operation, and (2) generating a continuous DTM surface using ArcticDEMs at ground locations and spatial interpolation methods (ordinary kriging (OK) and natural neighbor (NN)). We evaluated our method at three forested study sites characterized by different canopy cover and topographic conditions in Livengood, Alaska, where airborne lidar data are available for validation. Our results demonstrate that (1) the proposed ground identification method can effectively identify ground pixels with much lower root mean square errors (RMSEs) (<0.35 m) relative to the reference data than the comparative state-of-the-art approaches; (2) NN performs more robustly in DTM interpolation than OK; (3) the DTMs generated from NN interpolation with GMM-based ground masks decrease the RMSEs of ArcticDEM to 0.648 m, 1.677 m, and 0.521 m for Site-1, Site-2, and Site-3, respectively. This study provides a viable means of deriving high-resolution DTM from ArcticDEM that will be of great value to studies focusing on the Arctic ecosystems, forest change dynamics, and earth surface processes.
  5. Abstract

    Most tundra carbon flux modeling relies on leaf area index (LAI), generally estimated from measurements of canopy greenness using the normalized difference vegetation index (NDVI), to estimate the direction and magnitude of fluxes. However, due to the relative sparseness and low stature of tundra canopies, such models do not explicitly consider the influence of variation in tundra canopy structure on carbon flux estimates. Structure from motion (SFM), a photogrammetric method for deriving three-dimensional (3D) structure from digital imagery, is a non-destructive method for estimating both fine-scale canopy structure and LAI. To understand how variation in 3D canopy structure affects ecosystem carbon fluxes in Arctic tundra, we adapted an existing NDVI-based tundra carbon flux model to include variation in SFM-derived canopy structure and its interaction with incoming sunlight to cast shadows on canopies. Our study system consisted of replicate plots of dry heath tundra that had been subjected to three herbivore exclosure treatments (an exclosure-free control [CT], a large mammal exclosure, and a large and small mammal exclosure [ExLS]), providing the range of 3D canopy structures employed in our study. We found that foliage within the more structurally complex surface of CT canopies received significantly less light over the course of the day than canopies within both exclosure treatments. This was especially pronounced during morning and evening hours, and was reflected in modeled rates of net ecosystem exchange (NEE) and gross primary productivity (GPP). We found that in the ExLS treatment, SFM-derived estimates of GPP were significantly lower and NEE significantly higher than those based on LAI alone. Our results demonstrate that the structure of even simple tundra vegetation canopies can have significant impacts on tundra carbon fluxes and thus need to be accounted for.

     
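
Two of the related abstracts above rely on normalized difference indices (NDMI in item 1, NDVI in item 5). For readers unfamiliar with them, the following minimal sketch shows the standard definitions, NDVI = (NIR - Red) / (NIR + Red) and NDMI = (NIR - SWIR) / (NIR + SWIR), applied to single-pixel reflectance values. The function names and the choice to return `None` for a zero denominator are assumptions made for this example; none of the listed studies describes its implementation here.

```python
def normalized_difference(band_a, band_b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = band_a + band_b
    if total == 0:
        return None  # index is undefined for this pixel
    return (band_a - band_b) / total


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return normalized_difference(nir, red)


def ndmi(nir, swir):
    """Normalized difference moisture index from NIR and shortwave-infrared
    reflectance (on Landsat 5 TM and 7 ETM+, bands 4 and 5 respectively)."""
    return normalized_difference(nir, swir)
```

Both indices fall in the range -1 to 1 for non-negative reflectances, which is part of why they are convenient inputs to empirical LAI models.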