Title: A benchmark dataset for canopy crown detection and delineation in co-registered airborne RGB, LiDAR and hyperspectral imagery from the National Ecological Observatory Network
Broad scale remote sensing promises to build forest inventories at unprecedented scales. A crucial step in this process is to associate sensor data into individual crowns. While dozens of crown detection algorithms have been proposed, their performance is typically not compared based on standard data or evaluation metrics. There is a need for a benchmark dataset to minimize differences in reported results as well as support evaluation of algorithms across a broad range of forest types. Combining RGB, LiDAR and hyperspectral sensor data from the USA National Ecological Observatory Network’s Airborne Observation Platform with multiple types of evaluation data, we created a benchmark dataset to assess crown detection and delineation methods for canopy trees covering dominant forest types in the United States. This benchmark dataset includes an R package to standardize evaluation metrics and simplify comparisons between methods. The benchmark dataset contains over 6,000 image-annotated crowns, 400 field-annotated crowns, and 3,000 canopy stem points from a wide range of forest types. In addition, we include over 10,000 training crowns for optional use. We discuss the different evaluation data sources and assess the accuracy of the image-annotated crowns by comparing annotations among multiple annotators as well as overlapping field-annotated crowns. We provide an example submission and score for an open-source algorithm that can serve as a baseline for future methods.
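The benchmark's R package defines the official evaluation metrics; as a rough illustration of what such scoring involves, here is a minimal Python sketch of greedy intersection-over-union (IoU) matching between predicted and annotated crown boxes. The function names and the 0.4 threshold are illustrative, not the package's API:

```python
# Minimal sketch of a crown-detection evaluation metric: greedy IoU matching
# between predicted and annotated bounding boxes (left, bottom, right, top).
# Illustrative only; the benchmark's R package defines the official metrics.

def iou(a, b):
    """Intersection-over-union of two boxes given as (left, bottom, right, top)."""
    left = max(a[0], b[0])
    bottom = max(a[1], b[1])
    right = min(a[2], b[2])
    top = min(a[3], b[3])
    if right <= left or top <= bottom:
        return 0.0
    inter = (right - left) * (top - bottom)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def match_crowns(predictions, annotations, threshold=0.4):
    """Greedily match each annotation to its best unmatched prediction."""
    matched = 0
    used = set()
    for ann in annotations:
        best, best_iou = None, threshold
        for i, pred in enumerate(predictions):
            if i in used:
                continue
            score = iou(ann, pred)
            if score > best_iou:
                best, best_iou = i, score
        if best is not None:
            used.add(best)
            matched += 1
    recall = matched / len(annotations) if annotations else 0.0
    precision = matched / len(predictions) if predictions else 0.0
    return precision, recall
```

A submission is then scored by precision and recall over matched crowns; the real package additionally handles projections, plot boundaries, and per-site summaries.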
Award ID(s):
1926542
NSF-PAR ID:
10292069
Editor(s):
Grilli, Jacopo
Date Published:
Journal Name:
PLOS Computational Biology
Volume:
17
Issue:
7
ISSN:
1553-7358
Page Range / eLocation ID:
e1009180
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The ability to automatically delineate individual tree crowns using remote sensing data opens the possibility to collect detailed tree information over large geographic regions. While individual tree crown delineation (ITCD) methods have proven successful in conifer-dominated forests using Light Detection and Ranging (LiDAR) data, it remains unclear how well these methods can be applied in deciduous broadleaf-dominated forests. We applied five automated LiDAR-based ITCD methods across fifteen plots ranging from conifer- to broadleaf-dominated forest stands at Harvard Forest in Petersham, MA, USA, and assessed accuracy against manual delineation of crowns from unmanned aerial vehicle (UAV) imagery. We then identified tree- and plot-level factors influencing the success of automated delineation techniques. There was relatively little difference in accuracy between automated crown delineation methods (51–59% aggregated plot accuracy) and, despite parameter tuning, none of the methods produced high accuracy across all plots (27–90% range in plot-level accuracy). The accuracy of all methods was significantly higher with increased plot conifer fraction, and individual conifer trees were identified with higher accuracy (mean 64%) than broadleaf trees (42%) across methods. Further, while tree-level factors (e.g., diameter at breast height, height and crown area) strongly influenced the success of crown delineations, the influence of plot-level factors varied. The most important plot-level factor was species evenness, a metric of relative species abundance that is related to both conifer fraction and the degree to which trees can fill canopy space. As species evenness decreased (e.g., high conifer fraction and less efficient filling of canopy space), the probability of successful delineation increased.
Overall, our work suggests that the tested LiDAR-based ITCD methods perform equally well in a mixed temperate forest, but that delineation success is driven by forest characteristics like functional group, tree size, diversity, and crown architecture. While LiDAR-based ITCD methods are well suited for stands with distinct canopy structure, we suggest that future work explore the integration of phenology and spectral characteristics with existing LiDAR as an approach to improve crown delineation in broadleaf-dominated stands. 
  2. Abstract

    Aim

    Rapid global change is impacting the diversity of tree species and essential ecosystem functions and services of forests. It is therefore critical to understand and predict how the diversity of tree species is spatially distributed within and among forest biomes. Satellite remote sensing platforms have been used for decades to map forest structure and function but are limited in their capacity to monitor change by their relatively coarse spatial resolution and the complexity of scales at which different dimensions of biodiversity are observed in the field. Recently, airborne remote sensing platforms making use of passive high spectral resolution (i.e., hyperspectral) and active lidar data have been operationalized, providing an opportunity to disentangle how biodiversity patterns vary across space and time from field observations to larger scales. Most studies to date have focused on single sites and/or one sensor type; here we ask how multiple sensor types from the National Ecological Observatory Network’s Airborne Observation Platform (NEON AOP) perform across multiple sites in a single biome at the NEON field plot scale (i.e., 40 m × 40 m).

    Location

    Eastern USA.

    Time period

    2017–2018.

    Taxa studied

    Trees.

    Methods

    With a fusion of hyperspectral and lidar data from the NEON AOP, we assess the ability of high resolution remotely sensed metrics to measure biodiversity variation across eastern US temperate forests. We examine how taxonomic, functional, and phylogenetic measures of alpha diversity vary spatially and assess to what degree remotely sensed metrics correlate with in situ biodiversity metrics.

    Results

    Models using estimates of forest function, canopy structure, and topographic diversity performed better than models containing each category alone. Our results show that canopy structural diversity, and not just spectral reflectance, is critical to predicting biodiversity.

    Main conclusions

    We found that an approach that jointly leverages spectral properties related to leaf and canopy functional traits and forest health, lidar derived estimates of forest structure, fine‐resolution topographic diversity, and careful consideration of biogeographical differences within and among biomes is needed to accurately map biodiversity variation from above.

  3. Abstract

    The NeonTreeCrowns dataset is a set of individual-level crown estimates for 100 million trees at 37 geographic sites across the United States surveyed by the National Ecological Observatory Network’s Airborne Observation Platform. Each rectangular bounding box crown prediction includes height, crown area, and spatial location.

    How can I see the data?

    A web viewer for browsing the predictions is available at idtrees.org

    Dataset Organization

    The shapefiles.zip contains 11,000 shapefiles, each corresponding to a 1 km^2 RGB tile from NEON (ID: DP3.30010.001). For example, "2019_SOAP_4_302000_4100000_image.shp" contains the predictions for "2019_SOAP_4_302000_4100000_image.tif", available from the NEON data portal: https://data.neonscience.org/data-products/explore?search=camera. NEON's file convention encodes the year of data collection (2019), the four-letter site code (SOAP), the sampling event (4), and the UTM coordinates of the top left corner (302000_4100000). For NEON site abbreviations and UTM zones see https://www.neonscience.org/field-sites/field-sites-map. 
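As an illustration of this naming convention, a small Python helper (hypothetical, not part of any NEON tooling) that splits a tile name into its parts:

```python
# Sketch: parse NEON's tile naming convention
# "<year>_<site>_<event>_<easting>_<northing>_image.<ext>" into its parts.
# This helper is illustrative only, not an official NEON utility.

def parse_neon_tile(name):
    stem = name.rsplit(".", 1)[0]     # drop the .shp / .tif extension
    year, site, event, easting, northing, _ = stem.split("_")
    return {
        "year": int(year),            # year of data collection
        "site": site,                 # four-letter NEON site code
        "event": int(event),          # sampling event
        "easting": int(easting),      # UTM x of the top left corner
        "northing": int(northing),    # UTM y of the top left corner
    }

info = parse_neon_tile("2019_SOAP_4_302000_4100000_image.shp")
```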

    The predictions are also available as a single csv per site; all available tiles for that site and year are combined into one file. These data are not projected, but contain the UTM coordinates for each bounding box (left, bottom, right, top). For both file types the following fields are available:

    Height: The crown height measured in meters. Crown height is defined as the 99th percentile of all canopy height pixels from a LiDAR height model (ID: DP3.30015.001).

    Area: The crown area in m^2 of the rectangular bounding box.

    Label: All data in this release are "Tree".

    Score: The confidence score from the DeepForest deep learning algorithm, ranging from 0 (low confidence) to 1 (high confidence).
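Because the box coordinates are in UTM meters, the Area field is simply the width times the height of the box. A minimal sketch with illustrative values:

```python
# Sketch: derive the Area field from the bounding-box columns.
# left, bottom, right, top are UTM coordinates in meters,
# so (right - left) * (top - bottom) gives the area in m^2.
# The coordinates below are made up for illustration.

def bbox_area(left, bottom, right, top):
    return (right - left) * (top - bottom)

area = bbox_area(302010.0, 4100020.0, 302015.0, 4100026.0)  # a 5 m x 6 m crown box
```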

    How were predictions made?

    The DeepForest algorithm is available as a python package: https://deepforest.readthedocs.io/. Predictions were overlaid on the LiDAR-derived canopy height model. Predictions with heights less than 3m were removed.
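A minimal sketch of this post-processing step, using NumPy on a tiny synthetic canopy height model rather than the actual DeepForest pipeline (function names are illustrative):

```python
# Sketch of the post-processing described above: crown height is taken as the
# 99th percentile of canopy-height-model pixels inside each box, and boxes
# under 3 m are discarded. Synthetic data; DeepForest itself is not required.
import numpy as np

def crown_height(chm_pixels):
    """99th percentile of the canopy height pixels under a box."""
    return float(np.percentile(chm_pixels, 99))

def filter_short_crowns(boxes_with_pixels, min_height=3.0):
    """Keep only boxes whose crown height meets the minimum."""
    kept = []
    for box, pixels in boxes_with_pixels:
        if crown_height(pixels) >= min_height:
            kept.append(box)
    return kept

# synthetic example: one tall crown and one patch of ground/shrub returns
tall = np.full(100, 12.0)
short = np.full(100, 1.5)
kept = filter_short_crowns([((0, 0, 5, 5), tall), ((5, 0, 10, 5), short)])
```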

    How were predictions validated?

    Please see

    Weinstein, B. G., Marconi, S., Bohlman, S. A., Zare, A., & White, E. P. (2020). Cross-site learning in deep learning RGB tree crown detection. Ecological Informatics, 56, 101061.

    Weinstein, B. G., Marconi, S., Aubry-Kientz, M., Vincent, G., Senyondo, H., & White, E. P. (2020). DeepForest: A Python package for RGB deep learning tree crown delineation. bioRxiv.

    Weinstein, B. G., Marconi, S., Bohlman, S. A., Zare, A., & White, E. P. (2019). Individual tree-crown detection in RGB imagery using semi-supervised deep learning neural networks. Remote Sensing, 11(11), 1309.

    Were any sites removed?

    Several sites were removed due to poor NEON data quality. GRSM and PUUM both had lower quality RGB data that made them unsuitable for prediction. NEON surveys are updated annually and we expect future flights to correct these errors. We removed the GUIL site in Puerto Rico due to its very steep topography and poor sun angle during data collection; the DeepForest algorithm performed poorly when predicting crowns in intensely shaded areas with very little sun penetration. These data are available upon request.

    Contact

    We welcome questions, ideas and general inquiries. The data can be used for many applications and we look forward to hearing from you. Contact ben.weinstein@weecology.org. 

    Gordon and Betty Moore Foundation: GBMF4563 
  4. This work investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object. Five types of input elicitation methods are tested: binary classification (positive or negative); the (x, y)-coordinate of the position participants believe a target object is located; level of confidence in binary response (on a scale from 0 to 100%); what participants believe the majority of the other participants' binary classification is; and participant's perceived difficulty level of the task (on a discrete scale). We design two crowdsourcing studies to test the performance of a variety of input elicitation methods and utilize data from over 300 participants. Various existing voting and machine learning (ML) methods are applied to make the best use of these inputs. In an effort to assess their performance on classification tasks of varying difficulty, a systematic synthetic image generation process is developed. Each generated image combines items from the MPEG-7 Core Experiment CE-Shape-1 Test Set into a single image using multiple parameters (e.g., density, transparency, etc.) and may or may not contain a target object. The difficulty of these images is validated by the performance of an automated image classification method. Experiment results suggest that more accurate results can be achieved with smaller training datasets when both the crowdsourced binary classification labels and the average of the self-reported confidence values in these labels are used as features for the ML classifiers.
Moreover, when a relatively larger properly annotated dataset is available, in some cases augmenting these ML algorithms with the results (i.e., probability of outcome) from an automated classifier can achieve even higher performance than what can be obtained by using any one of the individual classifiers. Lastly, supplementary analysis of the collected data demonstrates that other performance metrics of interest, namely reduced false-negative rates, can be prioritized through special modifications of the proposed aggregation methods. 
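As a rough sketch of the aggregation ideas in item 4 (not the study's actual code), here are a plain majority vote, a confidence-weighted vote, and the two features, vote fraction and mean self-reported confidence, that the abstract describes feeding to ML classifiers:

```python
# Sketch of crowd-label aggregation: majority vote, a confidence-weighted
# vote, and feature construction for downstream classifiers.
# Function names and the tie-breaking rules are illustrative assumptions.

def majority_vote(labels):
    """labels: list of 0/1 crowd responses for one image (ties go to 1)."""
    return int(sum(labels) * 2 >= len(labels))

def confidence_weighted_vote(labels, confidences):
    """Weight each binary response by the worker's confidence in [0, 1]."""
    score = sum(c if y == 1 else -c for y, c in zip(labels, confidences))
    return int(score >= 0)

def make_features(labels, confidences):
    """Per-image features: fraction of positive votes and mean confidence."""
    vote_fraction = sum(labels) / len(labels)
    mean_conf = sum(confidences) / len(confidences)
    return (vote_fraction, mean_conf)
```

The feature pair returned by `make_features` corresponds to the combination the abstract reports working well with smaller training datasets.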
  5. The National Institute of Standards and Technology data science evaluation plant identification challenge is a new periodic competition focused on improving and generalizing remote sensing processing methods for forest landscapes. I created a pipeline to perform three remote sensing tasks. First, a marker-controlled watershed segmentation thresholded by vegetation index and height was performed to identify individual tree crowns within the canopy height model. Second, remote sensing data for segmented crowns was aligned with ground measurements by choosing the set of pairings which minimized error in position and in crown area as predicted by stem height. Third, species classification was performed by reducing the dataset’s dimensionality through principal component analysis and then constructing a set of maximum likelihood classifiers to estimate species likelihoods for each tree. Of the three algorithms, the classification routine exhibited the strongest relative performance, with the segmentation algorithm performing the least well.
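A minimal sketch of the classification step in item 5, assuming a PCA reduction followed by one Gaussian maximum likelihood classifier per species (pure NumPy on synthetic data; not the competition pipeline):

```python
# Sketch: PCA via SVD, then one Gaussian per class, classified by maximum
# log-likelihood in the reduced space. Data and labels are synthetic.
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA: return the data mean and the top principal components."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

def fit_gaussians(Z, y):
    """Estimate one (mean, covariance) per class in the reduced space."""
    params = {}
    for label in np.unique(y):
        Zc = Z[y == label]
        cov = np.cov(Zc, rowvar=False) + 1e-6 * np.eye(Zc.shape[1])
        params[label] = (Zc.mean(axis=0), cov)
    return params

def classify(z, params):
    """Pick the class with the highest Gaussian log-likelihood."""
    best, best_ll = None, -np.inf
    for label, (mu, cov) in params.items():
        diff = z - mu
        ll = -0.5 * (diff @ np.linalg.solve(cov, diff) + np.log(np.linalg.det(cov)))
        if ll > best_ll:
            best, best_ll = label, ll
    return best

# synthetic example: two well-separated "species" in a 4-band feature space
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (20, 4)), rng.normal(5, 0.1, (20, 4))])
y = np.array([0] * 20 + [1] * 20)
mean, comps = pca_fit(X, 2)
params = fit_gaussians(pca_transform(X, mean, comps), y)
z_new = pca_transform(rng.normal(5, 0.1, (1, 4)), mean, comps)[0]
predicted = classify(z_new, params)
```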