Title: Large‐scale, image‐based tree species mapping in a tropical forest using artificial perceptual learning
Abstract

Information about the spatial distribution of species lies at the heart of many important questions in ecology. Logistical limitations and collection biases, however, limit the availability of such data at ecologically relevant scales. Remotely sensed information can alleviate some of these concerns, but presents challenges associated with accurate species identification and limited availability of field data for validation, especially in high diversity ecosystems such as tropical forests.

Recent advances in machine learning offer a promising and cost‐efficient approach for gathering a large amount of species distribution data from aerial photographs. Here, we propose a novel machine learning framework, artificial perceptual learning (APL), to tackle the problem of weakly supervised pixel‐level mapping of tree species in forests. Challenges arise from limited availability of ground labels for tree species, lack of precise segmentation of tree canopies and misalignment between visible canopies in the aerial images and stem locations associated with ground labels. The proposed APL framework addresses these challenges by constructing a workflow using state‐of‐the‐art machine learning algorithms.

We develop and illustrate the proposed framework by implementing a fine‐grain mapping of three species, the palm Prestoea acuminata and the tree species Cecropia schreberiana and Manilkara bidentata, over a 5,000‐ha area of El Yunque National Forest in Puerto Rico. These large‐scale maps are based on unlabelled high‐resolution aerial images of unsegmented tree canopies. Misaligned ground‐based labels, available for <1% of these images, serve as the only weak supervision. APL performance is evaluated using ground‐based labels and high‐quality human segmentation using Amazon Mechanical Turk, and compared to a basic workflow that relies solely on labelled images.

Receiver operating characteristic (ROC) curves and Intersection over Union (IoU) metrics demonstrate that APL substantially outperforms the basic workflow and attains human‐level cognitive economy, with 50‐fold time savings. For the palm and C. schreberiana, the APL framework has high pixelwise accuracy and IoU with reference to human segmentations. For M. bidentata, APL predictions are congruent with ground‐based labels. Our approach shows great potential for leveraging existing data from global forest plot networks coupled with aerial imagery to map tree species at ecologically meaningful spatial scales.
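
As an illustration of the two segmentation metrics named above, the following is a minimal sketch of how IoU and pixelwise accuracy can be computed for one species. It assumes the prediction and the human reference segmentation are binary masks of equal size; the function names, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for this example and are not taken from the paper.

```python
def iou(pred, ref):
    """Intersection over Union of two binary masks (equal-sized 2D lists)."""
    inter = 0
    union = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                inter += 1  # pixel marked in both masks
            if p or r:
                union += 1  # pixel marked in at least one mask
    # Assumed convention: two empty masks agree perfectly.
    return inter / union if union else 1.0


def pixel_accuracy(pred, ref):
    """Fraction of pixels on which the two binary masks agree."""
    total = 0
    correct = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            total += 1
            if bool(p) == bool(r):
                correct += 1
    return correct / total


# Hypothetical 3x3 masks: 1 = pixel assigned to the species, 0 = background.
predicted = [[1, 1, 0],
             [0, 1, 0],
             [0, 0, 0]]
reference = [[1, 0, 0],
             [0, 1, 1],
             [0, 0, 0]]

print(iou(predicted, reference))             # 2 shared pixels / 4 in the union
print(pixel_accuracy(predicted, reference))  # 7 of 9 pixels agree
```

IoU ignores pixels that both masks leave as background, so it is the stricter of the two measures when the target species covers only a small part of the image.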

 
Award ID(s):
1831952
NSF-PAR ID:
10452933
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Methods in Ecology and Evolution
Volume:
12
Issue:
4
ISSN:
2041-210X
Page Range / eLocation ID:
p. 608-618
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Anthropogenic activities have altered historical disturbance regimes, and understanding the mechanisms by which these shifting perturbations interact is essential to predicting where they may erode ecosystem resilience. Emerging infectious plant diseases, caused by human translocation of nonnative pathogens, can generate ecologically damaging forms of novel biotic disturbance. Further, abiotic disturbances, such as wildfire, may influence the severity and extent of disease‐related perturbations via their effects on the occurrence of hosts, pathogens and microclimates; however, these interactions have rarely been examined.

    The disease ‘sudden oak death’ (SOD), associated with the introduced pathogen Phytophthora ramorum, causes acute, landscape‐scale tree mortality in California's fire‐prone coastal forests. Here, we examined interactions between wildfire and the biotic disturbance impacts of this emerging infectious disease. Leveraging long‐term datasets that describe wildfire occurrence and P. ramorum dynamics across the Big Sur region, we modelled the influence of recent and historical fires on epidemiological parameters, including pathogen presence, infestation intensity, reinvasion, and host mortality.

    Past wildfire altered disease dynamics and reduced SOD‐related mortality, indicating a negative interaction between these abiotic and biotic disturbances. Frequently burned forests were less likely to be invaded by P. ramorum, had lower incidence of host infection, and exhibited decreased disease‐related biotic disturbance, which was associated with reduced occurrence and density of epidemiologically significant hosts. Following a recent wildfire, survival of mature bay laurel, a key sporulating host, was the primary driver of P. ramorum infestation and reinvasion, but younger, rapidly regenerating host vegetation capable of sporulation did not measurably influence disease dynamics. Notably, the effect of P. ramorum infection on host mortality was reduced in recently burned areas, indicating that the loss of tall, mature host canopies may temporarily dampen pathogen transmission and ‘release’ susceptible species from significant inoculum pressure.

    Synthesis. Cumulatively, our findings indicate that fire history has contributed to heterogeneous patterns of biotic disturbance and disease‐related decline across this landscape, via changes to both the occurrence of available hosts and the demography of epidemiologically important host populations. These results highlight that human‐altered abiotic disturbances may play a foundational role in structuring infectious disease dynamics, contributing to future outbreak emergence and driving biotic disturbance regimes.

     
  2. Summary

    Trade‐offs among carbon sinks constrain how trees physiologically, ecologically, and evolutionarily respond to their environments. These trade‐offs typically fall along a productive growth to conservative, bet‐hedging continuum. How nonstructural carbohydrates (NSCs) stored in living tree cells (known as carbon stores) fit in this trade‐off framework is not well understood.

    We examined relationships between growth and storage using both within species genetic variation from a common garden, and across species phenotypic variation from a global database.

    We demonstrate that storage is actively accumulated, as part of a conservative, bet‐hedging life history strategy. Storage accumulates at the expense of growth both within and across species. Within the species Populus trichocarpa, genetic trade‐offs show that for each additional unit of wood area growth (in cm² yr⁻¹) that genotypes invest in, they lose 1.2 to 1.7 units (mg g⁻¹ NSC) of storage. Across species, for each additional unit of area growth (in cm² yr⁻¹), trees, on average, reduce their storage by 9.5% in stems and 10.4% in roots.

    Our findings impact our understanding of basic plant biology, fit storage into a widely used growth‐survival trade‐off spectrum describing life history strategy, and challenge the assumptions of passive storage made in ecosystem models today.

     
  3. Abstract

    Understanding the mechanisms that promote the coexistence of hundreds of species over small areas in tropical forest remains a challenge. Many tropical tree species are presumed to be functionally equivalent shade tolerant species but exist on a continuum of performance trade‐offs between survival in shade and the ability to quickly grow in sunlight. These trade‐offs can promote coexistence by reducing fitness differences.

    Variation in plant functional traits related to resource acquisition is thought to predict variation in performance among species, perhaps explaining community assembly across habitats with gradients in resource availability. Many studies have found low predictive power, however, when linking trait measurements to species demographic rates.

    Seedlings face different challenges recruiting on the forest floor and may exhibit different traits and/or performance trade‐offs than older individuals face in the eventual adult niche. Seed mass is the typical proxy for seedling success, but species also differ in cotyledon strategy (reserve vs. photosynthetic) or other leaf, stem and root traits. These can cause species with the same average seed mass to have divergent performance in the same habitat.

    We combined long‐term studies of seedling dynamics with functional trait data collected at a standard life‐history stage in three diverse neotropical forests to ask whether variation in coordinated suites of traits predicts variation among species in demographic performance.

    Across hundreds of species in Ecuador, Panama and Puerto Rico, we found seedlings displayed correlated suites of leaf, stem, and root traits, which strongly correlated with seed mass and cotyledon strategy. Variation among species in seedling functional traits, seed mass, and cotyledon strategy strongly predicted trade‐offs in seedling growth and survival. These results underscore the importance of matching the ontogenetic stage of the trait measurement to the stage of demographic dynamics.

    Our findings highlight the importance of cotyledon strategy in addition to seed mass as a key component of seed and seedling biology in tropical forests because of the contribution of carbon reserves in storage cotyledons to reducing mortality rates and explaining the growth‐survival trade‐off among species.

    Synthesis: With strikingly consistent patterns across three tropical forests, we find strong evidence for the promise of functional traits to provide mechanistic links between seedling form and demographic performance.

     
  4. Abstract

    Background

    Lung cancer is the deadliest and second most common cancer in the United States due to the lack of symptoms for early diagnosis. Pulmonary nodules are small abnormal regions that can be potentially correlated to the occurrence of lung cancer. Early detection of these nodules is critical because it can significantly improve patient survival rates. Thoracic thin‐sliced computed tomography (CT) scanning has emerged as a widely used method for the diagnosis and prognosis of lung abnormalities.

    Purpose

    The standard clinical workflow of detecting pulmonary nodules relies on radiologists to analyze CT images to assess the risk factors of cancerous nodules. However, this approach can be error‐prone due to the various nodule formation causes, such as pollutants and infections. Deep learning (DL) algorithms have recently demonstrated remarkable success in medical image classification and segmentation. As DL becomes an ever more important assistant to radiologists in nodule detection, it is imperative to ensure that the DL algorithm and the radiologist can better understand each other's decisions. This study aims to develop a framework integrating explainable AI methods to achieve accurate pulmonary nodule detection.

    Methods

    A robust and explainable detection (RXD) framework is proposed, focusing on reducing false positives in pulmonary nodule detection. Its implementation is based on an explanation supervision method, which uses nodule contours drawn by radiologists as supervision signals to force the model to learn nodule morphologies, enabling improved learning on small datasets. In addition, two imputation methods are applied to the nodule region annotations to reduce the noise within human annotations and allow the model to have robust attributions that meet human expectations. The 480, 265, and 265 CT image sets from the public Lung Image Database Consortium and Image Database Resource Initiative (LIDC‐IDRI) dataset are used for training, validation, and testing.

    Results

    Using only 10, 30, 50, and 100 training samples sequentially, our method consistently improves on the classification performance and explanation quality of the baseline in terms of Area Under the Curve (AUC) and Intersection over Union (IoU). In particular, our framework with a learnable imputation kernel improves IoU over the baseline by 24.0% to 80.0%. A pre‐defined Gaussian imputation kernel achieves an even greater improvement, of 38.4% to 118.8% over the baseline. Compared to the baseline trained on 100 samples, our method shows less drop in AUC when trained on fewer samples. A comprehensive comparison of interpretability shows that our method aligns better with expert opinions.

    Conclusions

    A pulmonary nodule detection framework was demonstrated using public thoracic CT image datasets. The framework integrates the robust explanation supervision (RES) technique to ensure the performance of nodule classification and morphology. The method can reduce the workload of radiologists and enable them to focus on the diagnosis and prognosis of potentially cancerous pulmonary nodules at an early stage to improve the outcomes for lung cancer patients.

     
  5. The spatial distribution of forest stands is one of the fundamental properties of forests. Stand distribution data obtained in a timely and accurate manner can help people better understand, manage, and utilize forests. The development of remote sensing technology has made it possible to map the distribution of tree species in a timely and accurate manner. At present, a large amount of remote sensing data has been accumulated, including high-spatial-resolution images, time-series images, light detection and ranging (LiDAR) data, etc. However, these data have not been fully utilized. To accurately identify the tree species of forest stands, various and complementary data need to be synthesized for classification. A curve-matching-based method called the fusion of spectral image and point data (FSP) algorithm was developed to fuse high-spatial-resolution images, time-series images, and LiDAR data for forest stand classification. In this method, the multispectral Sentinel-2 image and high-spatial-resolution aerial images were first fused. Then, the fused images were segmented to derive forest stands, which are the basic unit for classification. To extract features from forest stands, the gray histogram of each band was extracted from the aerial images. The average reflectance in each stand was calculated and stacked for the time-series images. The profile curve of forest structure was generated from the LiDAR data. Finally, the features of forest stands were compared with training samples using curve matching methods to derive the tree species. The developed method was tested in a forest farm to classify 11 tree species. The average accuracy of the FSP method for ten performances was between 0.900 and 0.913, and the maximum accuracy was 0.945. The experiments demonstrate that the FSP method is more accurate and stable than traditional machine learning classification methods.