Abstract Understanding the ranges of rare and endangered species is central to conserving biodiversity in the Anthropocene. Species distribution models (SDMs) have become a common and powerful tool for analyzing species–environment relationships across geographic space. Although evaluating the distribution of rare species is integral to their conservation, this can be difficult when limited distribution data are available. Community science platforms, such as iNaturalist, have emerged as alternative sources for species occurrence data. Although these observations are often thought to be of lower quality than those of natural history collections, they may have potential for improving SDMs for species with few occurrence records from collections. Here, we investigate the utility of iNaturalist data for developing SDMs for a rare high‐elevation plant, Telesonix jamesii. Because methods for modeling rare species are limited in the literature, five different modeling techniques were considered, including profile methods, statistical models, and machine learning algorithms. The inclusion of iNaturalist data doubled the number of usable records for T. jamesii. We found that a random forest (RF) model using ensemble training data performed best of any model (area under the curve = 0.98). We then compared the performance of RF models that use only natural history training data and those that use a combination of natural history (herbarium specimens) and iNaturalist training data. All models relied heavily on climate data (mean temperature of the driest quarter and precipitation of the warmest quarter), indicating that this species is under threat as the climate continues to change. Validation datasets affected model fits as well. Models using only herbarium data performed slightly worse when evaluated with cross‐validation than when validated externally with iNaturalist data. This study can serve as a model for future SDM studies of species with similar data limitations.
                    
                            
                            Quantifying error in occurrence data: Comparing the data quality of iNaturalist and digitized herbarium specimen data in flowering plant families of the southeastern United States
                        
                    
    
            iNaturalist has the potential to be an extremely rich source of organismal occurrence data. Launched in 2008, it contained over 150 million uploaded observations as of May 2023. Because a limited number of past studies have assessed the taxonomic accuracy of participatory science-driven sources of occurrence data such as iNaturalist, there has been concern that some portion of these records might be misidentified in certain taxonomic groups. In this case study, we compare Research Grade iNaturalist observations with digitized herbarium specimens, both of which are currently available for combined download from large data aggregators and are therefore the primary sources of occurrence data for large-scale biodiversity/biogeography studies. Our comparisons were confined regionally to the southeastern United States (Florida, Georgia, North Carolina, South Carolina, Texas, Tennessee, Kentucky, and Virginia). Occurrence records from ten plant families (Gentianaceae, Ericaceae, Melanthiaceae, Ulmaceae, Fabaceae, Asteraceae, Fagaceae, Cyperaceae, Juglandaceae, Apocynaceae) were downloaded and scored on taxonomic accuracy. We found a comparable and relatively low rate of misidentification among both digitized herbarium specimens and Research Grade iNaturalist observations within the study area. This finding illustrates the utility and high quality of iNaturalist data for future research in the region, but also points to key differences between the data types, each of which has its own advantages depending on how the data are applied.
        
    
- Award ID(s): 2027654
- PAR ID: 10518722
- Editor(s): Qin, Hong
- Publisher / Repository: NSF Public Access Repository (NSF-PAR)
- Date Published:
- Journal Name: PLOS ONE
- Volume: 18
- Issue: 12
- ISSN: 1932-6203
- Page Range / eLocation ID: e0295298
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            Abstract Premise: Digitized biodiversity data offer extensive information; however, obtaining and processing biodiversity data can be daunting. Complexities arise during data cleaning, such as identifying and removing problematic records. To address these issues, we created the R package Geographic And Taxonomic Occurrence R‐based Scrubbing (gatoRs). Methods and Results: The gatoRs workflow includes functions that streamline downloading records from the Global Biodiversity Information Facility (GBIF) and Integrated Digitized Biocollections (iDigBio). We also created functions to clean downloaded specimen records. Unlike previous R packages, gatoRs accounts for differences in download structure between GBIF and iDigBio and allows for user control via interactive cleaning steps. Conclusions: Our pipeline enables the scientific community to process biodiversity data efficiently and is accessible to the R coding novice. We anticipate that gatoRs will be useful for both established and beginning users. Furthermore, we expect our package will facilitate the introduction of biodiversity‐related concepts into the classroom via the use of herbarium specimens.
- 
            Native bee species in the United States provide invaluable pollination services. Concerns about native bee declines are growing, and there are calls for a national monitoring program. Documenting species ranges at ecologically meaningful scales through coverage completeness analysis is a fundamental step to track bees from species to communities. It may take decades before all existing bee specimens are digitized, so projections are needed now to focus future research and management efforts. From 1.923 million records, we created range maps for nearly 88% (3158 species) of bee species in the contiguous United States, provided the first analysis of inventory completeness for digitized specimens of a major insect clade, and, perhaps most important, estimated spatial completeness accounting for all known bee specimens in USA collections, including undigitized bee specimens. Completeness estimates were very low (3–37%) across four examined spatial resolutions when using the currently available bee specimen records. Adding a subset of observations from community science data sources did not significantly increase completeness, and adding a projected 4.7 million undigitized specimens increased completeness by only an additional 12–13%. Assessments of data, including projected specimen records, indicate persistent taxonomic and geographic deficiencies. In conjunction with expedited digitization, new inventories that integrate community science data with specimen‐based documentation will be required to close these gaps. A combined effort involving both strategic inventories and accelerated digitization campaigns is needed for a more complete understanding of USA bee distributions.
- 
            Abstract Premise: Pteridophytes (vascular land plants that disperse by spores) are a powerful system for studying plant evolution, particularly with respect to the impact of abiotic factors on evolutionary trajectories through deep time. However, our ability to use pteridophytes to investigate such questions, or to capitalize on the ecological and conservation‐related applications of the group, has been impaired by the relative isolation of the neo‐ and paleobotanical research communities and by the absence of large‐scale biodiversity data sources. Methods: Here we present the Pteridophyte Collections Consortium (PCC), an interdisciplinary community uniting neo‐ and paleobotanists, and the associated PteridoPortal, a publicly accessible online portal that serves over three million pteridophyte records, including herbarium specimens, paleontological museum specimens, and iNaturalist observations. We demonstrate the utility of the PteridoPortal through discussion of three example PteridoPortal‐enabled research projects. Results: The data within the PteridoPortal are global in scope and are queryable in a flexible manner. The PteridoPortal contains a taxonomic thesaurus (a digital version of a Linnaean classification) that includes both extant and extinct pteridophytes in a common phylogenetic framework. The PteridoPortal allows applications such as greatly accelerated classic floristics, entirely new “next‐generation” floristic approaches, and the study of environmentally mediated evolution of functional morphology across deep time. Discussion: The PCC and PteridoPortal provide a comprehensive resource enabling novel research into plant evolution, ecology, and conservation across deep time, facilitating rapid floristic analyses and other biodiversity‐related investigations, and providing new opportunities for education and community engagement.
- 
            Plant phenology has been shifting dramatically in response to climate change, a shift that may have significant and widespread ecological consequences. Of particular concern are tropical biomes, which represent the most biodiverse and imperiled regions of the world. However, compared to temperate floras, we know little about phenological responses of tropical plants because long-term observational datasets from the tropics are sparse. Herbarium specimens have greatly increased our phenological knowledge in temperate regions, but similar data have been underutilized in the tropics and their suitability for this purpose has not been broadly validated. Here, we compare phenological estimates derived from field observational data (i.e., plot surveys) and herbarium specimens at various spatial and taxonomic scales to determine whether specimens can provide accurate estimations of reproductive timing and its spatial variation. We demonstrate that phenological estimates from field observations and herbarium specimens coincide well. Fewer than 5% of the species exhibited significant differences between flowering periods inferred from field observations versus specimens regardless of spatial aggregation. In contrast to studies based on field records, herbarium specimens sampled much larger geographic and climatic ranges, as has been documented previously for temperate plants, and effectively captured phenological responses across varied environments. Herbarium specimens are verified to be a vital resource for closing the gap in our phenological knowledge of tropical systems. Tropical plant reproductive phenology inferred from herbarium records is widely congruent with field observations, suggesting that these records can (and should) be used to investigate phenological variation and its associated environmental cues more broadly across tropical biomes.
 An official website of the United States government