Abstract Motivation: Biodiversity in many areas is rapidly declining because of global change. As such, there is an urgent need for new tools and strategies to help identify, monitor and conserve biodiversity hotspots. This is especially true for frugivores, species consuming fruit, because of their important role in seed dispersal and maintenance of forest structure and health. One way to identify these areas is by quantifying functional diversity, which measures the unique roles of species within a community and is valuable for conservation because of its relationship with ecosystem functioning. Unfortunately, the functional trait information required for these studies can be sparse for certain taxa and specific traits and difficult to harmonize across disparate data sources, especially in biodiversity hotspots. To help fill this need, we compiled Frugivoria, a trait database containing ecological, life‐history, morphological and geographical traits for mammals and birds exhibiting frugivory. Frugivoria encompasses species in contiguous moist montane forests and adjacent moist lowland forests of Central and South America, the latter specifically focusing on the Andean states. Compared with existing trait databases, Frugivoria harmonizes existing trait databases, adds new traits, extends traits originally only available for mammals to birds also and fills gaps in trait categories from other databases. Furthermore, we create a cross‐taxa subset of shared traits to aid in analysis of mammals and birds. In total, Frugivoria adds 8662 new trait values for mammals and 14,999 for birds and includes a total of 45,216 trait entries with only 11.37% being imputed. Frugivoria also contains an open workflow that harmonizes trait and taxonomic data from disparate sources and enables users to analyse traits in space.
As such, this open‐access database, which aligns with FAIR data principles, fills a major knowledge gap, enabling more comprehensive trait‐based studies of species in this ecologically important region. Main Types of Variable Contained: Ecological, life‐history, morphological and geographical traits. Spatial Location and Grain: Neotropical countries (Mexico, Guatemala, Costa Rica, Panama, El Salvador, Belize, Nicaragua, Ecuador, Colombia, Peru, Bolivia, Argentina, Venezuela and Chile) with contiguous montane regions. Time Period and Grain: IUCN spatial data: obtained February 2023, spanning range maps collated from 1998 to 2022. IUCN species data: obtained June 2019–September 2022. Newly included traits: span 1924 to 2023. Major Taxa and Level of Measurement: Classes Mammalia and Aves; 40,074 species‐level traits; 5142 imputed traits for 1733 species (mammals: 582; birds: 1147) and 16 sub‐species (mammals). Software Format: .csv; R.
FloraTraiter: Automated parsing of traits from descriptive biodiversity literature
Abstract Premise: Plant trait data are essential for quantifying biodiversity and function across Earth, but these data are challenging to acquire for large studies. Diverse strategies are needed, including the liberation of heritage data locked within specialist literature such as floras and taxonomic monographs. Here we report FloraTraiter, a novel approach using rule‐based natural language processing (NLP) to parse computable trait data from biodiversity literature. Methods: FloraTraiter was implemented through collaborative work between programmers and botanical experts and customized for both online floras and scanned literature. We report a strategy spanning optical character recognition, recognition of taxa, iterative building of traits, and establishing linkages among all of these, as well as curational tools and code for turning these results into standard morphological matrices. Results: Over 95% of treatment content was successfully parsed for traits with <1% error. Data for more than 700 taxa are reported, including a demonstration of common downstream uses. Conclusions: We identify strategies, applications, tips, and challenges that we hope will facilitate future similar efforts to produce large open‐source trait data sets for broad community reuse. Largely automated tools like FloraTraiter will be an important addition to the toolkit for assembling trait data at scale.
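As a rough illustration of what rule‐based trait parsing can look like, the sketch below uses a single regular expression to pull size ranges for plant parts out of a descriptive sentence and normalize them to millimetres. It is not FloraTraiter's code, rules, or API: the vocabulary `PARTS`, the pattern `SIZE_RULE`, and the function `parse_sizes` are hypothetical names invented for this example, and a real system would need many more rules plus the taxon recognition and linking steps the abstract describes.

```python
import re

# Hypothetical vocabulary of plant parts (an assumption for this sketch,
# not FloraTraiter's actual term list).
PARTS = ("leaf", "leaves", "petal", "petals", "sepal", "sepals")

# Conversion factors used to normalize every measurement to millimetres.
UNIT_TO_MM = {"mm": 1.0, "cm": 10.0, "m": 1000.0}

# One hand-written rule: "<part> ... <low>-<high> <unit> long|wide".
# The filler between the part and the numbers may not cross a semicolon,
# so a measurement is only linked to a part named in the same clause.
SIZE_RULE = re.compile(
    r"\b(?P<part>" + "|".join(PARTS) + r")\b[^.;]*?"
    r"(?P<low>\d+(?:\.\d+)?)\s*[-–]\s*(?P<high>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>mm|cm|m)\s+(?P<dim>long|wide)",
    re.IGNORECASE,
)


def parse_sizes(text):
    """Return one record per size range found in a description."""
    records = []
    for match in SIZE_RULE.finditer(text):
        factor = UNIT_TO_MM[match.group("unit").lower()]
        dim = match.group("dim").lower()
        records.append(
            {
                "part": match.group("part").lower(),
                "trait": "length" if dim == "long" else "width",
                "low_mm": float(match.group("low")) * factor,
                "high_mm": float(match.group("high")) * factor,
            }
        )
    return records


if __name__ == "__main__":
    description = "Leaves ovate, 2.5-7 cm long; petals white, 4-6 mm long."
    for record in parse_sizes(description):
        print(record)
```

Records of this shape (part, trait, normalized range) are the kind of computable output that could later be pivoted into a taxon‐by‐trait matrix; the choice to stop the rule at clause boundaries is a simple stand‐in for the linking of traits to parts.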
- Award ID(s): 1916632
- PAR ID: 10574164
- Publisher / Repository: Wiley
- Date Published:
- Journal Name: Applications in Plant Sciences
- Volume: 12
- Issue: 1
- ISSN: 2168-0450
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Motivation: Traits are increasingly being used to quantify global biodiversity patterns, with trait databases growing in size and number, across diverse taxa. Despite growing interest in a trait‐based approach to the biodiversity of the deep sea, where the impacts of human activities (including seabed mining) accelerate, there is no single repository for species traits for deep‐sea chemosynthesis‐based ecosystems, including hydrothermal vents. Using an international, collaborative approach, we have compiled the first global‐scale trait database for deep‐sea hydrothermal‐vent fauna, sFDvent (sDiv‐funded trait database for the Functional Diversity of vents). We formed a funded working group to select traits appropriate to: (a) capture the performance of vent species and their influence on ecosystem processes, and (b) compare trait‐based diversity in different ecosystems. Forty contributors, representing expertise across most known hydrothermal‐vent systems and taxa, scored species traits using online collaborative tools and shared workspaces. Here, we characterise the sFDvent database, describe our approach, and evaluate its scope. Finally, we compare the sFDvent database to similar databases from shallow‐marine and terrestrial ecosystems to highlight how the sFDvent database can inform cross‐ecosystem comparisons. We also make the sFDvent database publicly available online by assigning a persistent, unique DOI. Main types of variable contained: Six hundred and forty‐six vent species names, associated location information (33 regions), and scores for 13 traits (in categories: community structure, generalist/specialist, geographic distribution, habitat use, life history, mobility, species associations, symbiont, and trophic structure). Contributor IDs, certainty scores, and references are also provided.
Spatial location and grain: Global coverage (grain size: ocean basin), spanning eight ocean basins, including vents on 12 mid‐ocean ridges and 6 back‐arc spreading centres. Time period and grain: sFDvent includes information on deep‐sea vent species, and associated taxonomic updates, since they were first discovered in 1977. Time is not recorded. The database will be updated every 5 years. Major taxa and level of measurement: Deep‐sea hydrothermal‐vent fauna with species‐level identification present or in progress. Software format: .csv and MS Excel (.xlsx).
-
Abstract Premise: Quantitative plant traits play a crucial role in biological research. However, traditional methods for measuring plant morphology are time consuming and have limited scalability. We present LeafMachine2, a suite of modular machine learning and computer vision tools that can automatically extract a base set of leaf traits from digital plant data sets. Methods: LeafMachine2 was trained on 494,766 manually prepared annotations from 5648 herbarium images obtained from 288 institutions and representing 2663 species; it employs a set of plant component detection and segmentation algorithms to isolate individual leaves, petioles, fruits, flowers, wood samples, buds, and roots. Our landmarking network automatically identifies and measures nine pseudo‐landmarks that occur on most broadleaf taxa. Text labels and barcodes are automatically identified by an archival component detector and are prepared for optical character recognition methods or natural language processing algorithms. Results: LeafMachine2 can extract trait data from at least 245 angiosperm families and calculate pixel‐to‐metric conversion factors for 26 commonly used ruler types. Discussion: LeafMachine2 is a highly efficient tool for generating large quantities of plant trait data, even from occluded or overlapping leaves, field images, and non‐archival data sets. Our project, along with similar initiatives, has made significant progress in removing the bottleneck in plant trait data acquisition from herbarium specimens and shifted the focus toward the crucial task of data revision and quality control.
-
Abstract Aim: Understanding how ecological communities are assembled remains a grand challenge in ecology with direct implications for charting the future of biodiversity. Trait‐based methods have emerged as the leading approach for quantifying functional community structure (convergence, divergence) but their potential for inferring assembly processes rests on accurately measuring functional dissimilarity among community members. Here, we argue that trait resolution (from finest‐resolution continuous measurements to coarsest‐resolution binary categories) remains a critically overlooked methodological variable, even though categorical classification is known to mask functional variability and inflate functional redundancy among species or individuals. Innovation: We present the first detailed predictions of trait resolution biases and demonstrate, with simulations, how the distortion of signal strength by increasingly coarse‐resolution traits can fundamentally alter functional structure patterns and the interpretation of causative ecological processes (e.g. abiotic filters, biotic interactions). We show that coarser trait data impart different impacts on the signals of divergence and convergence, implying that the role of biotic interactions may be underestimated when using coarser traits. Furthermore, in some systems, coarser traits may overestimate the strength of trait convergence, leading to erroneous support for abiotic processes as the primary drivers of community assembly or change. Main conclusions: Inferences of assembly processes must account for trait resolution to ensure robust conclusions, especially for broad‐scale studies of comparative community assembly and biodiversity change. Despite recent improvements in the collection and availability of trait data, great disparities continue to exist among taxa in the number and availability of continuous traits, which are more difficult to acquire for large numbers of species than coarse categorical assignments.
Based on our simulations, we urge the consideration of trait resolution in the design and interpretation of community assembly studies and suggest a suite of practical solutions to address the pitfalls of trait resolution biases.
-
Abstract Background and Aims: Pooideae grasses contain some of the world’s most important crop and forage species. Although much work has been conducted on understanding the genetic basis of trait diversification within a few annual Pooideae, comparative studies at the subfamily level are limited by a lack of perennial models outside ‘core’ Pooideae. We argue for development of the perennial non-core genus Melica as an additional model for Pooideae, and provide foundational data regarding the group’s biogeography and history of character evolution. Methods: Supplementing available ITS and ndhF sequence data, we built a preliminary Bayesian-based Melica phylogeny, and used it to understand how the genus has diversified in relation to geography, climate and trait variation surveyed from various floras. We also determine biomass accumulation under controlled conditions for Melica species collected across different latitudes and compare inflorescence development across two taxa for which whole genome data are forthcoming. Key Results: Our phylogenetic analyses reveal three strongly supported geographically structured Melica clades that are distinct from previously hypothesized subtribes. Despite less geographical affinity between clades, the two sister ‘Ciliata’ and ‘Imperfecta’ clades segregate from the more phylogenetically distant ‘Nutans’ clade in thermal climate variables and precipitation seasonality, with the ‘Imperfecta’ clade showing the highest levels of trait variation. Growth rates across Melica are positively correlated with latitude of origin. Variation in inflorescence morphology appears to be explained largely through differences in secondary branch distance, phyllotaxy and number of spikelets per secondary branch.
Conclusions: The data presented here and in previous studies suggest that Melica possesses many of the necessary features to be developed as an additional model for Pooideae grasses, including a relatively fast generation time, perenniality, and interesting variation in physiology and morphology. The next step will be to generate a genome-based phylogeny and transformation tools for functional analyses.

