Title: phenotools: An R package for visualizing and analysing phenomic datasets
Abstract

Phenotypic data are crucial for understanding genotype–phenotype relationships, assessing the tree of life and revealing trends in trait diversity over time. Large‐scale description of whole organisms for quantitative analyses (phenomics) presents several challenges, and technological advances in the collection of genomic data outpace those for phenomic data. Reasons for this disparity include the time‐consuming and expensive nature of collecting discrete phenotypic data and mining previously published data on a given species (both often requiring anatomical expertise across taxa), and computational challenges involved with analysing high‐dimensional datasets.

One approach to building approximations of organismal phenomes is to combine published datasets of discrete characters assembled for phylogenetic analyses into a phenomic dataset. Despite a wealth of legacy datasets in the literature for many groups, relatively few methods exist for automating the assembly, analysis, and visualization of phenomic datasets in phylogenetic contexts. Here, we introduce a new R package, phenotools, for integrating (fusing original or legacy datasets), curating (finding and removing duplicates) and visualizing phenomic datasets.

We demonstrate the utility of the proposed toolkit with a morphological dataset for flightless birds and two morphological datasets for theropod dinosaurs and provide recommendations for character construction to maximize accessibility in future workflows. Visualization tools allow rapid identification of anatomical subregions with difficult or problematic histories of homology.

We anticipate that these tools will help automate the assembly and visualization of phenomic datasets to inform evolutionary relationships and rates of phenotypic evolution.

 
PAR ID:
10447903
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Methods in Ecology and Evolution
Volume:
10
Issue:
9
ISSN:
2041-210X
Page Range / eLocation ID:
p. 1393-1400
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Reconstructing ancestral states for discrete characters is essential for understanding trait evolution in organisms. However, most existing methods are limited to individual characters and often overlook the hierarchical and interactive nature of traits. Recent advances in phylogenetics now offer the possibility of integrating knowledge from anatomy ontologies to reconstruct multiple discrete character histories. Nonetheless, practical applications that fully harness the potential of these new approaches are still lacking.

    This paper introduces ontophylo, an R package that extends the PARAMO pipeline to address these limitations. Ontophylo enables the reconstruction of phenotypic entities composed of amalgamated characters, such as anatomical regions or entire phenomes. It offers three new applications: (1) reconstruction of evolutionary rates of amalgamated characters using phylogenetic non‐homogeneous Poisson process (pNHPP) that allows modelling rate variation across tree branches and time; (2) reconstruction of morphospace dynamics; and (3) visualization of evolutionary rates on vector images of organisms. Ontophylo incorporates ontological knowledge to facilitate these applications.

    Benchmarking confirms the accuracy of pNHPP in estimating character rates under different evolutionary scenarios, and example applications demonstrate the utility of ontophylo in studying morphological evolution in Hymenoptera using simulated data.

    Ontophylo can be easily integrated with other ontology‐oriented and general‐purpose R packages and offers new opportunities to examine morphological evolution on a phenomic scale using new and legacy data.

     
  2. Abstract

    Comprehensive, time‐scaled phylogenies provide a critical resource for many questions in ecology, evolution and biodiversity. Methodological advances have increased the breadth of taxonomic coverage in phylogenetic data; however, accessing and reusing these data remain challenging.

    We introduce the Fish Tree of Life website and associated R package fishtree to provide convenient access to sequences, phylogenies, fossil calibrations and diversification rate estimates for the most diverse group of vertebrate organisms, the ray‐finned fishes. The Fish Tree of Life website presents subsets and visual summaries of phylogenetic and comparative data, and is complemented by the R package, which provides flexible programmatic access to the same underlying data source for advanced users wishing to extend or reanalyse the data.

    We demonstrate functionality with an overview of the website, and show three examples of advanced usage through the R package. First, we test for the presence of long branch attraction artefacts across the fish tree of life. The second example examines the effects of habitat on diversification rate in the pufferfishes. The final example demonstrates how a community phylogenetic analysis could be conducted with the package.

    This resource makes a large comparative vertebrate dataset easily accessible via the website, while the R package enables the rapid reuse and reproducibility of research results via its ability to easily integrate with other R packages and software for molecular biology and comparative methods.

     
  3. Abstract

    Many important demographic processes are seasonal, including survival. For many species, mortality risk is significantly higher at certain times of the year than at others, whether because resources are scarce, susceptibility to predators or disease is high, or both. Despite the importance of survival modelling in wildlife sciences, no tools are available to estimate the peak, duration and relative importance of these ‘seasons of mortality’.

    We present cyclomort, an R package that estimates the timing, duration and intensity of any number of mortality seasons with reliable confidence intervals. The package includes a model selection approach to determine the number of mortality seasons and to test whether seasons of mortality vary across discrete grouping factors.

    We illustrate the periodic hazard function model and workflow of cyclomort with simulated data. We then estimate mortality seasons of two caribou (Rangifer tarandus) populations that have strikingly different mortality patterns, including different numbers and timing of mortality peaks, and a marked change in one population over time.

    The cyclomort package was developed to estimate mortality seasons for wildlife, but the package can model any time‐to‐event processes with a periodic component.

     
  4. Abstract

    Organismal anatomy is a hierarchical system of anatomical entities often imposing dependencies among multiple morphological characters. Ontologies provide a formal and computable framework for incorporating prior biological knowledge about anatomical dependencies in models of trait evolution. They also offer new opportunities for working with semantic representations of morphological data.

    In this work, we present a new R package, rphenoscate, that enables incorporating ontological knowledge in evolutionary analyses and exploring semantic patterns of morphological data. In conjunction with rphenoscape, it allows for assembling synthetic phylogenetic character matrices from semantic phenotypes of morphological data. We showcase the package functionality with data sets from bees and fishes.

    We demonstrate that ontologies can be employed to automatically set up evolutionary models accounting for trait dependencies in stochastic character mapping. We also demonstrate how ontology annotations can be explored to interrogate patterns of morphological evolution. Finally, we demonstrate that synthetic character matrices assembled from semantic phenotypes retain most of the phylogenetic information from their original data sets.

    Ontologies will become important tools for integrating anatomical knowledge into phylogenetic methods and making morphological data FAIR compliant—a critical step of the ongoing ‘phenomics’ revolution. Our new package offers key advancements towards this goal.

     
  5. Summary

    The evolution of L‐DOPA 4,5‐dioxygenase activity, encoded by the gene DODA, was a key step in the origin of betalain biosynthesis in Caryophyllales. We previously proposed that L‐DOPA 4,5‐dioxygenase activity evolved via a single Caryophyllales‐specific neofunctionalisation event within the DODA gene lineage. However, this neofunctionalisation event has not been confirmed and the DODA gene lineage exhibits numerous gene duplication events, whose evolutionary significance is unclear.

    To address this, we functionally characterised 23 distinct DODA proteins for L‐DOPA 4,5‐dioxygenase activity, from four betalain‐pigmented and five anthocyanin‐pigmented species, representing key evolutionary transitions across Caryophyllales. By mapping these functional data to an updated DODA phylogeny, we then explored the evolution of L‐DOPA 4,5‐dioxygenase activity.

    We find that low L‐DOPA 4,5‐dioxygenase activity is distributed across the DODA gene lineage. In this context, repeated gene duplication events within the DODA gene lineage give rise to polyphyletic occurrences of elevated L‐DOPA 4,5‐dioxygenase activity, accompanied by convergent shifts in key functional residues and distinct genomic patterns of micro‐synteny.

    In the context of an updated organismal phylogeny and newly inferred pigment reconstructions, we argue that repeated convergent acquisition of elevated L‐DOPA 4,5‐dioxygenase activity is consistent with recurrent specialisation to betalain synthesis in Caryophyllales.

     