Search for: All records

Award ID contains: 2333243


  1. piqtree is an easy-to-use, open-source Python package that directly exposes IQ-TREE’s phylogenetic inference engine. It offers Python functions for performing many of IQ-TREE’s capabilities, including phylogenetic reconstruction, ultrafast bootstrapping, branch length optimisation, ModelFinder, rapid neighbour-joining, and more. By exposing IQ-TREE’s algorithms within Python, piqtree greatly simplifies the development of new phylogenetic workflows through seamless interoperability with other Python libraries and tools mediated by the cogent3 package. It also enables users to perform interactive analyses with IQ-TREE through, for instance, Jupyter notebooks. We present the key features available in the piqtree library and a small case study that showcases its interoperability. The piqtree library can be installed with pip install piqtree, with the documentation available at https://piqtree.readthedocs.io and source at https://github.com/iqtree/piqtree.
    Free, publicly-accessible full text available July 16, 2026
  2. IQ-TREE (http://www.iqtree.org) is a widely used open-source software tool for efficiently inferring phylogenetic trees under maximum likelihood. Here, we present IQ-TREE version 3, the third major release of the software. IQ-TREE 3 significantly extends version 2 with new features, including mixture models as an alternative to partitioned models, gene and site concordance factors to quantify discordance between genomic regions, and a fully-featured sequence simulator. The IQ-TREE 3 source code is available at https://github.com/iqtree/iqtree3. 
    Free, publicly-accessible full text available April 7, 2026
  3. The current “consensus” order in which amino acids were added to the genetic code is based on potentially biased criteria, such as the absence of sulfur-containing amino acids from the Urey–Miller experiment, which lacked sulfur. More broadly, abiotic abundance might not reflect biotic abundance in the organisms in which the genetic code evolved. Here, we instead identify which protein domains date to the last universal common ancestor (LUCA) and then infer the order of recruitment from deviations of their ancestrally reconstructed amino acid frequencies from the still-ancient post-LUCA controls. We find that smaller amino acids were added to the code earlier, with no additional predictive power in the previous consensus order. Metal-binding (cysteine and histidine) and sulfur-containing (cysteine and methionine) amino acids were added to the genetic code much earlier than previously thought. Methionine and histidine were added to the code earlier than expected from their molecular weights, and glutamine later. Early methionine availability is compatible with inferred early use of S-adenosylmethionine, and early histidine with its purine-like structure and the demand for metal binding. Even more ancient protein sequences, those that had already diversified into multiple distinct copies prior to LUCA, have significantly higher frequencies of aromatic amino acids (tryptophan, tyrosine, phenylalanine, and histidine) and lower frequencies of valine and glutamic acid than single-copy LUCA sequences. If at least some of these sequences predate the current code, then their distinct enrichment patterns provide hints about earlier, alternative genetic codes.
    Free, publicly-accessible full text available December 24, 2025
  4. Nearly neutral theory predicts that species with higher effective population size (N_e) are better at purging slightly deleterious mutations. We compare evolution in high-N_e vs. low-N_e vertebrates to reveal subtle selective preferences among amino acids. We take three complementary approaches. First, we fit non-stationary substitution models using maximum likelihood, comparing the high-N_e clade of rodents and lagomorphs to its low-N_e sister clade of primates and colugos. Second, we compare evolutionary outcomes across a wider range of vertebrates, via correlations between amino acid frequencies and N_e. Third, we dissect which amino acid substitutions occurred in human, chimpanzee, mouse, and rat, as scored by parsimony, which also enables comparison to a historical paper. All methods agree on amino acid preference under more effective selection. Preferred amino acids are less costly to synthesize and use GC-rich codons, which are hard to maintain under AT-biased mutation. These factors explain 85% of the variance in amino acid preferences. Parsimony-induced bias in the historical study produces an apparent reduction in structural disorder, perhaps driven by slightly deleterious substitutions in rapidly evolving regions. Within highly exchangeable pairs of amino acids, arginine is strongly preferred over lysine, aspartate over glutamate, and valine over isoleucine, consistent with more effective selection preferring a marginally larger free energy of folding. Two of these preferences (K→R and I→V), but not a third (E→D), match differences between thermophiles and mesophilic relatives. These results reveal the biophysical consequences of mutation-selection-drift balance, and demonstrate the utility of nearly neutral theory for understanding protein evolution.
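
Item 4 mentions substitutions in human, chimpanzee, mouse, and rat "as scored by parsimony". The abstract does not say how the scoring was done, so the following is only a minimal sketch of one standard approach, Fitch small parsimony at a single alignment site. The tree shape, the function name `fitch`, and the toy site pattern are all assumptions made for illustration, not the authors' pipeline or data.

```python
# Minimal Fitch small-parsimony sketch (illustrative only; not the authors' pipeline).
# The tree shape and the toy site pattern below are assumptions.

def fitch(node, states):
    """Return (candidate_state_set, substitution_count) for a nested-tuple tree.

    `node` is either a leaf name (str) or a 2-tuple of child nodes.
    `states` maps each leaf name to its observed amino acid at one site.
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra substitution needed.
        return shared, left_cost + right_cost
    # Children disagree: keep the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical four-species topology: ((human, chimp), (mouse, rat)).
TREE = (("human", "chimp"), ("mouse", "rat"))

# Toy site: human carries R where the other three species carry K.
site = {"human": "R", "chimp": "K", "mouse": "K", "rat": "K"}
root_states, n_subs = fitch(TREE, site)
print(root_states, n_subs)
```

In this toy pattern the single substitution is placed on the human lineage, which reads as K→R. When the two sister pairs each agree internally but differ from one another, the root state is ambiguous and the direction of change cannot be assigned from these four species alone.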