Abstract Biogenic volatile organic compounds (VOCs) constitute a significant portion of gas-phase metabolites in modern ecosystems and have unique roles in moderating atmospheric oxidative capacity, solar radiation balance, and aerosol formation. It has been theorized that VOCs may account for observed geological and evolutionary phenomena during the Archaean, but the direct contribution of biology to early non-methane VOC cycling remains unexplored. Here, we provide an assessment of all potential VOCs metabolized by the last universal common ancestor (LUCA). We identify enzyme functions linked to LUCA orthologous protein groups across eight literature sources and estimate the volatility of all associated substrates to identify ancient volatile metabolites. We home in on volatile metabolites with confirmed modern emissions that exist in conserved metabolic pathways and produce a curated list of the most likely LUCA VOCs. We introduce volatile organic metabolites associated with early life and discuss their potential influence on early carbon cycling and atmospheric chemistry.
A New Analysis of Archaea–Bacteria Domain Separation: Variable Phylogenetic Distance and the Tempo of Early Evolution
Abstract Comparative genomics and molecular phylogenetics are foundational for understanding biological evolution. Although many studies have aimed to understand the genomic contents of early life, uncertainty remains. A study by Weiss et al. (Weiss MC, Sousa FL, Mrnjavac N, Neukirchen S, Roettger M, Nelson-Sathi S, Martin WF. 2016. The physiology and habitat of the last universal common ancestor. Nat Microbiol. 1(9):16116.) identified a number of protein families in the last universal common ancestor of archaea and bacteria (LUCA) that were not found in previous works. Here, we report new research suggesting that the clustering approaches used in this previous study undersampled protein families, resulting in incomplete phylogenetic trees that do not reflect protein family evolution. Phylogenetic analysis of protein families that include more sequence homologs rejects a simple LUCA hypothesis based on phylogenetic separation of the bacterial and archaeal domains for a majority of the previously identified LUCA proteins (∼82%). To compensate for the limitations of phylogenetic inference derived from incompletely populated orthologous groups, and to test the hypothesis of a period of rapid evolution preceding the separation of the domains, we compared phylogenetic distances both within and between domains for thousands of orthologous groups. We find a substantial diversity of interdomain versus intradomain branch lengths, even among protein families that exhibit a single domain-separating branch and are thought to be associated with the LUCA. Additionally, phylogenetic trees with long interdomain branches relative to intradomain branches are enriched in protein families belonging to information categories, compared with those associated with metabolic functions. These results provide a new view of protein family evolution and temper claims about the phenotype and habitat of the LUCA.
- Award ID(s): 1724300
- PAR ID: 10155144
- Date Published:
- Journal Name: Molecular Biology and Evolution
- ISSN: 0737-4038
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Hejnol, Andreas (Ed.) Molecular evolution studies, such as phylogenomic studies and genome-wide surveys of selection, often rely on gene families of single-copy orthologs (SC-OGs). Large gene families with multiple homologs in 1 or more species, a phenomenon observed among several important families of genes such as transporters and transcription factors, are often ignored because identifying and retrieving SC-OGs nested within them is challenging. To address this issue and increase the number of markers used in molecular evolution studies, we developed OrthoSNAP, software that uses a phylogenetic framework to simultaneously split gene families into SC-OGs and prune species-specific inparalogs. We term SC-OGs identified by OrthoSNAP as SNAP-OGs because they are identified using a splitting and pruning procedure analogous to snapping branches on a tree. From 415,129 orthologous groups of genes inferred across 7 eukaryotic phylogenomic datasets, we identified 9,821 SC-OGs; using OrthoSNAP on the remaining 405,308 orthologous groups of genes, we identified an additional 10,704 SNAP-OGs. Comparison of SNAP-OGs and SC-OGs revealed that their phylogenetic information content was similar, even in complex datasets that contain a whole-genome duplication, complex patterns of duplication and loss, transcriptome data where each gene typically has multiple transcripts, and contentious branches in the tree of life. OrthoSNAP is useful for increasing the number of markers used in molecular evolution data matrices, a critical step for robustly inferring and exploring the tree of life.
-
The current “consensus” order in which amino acids were added to the genetic code is based on potentially biased criteria, such as the absence of sulfur-containing amino acids from the Urey–Miller experiment, which lacked sulfur. More broadly, abiotic abundance might not reflect biotic abundance in the organisms in which the genetic code evolved. Here, we instead identify which protein domains date to the last universal common ancestor (LUCA) and then infer the order of recruitment from deviations of their ancestrally reconstructed amino acid frequencies from the still-ancient post-LUCA controls. We find that smaller amino acids were added to the code earlier, with no additional predictive power in the previous consensus order. Metal-binding (cysteine and histidine) and sulfur-containing (cysteine and methionine) amino acids were added to the genetic code much earlier than previously thought. Methionine and histidine were added to the code earlier than expected from their molecular weights, and glutamine later. Early methionine availability is compatible with inferred early use of S-adenosylmethionine, and early histidine with its purine-like structure and the demand for metal binding. Even more ancient protein sequences, those that had already diversified into multiple distinct copies prior to LUCA, have significantly higher frequencies of aromatic amino acids (tryptophan, tyrosine, phenylalanine, and histidine) and lower frequencies of valine and glutamic acid than single-copy LUCA sequences. If at least some of these sequences predate the current code, then their distinct enrichment patterns provide hints about earlier, alternative genetic codes.
-
Significance Known examples of life all share the same core biochemistry going back to the last universal common ancestor (LUCA), but whether this feature is universal to other examples, including at the origin of life or alien life, is unknown. We show how a physics-inspired statistical approach identifies universal scaling laws across biochemical reactions that are not defined by common chemical components but instead as macroscale patterns in the reaction functions used by life. The identified scaling relations can be used to predict statistical features of LUCA, and network analyses reveal some of the functional principles that underlie them. They are, therefore, prime candidates for developing new theory on the “laws of life” that might apply to all possible biochemistries.
-
Cerretti, Pierfilippo (Ed.) The schizophoran superfamily Ephydroidea (Diptera: Cyclorrhapha) includes eight families, ranging from the well-known vinegar flies (Drosophilidae) and shore flies (Ephydridae) to several small, relatively unusual groups, the phylogenetic placement of which has been particularly challenging for systematists. Extraordinary diversity in life histories, feeding habits, and morphology is a hallmark of fly biology, and the Ephydroidea are no exception. Extreme specialization can lead to “orphaned” taxa with no clear evidence for their phylogenetic position. To resolve relationships among a diverse sample of Ephydroidea, including the highly modified flies in the families Braulidae and Mormotomyiidae, we conducted phylogenomic sampling. Using exon capture from Anchored Hybrid Enrichment and transcriptomics to obtain 320 orthologous nuclear genes sampled for 32 species of Ephydroidea and 11 outgroups, we evaluate a new phylogenetic hypothesis for representatives of the superfamily. These data strongly support monophyly of Ephydroidea with Ephydridae as an early branching radiation and the placement of Mormotomyiidae as a family-level lineage sister to all remaining families. We confirm placement of Cryptochetidae as sister taxon to a large clade containing both Drosophilidae and Braulidae, the latter a family of honeybee ectoparasites. Our results reaffirm that sampling of both taxa and characters is critical in hyperdiverse clades and that these factors have a major influence on phylogenomic reconstruction of the history of the schizophoran fly radiation.

