This content will become publicly available on June 10, 2026

Title: Convergent expansions of keystone gene families drive metabolic innovation in Saccharomycotina yeasts
Many remarkable phenotypes have repeatedly occurred across vast evolutionary distances. When convergent traits emerge on the tree of life, they are sometimes driven by the same underlying gene families, while other times, many different gene families are involved. Conversely, a gene family may be repeatedly recruited for a single trait or many different traits. To understand the general rules governing convergence at both genomic and phenotypic levels, we systematically tested associations between 56 binary metabolic traits and gene count in 14,785 gene families from 993 Saccharomycotina yeasts. Using a recently developed phylogenetic approach that reduces spurious correlations, we found that gene family expansion and contraction were significantly linked to trait gain and loss in 45/56 (80%) traits. While 595/739 (81%) significant gene families were associated with only one trait, we also identified several “keystone” gene families that were significantly associated with up to 13/56 (23%) of all traits. Strikingly, most of these families are known to encode metabolic enzymes and transporters, including all members of the industrially relevant MALtose fermentation loci in the baker’s yeast Saccharomyces cerevisiae. These results indicate that convergent evolution on the gene family level may be more widespread across deeper timescales than previously believed.
Award ID(s):
2110404 2110403
PAR ID:
10608674
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ; ; ;
Publisher / Repository:
PNAS
Date Published:
Journal Name:
Proceedings of the National Academy of Sciences
Volume:
122
Issue:
23
ISSN:
0027-8424
Page Range / eLocation ID:
e2500165122
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We present a novel symbolic reasoning engine for SQL which can efficiently generate an input I for n queries P1, ⋯, Pn, such that their outputs on I satisfy a given property (expressed in SMT). This is useful in different contexts, such as disproving equivalence of two SQL queries and disambiguating a set of queries. Our first idea is to reason about an under-approximation of each Pi, that is, a subset of Pi’s input-output behaviors. While it makes our approach both semantics-aware and lightweight, this idea alone is incomplete (as a fixed under-approximation might miss some behaviors of interest). Therefore, our second idea is to perform search over an expressive family of under-approximations (which collectively cover all program behaviors of interest), thereby making our approach complete. We have implemented these ideas in a tool, Polygon, and evaluated it on over 30,000 benchmarks across two tasks (namely, SQL equivalence refutation and query disambiguation). Our evaluation results show that Polygon significantly outperforms all prior techniques.
  2. Albright, Michaeline B N (Ed.)
    ABSTRACT Microorganisms often inhabit environments that are suboptimal for growth and reproduction. To survive when challenged by such conditions, individuals engage in dormancy, where they enter a metabolically inactive state. For this persistence strategy to confer an evolutionary advantage, microorganisms must be able to resuscitate and reproduce when conditions improve. Among bacteria in the phylum Actinomycetota, dormancy can be terminated by resuscitation-promoting factor (Rpf), an exoenzyme that hydrolyzes glycosidic bonds in the peptidoglycan of cell walls. We characterized Rpf from Micrococcus KBS0714, a bacterium isolated from agricultural soil. The protein exhibited high substrate affinity in vitro, even though resuscitation was maximized in live-cell assays at micromolar concentrations. Site-directed mutations at conserved catalytic sites significantly reduced or eliminated resuscitation, as did the deletion of repeating motifs in a lectin-encoding linker region. We then tested the effects of recombinant Rpf from Micrococcus KBS0714 on a diverse set of dormant soil bacteria. Patterns of resuscitation mapped onto strain phylogeny, which reflected core features of the cell envelope. Additionally, the direction and magnitude of the Rpf effect were associated with functional traits, in particular, aspects of the moisture niche and biofilm production, which are critical for understanding dormancy and the persistence of microbial populations in soils. These findings expand our understanding of how Rpf may affect seed bank dynamics with implications for the diversity and functioning of microorganisms in terrestrial ecosystems.
    IMPORTANCE Dormancy is a process whereby individuals enter a reversible state of reduced metabolic activity. In fluctuating environments, dormancy protects individuals from unfavorable conditions, enhancing fitness and buffering populations against extinction. However, waking up from dormancy is a critical yet risky decision. Some bacteria resuscitate stochastically, while others rely on environmental cues or signals from neighboring cells to transition back to active growth. Resuscitation-promoting factor (Rpf) is an exoenzyme that cleaves bonds in the peptidoglycan of bacterial cell walls, facilitating dormancy termination and enabling regrowth. Although this family of proteins has been well characterized in model organisms and clinically relevant strains, our study characterizes Rpf from a soil bacterium and examines its effects on resuscitation across a diverse collection of bacteria, linking it to functional traits that may influence dormancy dynamics in both natural and managed ecosystems.
  3. The relationship between genotype and phenotype remains an outstanding question for organism-level traits because these traits are generally complex. The challenge arises from complex traits being determined by a combination of multiple genes (or loci), which leads to an explosion of possible genotype–phenotype mappings. The primary techniques to resolve these mappings are genome/transcriptome-wide association studies, which are limited by their lack of causal inference and statistical power. Here, we develop an approach that combines transcriptional data endowed with causal information and a generative machine learning model designed to strengthen statistical power. Our implementation of the approach, dubbed transcriptome-wide conditional variational autoencoder (TWAVE), includes a variational autoencoder trained on human transcriptional data, which is incorporated into an optimization framework. Given a trait phenotype, TWAVE generates expression profiles, which we dimensionally reduce by identifying independently varying generalized pathways (eigengenes). We then conduct constrained optimization to find causal gene sets that are the gene perturbations whose measured transcriptomic responses best explain trait phenotype differences. By considering several complex traits, we show that the approach identifies causal genes that cannot be detected by the primary existing techniques. Moreover, the approach identifies complex diseases caused by distinct sets of genes, meaning that the disease is polygenic and exhibits distinct subtypes driven by different genotype–phenotype mappings. We suggest that the approach will enable the design of tailored experiments to identify multigenic targets to address complex diseases.
  4. Zhang, Ying (Ed.)
    ABSTRACT Treponema pallidum, the causative agent of syphilis, poses a significant global health threat. Its strict reliance on host-derived nutrients and difficulties in in vitro cultivation have impeded detailed metabolic characterization. In this study, we present iTP251, the first genome-scale metabolic model of T. pallidum, reconstructed and extensively curated to capture its unique metabolic features. These refinements included the curation of key reactions such as pyrophosphate-dependent phosphorylation and pathways for nucleotide synthesis, amino acid synthesis, and cofactor metabolism. The model demonstrated high predictive accuracy, validated by a MEMOTE score of 92%. To further enhance its predictive capabilities, we developed ec-iTP251, an enzyme-constrained version of iTP251, incorporating enzyme turnover rate and molecular weight information for all reactions having gene-protein-reaction associations. Ec-iTP251 provides detailed insights into protein allocation across carbon sources, showing strong agreement with proteomics data (Pearson’s correlation of 0.88) in the central carbon pathway. Moreover, the thermodynamic analysis revealed that lactate uptake serves as an additional ATP-generating strategy to utilize unused proteomes, albeit at the cost of reducing the driving force of the central carbon pathway by 27%. Subsequent analysis identified glycerol-3-phosphate dehydrogenase as an alternative electron sink, compensating for the absence of a conventional electron transport chain while maintaining cellular redox balance. These findings highlight T. pallidum’s metabolic adaptations for survival and redox balance in nutrient-limited, extracellular host environments, providing a foundation for future research into its unique bioenergetics.
    IMPORTANCE This study advances our understanding of Treponema pallidum, the syphilis-causing pathogen, through the reconstruction of iTP251, the first genome-scale metabolic model for this organism, and its enzyme-constrained version, ec-iTP251. The work addresses the challenges of studying T. pallidum, an extracellular, host-adapted pathogen, due to its strict dependence on host-derived nutrients and challenges in in vitro cultivation. Validated with strong agreement to proteomics data, the model demonstrates high predictive reliability. Key insights include unique metabolic adaptations such as lactate uptake for ATP production and alternative redox-balancing mechanisms. These findings provide a robust framework for future studies aimed at unraveling the pathogen's survival strategies and identifying potential metabolic vulnerabilities.
  5. Shank, Elizabeth Anne (Ed.)
    ABSTRACT Although bacteria exist in complex microbial communities in the environment, their features and behavior are most often studied in monoculture. While environmental enrichments or complex co-cultures with tens or hundreds of members might more accurately represent the natural communities of bacteria, we sought to create simple pairs of organisms to learn what conditions create successful co-culture and how bacteria change transcriptionally when a partner species is present. We grew two pairs of organisms in co-culture, Pseudomonas aeruginosa and Escherichia coli, and Lacticaseibacillus rhamnosus and Bacteroides thetaiotaomicron. At first, both co-cultures failed, with one organism outcompeting the other. However, through manipulating media and environmental conditions, we created co-cultures with stable member ratios over many generations for each community. We then show that changes in the expression of metabolic genes are present in all studied species, with key catabolic and anabolic pathways often upregulated in the presence of another organism. These changes in gene expression fail to occur in conditions that will not lead to successful co-culture, suggesting they are essential for adapting to and surviving in the presence of others.
    IMPORTANCE In 1882, Robert Koch and Fanny Hesse developed the agar plate, which enabled microbiologists to separate individual microbial cells from each other and create monocultures of a single strain of bacteria. This powerful tool has been used in the almost 150 years since to develop a robust understanding of how bacterial cells are structured, how they manage and process their information, and how they respond to the environment to produce behaviors that match their circumstances. We were curious about how the behavior of bacteria, as measured by their gene expression, changes between well-studied monoculture conditions and co-culture. We found that only specific growth conditions permit co-culture and that bacteria change their metabolic strategies in the presence of a partner.