Title: Detecting patterns of accessory genome coevolution in Staphylococcus aureus using data from thousands of genomes
Abstract Bacterial genomes exhibit widespread horizontal gene transfer, resulting in highly variable genome content that complicates the inference of genetic interactions. In this study, we develop a method for detecting coevolving genes from large datasets of bacterial genomes based on pairwise comparisons of closely related individuals, analogous to a pedigree study in eukaryotic populations. We apply our method to pairs of genes from the Staphylococcus aureus accessory genome of over 75,000 annotated gene families using a database of over 40,000 whole genomes. We find many pairs of genes that appear to be gained or lost in a coordinated manner, as well as pairs where the gain of one gene is associated with the loss of the other. These pairs form networks of rapidly coevolving genes, primarily consisting of genes involved in virulence, mechanisms of horizontal gene transfer, and antibiotic resistance, particularly the SCCmec complex. While we focus on gene gain and loss, our method can also detect genes that tend to acquire substitutions in tandem, or genotype-phenotype or phenotype-phenotype coevolution. Finally, we present the R package that allows for the computation of our method.
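The idea of comparing closely related genome pairs can be illustrated with a minimal Python sketch. It is not the statistic or the R package described in the abstract; the function name, the data layout, and the simple product-of-differences score are all assumptions made for illustration. For each pair of closely related genomes, it records whether each gene differs in presence between the two genomes, then sums the products of those differences: a positive total suggests the two genes tend to be gained or lost together, and a negative total suggests the gain of one accompanies the loss of the other.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: genome name -> {gene family name -> 1 (present) or 0 (absent)}
PresenceTable = Dict[str, Dict[str, int]]


def pairwise_coevolution_score(
    presence: PresenceTable,
    close_pairs: List[Tuple[str, str]],
    gene_a: str,
    gene_b: str,
) -> int:
    """Illustrative score (an assumption, not the published method).

    For every pair of closely related genomes, take the presence
    difference of each gene (-1, 0, or +1) and sum the products.
    The product does not depend on the order of the two genomes in a
    pair, because swapping them flips the sign of both differences.
    """
    score = 0
    for genome_1, genome_2 in close_pairs:
        diff_a = presence[genome_1][gene_a] - presence[genome_2][gene_a]
        diff_b = presence[genome_1][gene_b] - presence[genome_2][gene_b]
        score += diff_a * diff_b
    return score


# Toy data, invented purely for illustration.
toy_presence: PresenceTable = {
    "g1": {"geneA": 1, "geneB": 1, "geneC": 0},
    "g2": {"geneA": 0, "geneB": 0, "geneC": 1},
    "g3": {"geneA": 1, "geneB": 1, "geneC": 0},
    "g4": {"geneA": 1, "geneB": 0, "geneC": 0},
}
toy_pairs = [("g1", "g2"), ("g3", "g4")]

coordinated = pairwise_coevolution_score(toy_presence, toy_pairs, "geneA", "geneB")
opposed = pairwise_coevolution_score(toy_presence, toy_pairs, "geneA", "geneC")
```

A real analysis would also need a way to choose the closely related pairs and a significance test for the score, neither of which is specified in the abstract.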
Award ID(s):
2146260
PAR ID:
10421306
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Springer Science + Business Media
Date Published:
Journal Name:
BMC Bioinformatics
Volume:
24
Issue:
1
ISSN:
1471-2105
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Motivation: The study of bacterial genome dynamics is vital for understanding the mechanisms underlying microbial adaptation, growth, and their impact on host phenotype. Structural variants (SVs), genomic alterations of 50 base pairs or more, play a pivotal role in driving evolutionary processes and maintaining genomic heterogeneity within bacterial populations. While SV detection in isolate genomes is relatively straightforward, metagenomes present broader challenges due to the absence of clear reference genomes and the presence of mixed strains. In response, our proposed method, rhea, forgoes reference genomes and metagenome-assembled genomes (MAGs) by encompassing all metagenomic samples in a series (time or other metric) into a single co-assembly graph. The log fold change in graph coverage between successive samples is then calculated to call SVs that are thriving or declining. Results: We show rhea to outperform existing methods for SV and horizontal gene transfer (HGT) detection in two simulated mock metagenomes, particularly as the simulated reads diverge from reference genomes and an increase in strain diversity is incorporated. We additionally demonstrate use cases for rhea on series metagenomic data of environmental and fermented food microbiomes to detect specific sequence alterations between successive time and temperature samples, suggesting host advantage. Our approach leverages previous work in assembly graph structural and coverage patterns to provide versatility in studying SVs across diverse and poorly characterized microbial communities for more comprehensive insights into microbial gene flux. Availability and implementation: rhea is open source and available at: https://github.com/treangenlab/rhea.
  2. Rogers, Rebekah (Ed.)
    Abstract Wolbachia are a genus of widespread bacterial endosymbionts in which some strains can hijack or manipulate arthropod host reproduction. Male killing is one such manipulation in which these maternally transmitted bacteria benefit surviving daughters in part by removing competition with the sons for scarce resources. Despite previous findings of interesting genome features of microbial sex ratio distorters, the population genomics of male-killers remain largely uncharacterized. Here, we uncover several unique features of the genome and population genomics of four Arizonan populations of a male-killing Wolbachia strain, wInn, that infects mushroom-feeding Drosophila innubila. We first compared the wInn genome with other closely related Wolbachia genomes of Drosophila hosts in terms of genome content and confirm that the wInn genome is largely similar in overall gene content to the wMel strain infecting D. melanogaster. However, it also contains many unique genes and repetitive genetic elements that indicate lateral gene transfers between wInn and non-Drosophila eukaryotes. We also find that, in line with literature precedent, genes in the Wolbachia prophage and Octomom regions are under positive selection. Of all the genes under positive selection, many also show evidence of recent horizontal transfer among Wolbachia symbiont genomes. These dynamics of selection and horizontal gene transfer across the genomes of several Wolbachia strains and diverse host species may be important underlying factors in Wolbachia’s success as a male-killer of divergent host species. 
  3. Kolodny, Rachel (Ed.)
    Phylogenomic studies of prokaryotic taxa often assume conserved marker genes are homologous across their length. However, processes such as horizontal gene transfer or gene duplication and loss may disrupt this homology by recombining only parts of genes, causing gene fission or fusion. We show using simulation that it is necessary to delineate homology groups in a set of bacterial genomes without relying on gene annotations to define the boundaries of homologous regions. To solve this problem, we have developed a graph-based algorithm to partition a set of bacterial genomes into Maximal Homologous Groups of sequences ( MHGs ) where each MHG is a maximal set of maximum-length sequences which are homologous across the entire sequence alignment. We applied our algorithm to a dataset of 19 Enterobacteriaceae species and found that MHGs cover much greater proportions of genomes than markers and, relatedly, are less biased in terms of the functions of the genes they cover. We zoomed in on the correlation between each individual marker and their overlapping MHGs, and show that few phylogenetic splits supported by the markers are supported by the MHGs while many marker-supported splits are contradicted by the MHGs. A comparison of the species tree inferred from marker genes with the species tree inferred from MHGs suggests that the increased bias and lack of genome coverage by markers causes incorrect inferences as to the overall relationship between bacterial taxa. 
  4. Stelkens, Rike (Ed.)
    Horizontal gene transfer (HGT) is a major contributor to bacterial genome evolution, generating phenotypic diversity, driving the expansion of protein families, and facilitating the evolution of new phenotypes, new metabolic pathways, and new species. Comparative studies of gene gain in bacteria suggest that the frequency with which individual genes successfully undergo HGT varies considerably and may be associated with the number of protein–protein interactions in which the gene participates, that is, its connectivity. Two nonexclusive hypotheses have emerged to explain why transferability should decrease with connectivity: the complexity hypothesis (Jain R, Rivera MC, Lake JA. 1999. Horizontal gene transfer among genomes: the complexity hypothesis. Proc Natl Acad Sci U S A. 96:3801–3806.) and the balance hypothesis (Papp B, Pál C, Hurst LD. 2003. Dosage sensitivity and the evolution of gene families in yeast. Nature 424:194–197.). These hypotheses predict that the functional costs of HGT arise from a failure of divergent homologs to make normal protein–protein interactions or from gene misexpression, respectively. Here we describe genome-wide assessments of these hypotheses in which we used 74 existing prokaryotic whole genome shotgun libraries to estimate rates of horizontal transfer of genes from taxonomically diverse prokaryotic donors into Escherichia coli. We show that 1) transferability declines as connectivity increases, 2) transferability declines as the divergence between donor and recipient orthologs increases, and that 3) the magnitude of this negative effect of divergence on transferability increases with connectivity. These effects are particularly robust among the translational proteins, which span the widest range of connectivities. Whereas the complexity hypothesis explains all three of these observations, the balance hypothesis explains only the first one. 
  5. Abstract Pan-genome analyses of metagenome-assembled genomes (MAGs) may suffer from the known issues with MAGs: fragmentation, incompleteness and contamination. Here, we conducted a critical assessment of pan-genomics of MAGs, by comparing pan-genome analysis results of complete bacterial genomes and simulated MAGs. We found that incompleteness led to significant core gene (CG) loss. The CG loss remained when using different pan-genome analysis tools (Roary, BPGA, Anvi’o) and when using a mixture of MAGs and complete genomes. Contamination had little effect on core genome size (except for Roary, due to its gene clustering issue) but had a major influence on accessory genomes. Importantly, the CG loss was partially alleviated by lowering the CG threshold and using gene prediction algorithms that consider fragmented genes, but to a lesser degree when incompleteness was higher than 5%. The CG loss also led to incorrect pan-genome functional predictions and inaccurate phylogenetic trees. Our main findings were supported by a study of real MAG-isolate genome data. We conclude that lowering the CG threshold and predicting genes in metagenome mode (as Anvi’o does with Prodigal) are necessary in pan-genome analysis of MAGs. Development of new pan-genome analysis tools specifically for MAGs is needed in future studies.