Title: mobileOG-db: a Manually Curated Database of Protein Families Mediating the Life Cycle of Bacterial Mobile Genetic Elements
ABSTRACT Bacterial mobile genetic elements (MGEs) encode functional modules that perform both core and accessory functions for the element, the latter of which are often only transiently associated with the element. The presence of these accessory genes, which are often close homologs of primarily immobile genes, incurs high rates of false positives and therefore limits the usability of existing MGE databases for annotation. To overcome this limitation, we analyzed 10,776,849 protein sequences derived from eight MGE databases to compile a comprehensive set of 6,140 manually curated protein families that are linked to the “life cycle” (integration/excision, replication/recombination/repair, transfer, stability/transfer/defense, and phage-specific processes) of plasmids, phages, integrative, transposable, and conjugative elements. We overlay experimental information where available to create a tiered annotation scheme that distinguishes high-quality annotations from annotations inferred exclusively through bioinformatic evidence. We additionally provide an MGE-class label for each entry (e.g., plasmid or integrative element) and assign each entry a major and a minor category. The resulting database, mobileOG-db (for mobile orthologous groups), comprises over 700,000 deduplicated sequences encompassing five major mobileOG categories and more than 50 minor categories, providing a structured language and interpretable basis for an array of MGE-centered analyses. mobileOG-db can be accessed at mobileogdb.flsi.cloud.vt.edu/, where users can select, refine, and analyze custom subsets of the dynamic mobilome.
IMPORTANCE The analysis of bacterial mobile genetic elements (MGEs) in genomic data is a critical step toward profiling the root causes of antibiotic resistance, phenotypic or metabolic diversity, and the evolution of bacterial genera. Existing methods for MGE annotation require considerable biological and computational expertise to use properly. To bridge this gap, we systematically analyzed 10,776,849 proteins derived from eight databases of MGEs to identify 6,140 MGE protein families that can serve as candidate hallmarks, i.e., proteins that can be used as “signatures” of MGEs to aid annotation. The resulting resource, mobileOG-db, provides a multilevel classification scheme that encompasses plasmid, phage, integrative, and transposable element protein families categorized into five major mobileOG categories and more than 50 minor categories. mobileOG-db thus provides a rich resource for simple and intuitive element annotation that can be integrated seamlessly into existing MGE detection pipelines and colocalization analyses.
Award ID(s):
2004751 1545756
PAR ID:
10386319
Editor(s):
Nojiri, Hideaki
Journal Name:
Applied and Environmental Microbiology
Volume:
88
Issue:
18
ISSN:
0099-2240
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. McMahon, Katherine (Ed.)
    ABSTRACT Mobile genetic elements (MGEs) drive bacterial evolution, alter gene availability within microbial communities, and facilitate adaptation to ecological niches. In natural systems, bacteria simultaneously possess or encounter multiple MGEs, yet their combined influences on microbial communities are poorly understood. Here, we investigate interactions among MGEs in the marine bacterium Sulfitobacter pontiacus. Two related strains, CB-D and CB-A, each harbor a single prophage. These prophages share high sequence identity with one another and an integration site within the host genome, yet these strains exhibit differences in “spontaneous” prophage induction (SPI) and consequent fitness. To better understand mechanisms underlying variation in SPI between these lysogens, we closed their genomes, which revealed that in addition to harboring different prophage genotypes, CB-A lacks two of the four large, low-copy-number plasmids possessed by CB-D. To assess the relative roles of plasmid content versus prophage genotype on host physiology, a panel of derivative strains varying in MGE content was generated. Characterization of these derivatives revealed a robust link between plasmid content and SPI, regardless of prophage genotype. Strains possessing all four plasmids had undetectable phage in cell-free lysates, while strains lacking either one plasmid (pSpoCB-1) or a combination of two plasmids (pSpoCB-2 and pSpoCB-4) produced high (>10^5 PFU/mL) phage titers. Homologous plasmid sequences were identified in related bacteria, and plasmid and phage genes were found to be widespread in Tara Oceans metagenomic data sets. This suggests that plasmid-dependent stabilization of prophages may be commonplace throughout the oceans.
    IMPORTANCE The consequences of prophage induction on the physiology of microbial populations are varied and include enhanced biofilm formation, conferral of virulence, and increased opportunity for horizontal gene transfer. These traits lead to competitive advantages for lysogenized bacteria and influence bacterial lifestyles in a variety of niches. However, biological controls of “spontaneous” prophage induction, the initiation of phage replication and phage-mediated cell lysis without an overt stressor, are not well understood. In this study, we observed a novel interaction between plasmids and prophages in the marine bacterium Sulfitobacter pontiacus. We found that loss of one or more distinct plasmids, which we show carry genes ubiquitous in the world’s oceans, resulted in a marked increase in prophage induction within lysogenized strains. These results demonstrate cross talk between different mobile genetic elements and have implications for our understanding of the lysogenic-lytic switches of prophages found not only in marine environments, but throughout all ecosystems.
  2. Abstract Background: There is concern that the microbially rich activated sludge environment of wastewater treatment plants (WWTPs) may contribute to the dissemination of antibiotic resistance genes (ARGs). We applied long-read (nanopore) sequencing to profile ARGs and their neighboring genes to illuminate their fate in the activated sludge treatment by comparing their abundance, genetic locations, mobility potential, and bacterial hosts within activated sludge relative to those in influent sewage across five WWTPs from three continents. Results: The abundances (gene copies per Gb of reads, abbreviated gc/Gb) of all ARGs and those carried by putative pathogens decreased 75–90% from influent sewage (192–605 gc/Gb) to activated sludge (31–62 gc/Gb) at all five WWTPs. Long reads enabled quantification of the percent abundance of ARGs with mobility potential (i.e., located on plasmids or co-located with other mobile genetic elements (MGEs)). The abundance of plasmid-associated ARGs decreased at four of five WWTPs (from 40–73 to 31–68%), and ARGs co-located with transposable, integrative, and conjugative element hallmark genes showed similar trends. Most ARG-associated elements decreased 0.35–13.52% while integrative and transposable elements displayed slight increases at two WWTPs (1.4–2.4%). While resistome and taxonomic compositions both shifted significantly, host phyla for chromosomal ARG classes remained relatively consistent, indicating vertical gene transfer via active biomass growth in activated sludge as the key pathway of chromosomal ARG dissemination. Conclusions: Overall, our results suggest that the activated sludge process acted as a barrier against the proliferation of most ARGs, while those that persisted or increased warrant further attention.
  3. Abstract Mobile genetic elements (MGEs), such as plasmids and bacteriophages, are major contributors to the ecology and evolution of host-associated microbes due to a continuum of symbiotic interactions and by mediating gene flow via horizontal gene transmission. However, while myriad studies have investigated relationships between MGEs and variation in fitness among microbial and eukaryotic hosts, few studies have incorporated this variation into the context of MGE evolution and ecology. Combining HiC-resolved metagenomics with the model honey bee worker gut microbiome, we show that the worker gut contains a dense, nested MGE community that exhibits a wide degree of host range variation among microbial hosts. Using measures of gene similarity and synteny, we show that plasmids likely mediate gene flow between individual honey bee colonies, though these plasmids exhibit broad host range variation within their individual microbiomes. We further show that phage-microbe networks exhibit high variation among individual metagenomes, and that phages show broad host range with respect to both the number and phylogenetic distance of their hosts. Finally, we provide evidence that measures of nucleotide variation positively correlate with host range in bee-associated phages, and that functional targets of diversifying selection are partitioned differently between broad and narrow host range phages. Our work underscores the variability of MGE x microbial interactions within host-associated microbial communities and highlights the genomic variation associated with MGE host range diversity.
  4. ABSTRACT Phage-plasmids are unique mobile genetic elements that function as plasmids and temperate phages. While it has been observed that such elements often encode antibiotic resistance genes and defense system genes, little else is known about other functional traits they encode. Further, no study to date has documented their environmental distribution and prevalence. Here, we performed genome sequence mining of public databases of phages and plasmids utilizing a random forest classifier to identify phage-plasmids. We recovered 5,742 unique phage-plasmid genomes from a remarkable array of disparate environments, including human, animal, plant, fungal, soil, sediment, freshwater, wastewater, and saltwater environments. The resulting genomes were used in a comparative sequence analysis, revealing functional traits/accessory genes associated with specific environments. Host-associated elements contained the most defense systems (including CRISPR and anti-CRISPR systems) as well as antibiotic resistance genes, while elements from other environments, such as freshwater and saltwater systems, tended to encode components of various biosynthetic pathways. Interestingly, we identified genes encoding certain functional traits, including anti-CRISPR systems and specific antibiotic resistance genes, that were enriched in phage-plasmids relative to both plasmids and phages. Our results highlight that phage-plasmids are found across a wide array of environments and likely play a role in shaping microbial ecology in a multitude of niches.
    IMPORTANCE Phage-plasmids are a novel, hybrid class of mobile genetic element that retains aspects of both phages and plasmids. However, whether phage-plasmids represent merely a rarity or are instead important players in horizontal gene transfer and other important ecological processes has remained a mystery. Here, we document that these hybrids are encountered across a broad range of distinct environments and encode niche-specific functional traits, including the carriage of antibiotic biosynthesis genes and both CRISPR and anti-CRISPR defense systems. These findings highlight phage-plasmids as an important class of mobile genetic element with diverse roles in multiple distinct ecological niches.
  5. VITTE, Clémentine (Ed.)
    Structural differences between genomes are a major source of genetic variation that contributes to phenotypic differences. Transposable elements, mobile genetic sequences capable of increasing their copy number and propagating themselves within genomes, can generate structural variation. However, their repetitive nature makes it difficult to characterize fine-scale differences in their presence at specific positions, limiting our understanding of their impact on genome variation. Domesticated maize is a particularly good system for exploring the impact of transposable element proliferation as over 70% of the genome is annotated as transposable elements. High-quality transposable element annotations were recently generated for de novo genome assemblies of 26 diverse inbred maize lines. We generated base-pair resolved pairwise alignments between the B73 maize reference genome and the remaining 25 inbred maize line assemblies. From these data, we classified transposable elements as either shared or polymorphic in a given pairwise comparison. Our analysis uncovered substantial structural variation between lines, representing both simple and complex connections between TEs and structural variants. Putative insertions in SNP-depleted regions, which represent recently diverged identity-by-state blocks, suggest some TE families may still be active. However, our analysis reveals that within these recently diverged genomic regions, deletions of transposable elements likely account for more structural variation events and base pairs than insertions. These deletions are often large structural variants containing multiple transposable elements. Combined, our results highlight how transposable elements contribute to structural variation and demonstrate that deletion events are a major contributor to genomic differences.