Reverse transcriptases (RTs) are found in different systems including group II introns, Diversity Generating Retroelements (DGRs), retrons, CRISPR-Cas systems, and Abortive Infection (Abi) systems in prokaryotes. Different classes of RTs can play different roles, such as template switching and mobility in group II introns, spacer acquisition in CRISPR-Cas systems, mutagenic retrohoming in DGRs, programmed cell suicide in Abi systems, and recently discovered phage defense in retrons. While some classes of RTs have been studied extensively, others remain to be characterized. There is a lack of computational tools for identifying and characterizing various classes of RTs. In this study, we built a tool (called myRT) for identification and classification of prokaryotic RTs. In addition, our tool provides information about the genomic neighborhood of each RT, providing potential functional clues. We applied our tool to predict RTs in all complete and draft bacterial genomes, and created a collection that can be used for exploration of putative RTs and their associated protein domains. Application of myRT to metagenomes showed that gut metagenomes encode proportionally more RTs related to DGRs, outnumbering retron-related RTs, as compared to the collection of reference genomes. MyRT is both available as a standalone software (https://github.com/mgtools/myRT) and also throughmore »
Changes in the sequence of an organism’s genome, i.e., mutations, are the raw material of evolution. The frequency and location of mutations can be constrained by specific molecular mechanisms, such as diversity-generating retroelements (DGRs). DGRs have been characterized from cultivated bacteria and bacteriophages, and perform error-prone reverse transcription leading to mutations being introduced in specific target genes. DGR loci were also identified in several metagenomes, but the ecological roles and evolutionary drivers of these DGRs remain poorly understood. Here, we analyze a dataset of >30,000 DGRs from public metagenomes, establish six major lineages of DGRs including three primarily encoded by phages and seemingly used to diversify host attachment proteins, and demonstrate that DGRs are broadly active and responsible for >10% of all amino acid changes in some organisms. Overall, these results highlight the constraints under which DGRs evolve, and elucidate several distinct roles these elements play in natural communities.
- Award ID(s):
- Publication Date:
- NSF-PAR ID:
- Journal Name:
- Nature Communications
- Nature Publishing Group
- Sponsoring Org:
- National Science Foundation
More Like this
Metagenomics has transformed our understanding of microbial diversity across ecosystems, with recent advances enabling
de novoassembly of genomes from metagenomes. These metagenome-assembled genomes are critical to provide ecological, evolutionary, and metabolic context for all the microbes and viruses yet to be cultivated. Metagenomes can now be generated from nanogram to subnanogram amounts of DNA. However, these libraries require several rounds of PCR amplification before sequencing, and recent data suggest these typically yield smaller and more fragmented assemblies than regular metagenomes. Methods
Here we evaluate
de novoassembly methods of 169 PCR-amplified metagenomes, including 25 for which an unamplified counterpart is available, to optimize specific assembly approaches for PCR-amplified libraries. We first evaluated coverage bias by mapping reads from PCR-amplified metagenomes onto reference contigs obtained from unamplified metagenomes of the same samples. Then, we compared different assembly pipelines in terms of assembly size (number of bp in contigs ≥ 10 kb) and error rates to evaluate which are the best suited for PCR-amplified metagenomes. Results
Read mapping analyses revealed that the depth of coverage within individual genomes is significantly more uneven in PCR-amplified datasets versus unamplified metagenomes, with regions of high depth of coverage enriched in short inserts. This enrichment scales with the number of PCRmore »
PCR-amplified metagenomes have enabled scientists to explore communities traditionally challenging to describe, including some with extremely low biomass or from which DNA is particularly difficult to extract. Here we show that a modified assembly pipeline can lead to an improved
de novogenome assembly from PCR-amplified datasets, and enables a better genome recovery from low input metagenomes.
VEBA: a modular end-to-end suite for in silico recovery, clustering, and analysis of prokaryotic, microeukaryotic, and viral genomes from metagenomes
With the advent of metagenomics, the importance of microorganisms and how their interactions are relevant to ecosystem resilience, sustainability, and human health has become evident. Cataloging and preserving biodiversity is paramount not only for the Earth’s natural systems but also for discovering solutions to challenges that we face as a growing civilization. Metagenomics pertains to the in silico study of all microorganisms within an ecological community in situ
,however, many software suites recover only prokaryotes and have limited to no support for viruses and eukaryotes. Results
In this study, we introduce the
Viral Eukaryotic Bacterial Archaeal(VEBA) open-source software suite developed to recover genomes from all domains. To our knowledge, VEBAis the first end-to-end metagenomics suite that can directly recover, quality assess, and classify prokaryotic, eukaryotic, and viral genomes from metagenomes. VEBAimplements a novel iterative binning procedure and hybrid sample-specific/multi-sample framework that yields more genomes than any existing methodology alone. VEBAincludes a consensus microeukaryotic database containing proteins from existing databases to optimize microeukaryotic gene modeling and taxonomic classification. VEBAalso provides a unique clustering-based dereplication strategy allowing for sample-specific genomes and genes to be directly compared across non-overlapping biological samples. Finally, VEBAis the only pipeline that automates the detection of candidate phyla radiation bacteria and implements the appropriate genomemore » Conclusions
VEBAsoftware suite allows for the in silico recovery of microorganisms from all domains of life by integrating cutting edge algorithms in novel ways. VEBAfully integrates both end-to-end and task-specific metagenomic analysis in a modular architecture that minimizes dependencies and maximizes productivity. The contributions of VEBAto the metagenomics community includes seamless end-to-end metagenomics analysis but also provides users with the flexibility to perform specific analytical tasks. VEBAallows for the automation of several metagenomics steps and shows that new information can be recovered from existing datasets.
TCA and SSRI Antidepressants Exert Selection Pressure for Efflux-Dependent Antibiotic Resistance Mechanisms in Escherichia coliBonomo, Robert A. (Ed.)ABSTRACT Microbial diversity is reduced in the gut microbiota of animals and humans treated with selective serotonin reuptake inhibitors (SSRIs) and tricyclic antidepressants (TCAs). The mechanisms driving the changes in microbial composition, while largely unknown, is critical to understand considering that the gut microbiota plays important roles in drug metabolism and brain function. Using Escherichia coli , we show that the SSRI fluoxetine and the TCA amitriptyline exert strong selection pressure for enhanced efflux activity of the AcrAB-TolC pump, a member of the resistance-nodulation-cell division (RND) superfamily of transporters. Sequencing spontaneous fluoxetine- and amitriptyline-resistant mutants revealed mutations in marR and lon, negative regulators of AcrAB-TolC expression. In line with the broad specificity of AcrAB-TolC pumps these mutants conferred resistance to several classes of antibiotics. We show that the converse also occurs, as spontaneous chloramphenicol-resistant mutants displayed cross-resistance to SSRIs and TCAs. Chemical-genomic screens identified deletions in marR and lon, confirming the results observed for the spontaneous resistant mutants. In addition, deletions in 35 genes with no known role in drug resistance were identified that conferred cross-resistance to antibiotics and several displayed enhanced efflux activities. These results indicate that combinations of specific antidepressants and antibiotics may have important effects when bothmore »
SSIPe: accurately estimating protein–protein binding affinity change upon mutations using evolutionary profiles in combination with an optimized physical energy function
Most proteins perform their biological functions through interactions with other proteins in cells. Amino acid mutations, especially those occurring at protein interfaces, can change the stability of protein–protein interactions (PPIs) and impact their functions, which may cause various human diseases. Quantitative estimation of the binding affinity changes (ΔΔGbind) caused by mutations can provide critical information for protein function annotation and genetic disease diagnoses.
We present SSIPe, which combines protein interface profiles, collected from structural and sequence homology searches, with a physics-based energy function for accurate ΔΔGbind estimation. To offset the statistical limits of the PPI structure and sequence databases, amino acid-specific pseudocounts were introduced to enhance the profile accuracy. SSIPe was evaluated on large-scale experimental data containing 2204 mutations from 177 proteins, where training and test datasets were stringently separated with the sequence identity between proteins from the two datasets below 30%. The Pearson correlation coefficient between estimated and experimental ΔΔGbind was 0.61 with a root-mean-square-error of 1.93 kcal/mol, which was significantly better than the other methods. Detailed data analyses revealed that the major advantage of SSIPe over other traditional approaches lies in the novel combination of the physical energy function with the new knowledge-based interface profile. SSIPe also considerablymore »
Availability and implementation
Web-server/standalone program, source code and datasets are freely available at https://zhanglab.ccmb.med.umich.edu/SSIPe and https://github.com/tommyhuangthu/SSIPe.
Supplementary data are available at Bioinformatics online.