Title: Are the predicted known bacterial strains in a sample really present? A case study
With mutations constantly accumulating in bacterial genomes, it is unclear whether previously identified bacterial strains are really present in an extant sample. To address this question, we did a case study on the known strains of the bacterial species S. aureus and S. epidermidis in 68 atopic dermatitis shotgun metagenomic samples. We evaluated the likelihood of the presence of all sixteen known strains predicted in the original study and by two popular tools in this study. We found that even with the same tool, only two known strains were predicted by both the original study and this study. Moreover, none of the sixteen known strains was likely present in these 68 samples. Our study thus indicates the limitation of known-strain-based studies, especially those on rapidly evolving bacterial species. It implies that previously identified known strains are unlikely to be present in a current environmental sample. It also calls for de novo bacterial strain identification directly from shotgun metagenomic reads.
Award ID(s):
2015838
PAR ID:
10640671
Author(s) / Creator(s):
; ; ;
Editor(s):
El_Allali, Achraf
Publisher / Repository:
Public Library of Science
Date Published:
Journal Name:
PLOS ONE
Volume:
18
Issue:
10
ISSN:
1932-6203
Page Range / eLocation ID:
e0291964
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Motivation: Metagenomic binning aims to retrieve microbial genomes directly from ecosystems by clustering metagenomic contigs assembled from short reads into draft genomic bins. Traditional shotgun-based binning methods depend on the contigs’ composition and abundance profiles and are impaired by the paucity of enough samples to construct reliable co-abundance profiles. When applied to a single sample, shotgun-based binning methods struggle to distinguish closely related species only using composition information. As an alternative binning approach, Hi-C-based binning employs metagenomic Hi-C technique to measure the proximity contacts between metagenomic fragments. However, spurious inter-species Hi-C contacts inevitably generated by incorrect ligations of DNA fragments between species link the contigs from varying genomes, weakening the purity of final draft genomic bins. Therefore, it is imperative to develop a binning pipeline to overcome the shortcomings of both types of binning methods on a single sample. Results: We develop HiFine, a novel binning pipeline to refine the binning results of metagenomic contigs by integrating both Hi-C-based and shotgun-based binning tools. HiFine designs a strategy of fragmentation for the original bin sets derived from the Hi-C-based and shotgun-based binning methods, which considerably increases the purity of initial bins, followed by merging fragmented bins and recruiting unbinned contigs. We demonstrate that HiFine significantly improves the existing binning results of both types of binning methods and achieves better performance in constructing species genomes on publicly available datasets. To the best of our knowledge, HiFine is the first pipeline to integrate different types of tools for the binning of metagenomic contigs. Availability and implementation: HiFine is available at https://github.com/dyxstat/HiFine. Supplementary information: Supplementary data are available at Bioinformatics online.
  2. Hayer, Juliette (Ed.)
    Staphylococcus aureus causes both hospital- and community-acquired infections in humans worldwide. Due to the high incidence of infection, S. aureus is also one of the most sampled and sequenced pathogens today, providing an outstanding resource to understand variation at the bacterial subspecies level. We processed and downsampled 83,383 public S. aureus Illumina whole-genome shotgun sequences and 1,263 complete genomes to produce 7,954 representative substrains. Pairwise comparison of average nucleotide identity revealed a natural boundary of 99.5% that could be used to define 145 distinct strains within the species. We found that intermediate frequency genes in the pangenome (present in 10%–95% of genomes) could be divided into those closely linked to strain background (“strain-concentrated”) and those highly variable within strains (“strain-diffuse”). Non-core genes had different patterns of chromosome location. Notably, strain-diffuse genes were associated with prophages; strain-concentrated genes were associated with the vSaβ genome island; and rare genes (<10% frequency) were concentrated near the origin of replication. Antibiotic resistance genes were enriched in the strain-diffuse class, while virulence genes were distributed between strain-diffuse, strain-concentrated, core, and rare classes. This study shows how different patterns of gene movement help create strains as distinct subspecies entities and provides insight into the diverse histories of important S. aureus functions.
  3. Abstract The CHAB-I-5 cluster is a pelagic lineage that can comprise a significant proportion of all roseobacters in surface oceans and has predicted roles in biogeochemical cycling via heterotrophy, aerobic anoxygenic photosynthesis (AAnP), CO oxidation, DMSP degradation, and other metabolisms. Though cultures of CHAB-I-5 have been reported, none have been explored, and the best known representative, strain SB2, was lost from culture after its genome sequence was obtained. We have isolated two new CHAB-I-5 representatives, strains US3C007 and FZCC0083, and assembled complete, circularized genomes with 98.7% and 92.5% average nucleotide identities with the SB2 genome. Comparison of these three with 49 other unique CHAB-I-5 metagenome-assembled and single-cell genomes indicated that the cluster represents a genus with two species, and we identified subtle differences in genomic content between the two species subclusters. Metagenomic recruitment from over fourteen hundred samples expanded their known global distribution and highlighted both isolated strains as representative members of the clade. FZCC0083 grew over twice as fast as US3C007 and over a wider range of temperatures. The axenic culture of US3C007 occurs as pleomorphic cells with most exhibiting a coccobacillus/vibrioid shape. We propose the name Thalassovivens spotae, gen. nov., sp. nov., for the type strain US3C007T.
  4. Abstract Shotgun sequencing is routinely employed to study bacteria in microbial communities. With the vast amount of shotgun sequencing reads generated in a metagenomic project, it is crucial to determine the microbial composition at the strain level. This study investigated 20 computational tools that attempt to infer bacterial strain genomes from shotgun reads. For the first time, we discussed the methodology behind these tools. We also systematically evaluated six novel-strain-targeting tools on the same datasets and found that BHap, mixtureS and StrainFinder performed better than other tools. Because the performance of the best tools is still suboptimal, we discussed future directions that may address the limitations. 
  5. Background: Despite more than 60 years of research, the etiology of bacterial vaginosis (BV) remains controversial. In this pilot study, we used shotgun metagenomic sequencing to characterize vaginal microbial community changes before the development of incident BV (iBV). Methods: A cohort of African American women with a baseline healthy vaginal microbiome (no Amsel criteria, Nugent score 0–3 with no Gardnerella vaginalis morphotypes) were followed for 90 days with daily self-collected vaginal specimens for iBV (≥2 consecutive days of a Nugent score of 7–10). Shotgun metagenomic sequencing was performed on select vaginal specimens from 4 women, every other day for 12 days before iBV diagnosis. Sequencing data were analyzed through Kraken2 and bioBakery 3 workflows, and specimens were classified into community state types. Quantitative polymerase chain reaction was performed to compare the correlation of read counts with bacterial abundance. Results: Common BV-associated bacteria such as G. vaginalis, Prevotella bivia, and Fannyhessea vaginae were increasingly identified in the participants before iBV. Linear modeling indicated significant increases in G. vaginalis and F. vaginae relative abundance before iBV, whereas the relative abundance of Lactobacillus species declined over time. The Lactobacillus species decline correlated with the presence of Lactobacillus phages. We observed enrichment in bacterial adhesion factor genes on days before iBV. There were also significant correlations between bacterial read counts and abundances measured by quantitative polymerase chain reaction. Conclusions: This pilot study characterizes vaginal community dynamics before iBV and identifies key bacterial taxa and mechanisms potentially involved in the pathogenesis of iBV.