Title: Improved detection of disease-associated gut microbes using 16S sequence-based biomarkers
Abstract
Background
Sequencing partial 16S rRNA genes is a cost-effective method for quantifying the microbial composition of an environment, such as the human gut. However, downstream analysis relies on binning reads into microbial groups by either considering each unique sequence as a different microbe, querying a database to get taxonomic labels from sequences, or clustering similar sequences together. These approaches do not fully capture evolutionary relationships between microbes, limiting the ability to identify differentially abundant groups of microbes between a diseased and a control cohort. We present sequence-based biomarkers (SBBs), an aggregation method that groups microbes using single variants and combinations of variants within their 16S sequences. We compare SBBs against existing aggregation methods (OTU clustering and Micropheno or DiTaxa features) in several benchmarking tasks: biomarker discovery via permutation test, biomarker discovery via linear discriminant analysis, and phenotype prediction power. We demonstrate that SBBs perform on par with or better than state-of-the-art methods in biomarker discovery and phenotype prediction.
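The idea of grouping microbes by a shared single variant can be illustrated with a minimal sketch. It assumes the 16S sequences are already aligned to a common length and that a per-sample count table is available; the function name `aggregate_by_variant` and the toy data are hypothetical, and the sketch covers single variants only, not the combinations of variants that SBBs also use. It is not the authors' implementation.

```python
from collections import defaultdict


def aggregate_by_variant(aligned_seqs, counts):
    """Sum per-sample counts over all sequences sharing a base at a position.

    aligned_seqs: dict mapping sequence id -> aligned sequence (equal lengths)
    counts: dict mapping sequence id -> list of per-sample read counts
    Returns a dict mapping (position, base) -> list of aggregated counts.
    """
    n_samples = len(next(iter(counts.values())))
    groups = defaultdict(lambda: [0] * n_samples)
    for seq_id, seq in aligned_seqs.items():
        for pos, base in enumerate(seq):
            if base == "-":  # skip alignment gaps
                continue
            row = groups[(pos, base)]
            for sample_idx, count in enumerate(counts[seq_id]):
                row[sample_idx] += count
    return dict(groups)


# Toy data (hypothetical): three aligned sequences, two samples.
aligned = {"seqA": "ACGT", "seqB": "ACGA", "seqC": "TCGA"}
counts = {"seqA": [10, 0], "seqB": [5, 2], "seqC": [0, 7]}
features = aggregate_by_variant(aligned, counts)
```

Each key such as `(0, "A")` then acts as one candidate feature whose abundance is the total of every sequence carrying that variant, so closely related sequences contribute to the same group even when they would fall into different clusters.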
Results
On two independent datasets, SBBs identify differentially abundant groups of microbes with similar or higher statistical significance than existing methods, both in a permutation-test-based analysis and using linear discriminant analysis effect size. By grouping microbes by SBB, we identify several differentially abundant microbial groups (FDR < 0.1) between children with autism and neurotypical controls in a set of 115 discordant siblings. Porphyromonadaceae, Ruminococcaceae, and an unnamed species of Blastocystis were significantly enriched in autism, while Veillonellaceae was significantly depleted. Likewise, aggregating microbes by SBB on a dataset of obese and lean twins, we find several significantly differentially abundant microbial groups (FDR < 0.1). We observed Megasphaera and Sutterellaceae highly enriched in obesity, and Phocaeicola significantly depleted. SBBs also perform on par with or better than existing aggregation methods as features in a phenotype prediction model, predicting the autism phenotype with an ROC-AUC score of 0.64 and the obesity phenotype with an ROC-AUC score of 0.84.
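A permutation test with false discovery rate control, as referred to above, can be sketched as follows. This is a generic, unpaired label-shuffling test on a difference in means followed by the Benjamini-Hochberg adjustment; the exact test statistic and permutation scheme used in the paper are not given in the abstract, so every choice here (function names, statistic, number of permutations) is an assumption for illustration.

```python
import random
from statistics import mean


def permutation_pvalue(case, control, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(case) - mean(control))
    pooled = list(case) + list(control)
    n_case = len(case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:n_case]) - mean(pooled[n_case:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy abundances (hypothetical) for one microbial group in two cohorts.
p_separated = permutation_pvalue([5, 6, 7, 8, 9, 10], [0, 1, 1, 2, 0, 1])
p_identical = permutation_pvalue([1, 2, 3], [1, 2, 3])
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

In practice one p-value would be computed per aggregated group, and groups whose adjusted value falls below the chosen threshold (0.1 in the abstract) would be reported as differentially abundant.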
Conclusions
SBBs provide a powerful method for aggregating microbes to perform differential abundance analysis as well as phenotype prediction. Our source code can be freely downloaded from http://github.com/briannachrisman/16s_biomarkers.
Differential abundance analysis is an essential and commonly used tool to characterize the difference between microbial communities. However, identifying differentially abundant microbes remains a challenging problem because the observed microbiome data are inherently compositional, excessively sparse, and distorted by experimental bias. Besides these major challenges, the results of differential abundance analysis also depend largely on the choice of analysis unit, adding another practical complexity to this already complicated problem.
Results
In this work, we introduce a new differential abundance test called the MsRDB test, which embeds the sequences into a metric space and integrates a multiscale adaptive strategy for utilizing spatial structure to identify differentially abundant microbes. Compared with existing methods, the MsRDB test can detect differentially abundant microbes at the finest resolution offered by data and provide adequate detection power while being robust to zero counts, compositional effect, and experimental bias in the microbial compositional dataset. Applications to both simulated and real microbial compositional datasets demonstrate the usefulness of the MsRDB test.
Availability and implementation
All analyses can be found under https://github.com/lakerwsl/MsRDB-Manuscript-Code.
Microbiomes are now recognized as the main drivers of ecosystem function ranging from the oceans and soils to humans and bioreactors. However, a grand challenge in microbiome science is to characterize and quantify the chemical currencies of organic matter (i.e., metabolites) that microbes respond to and alter. Critical to this has been the development of Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS), which has drastically increased molecular characterization of complex organic matter samples, but challenges users with hundreds of millions of data points where readily available, user-friendly, and customizable software tools are lacking.
Results
Here, we build on years of analytical experience with diverse sample types to develop MetaboDirect, an open-source, command-line-based pipeline for the analysis (e.g., chemodiversity analysis, multivariate statistics), visualization (e.g., Van Krevelen diagrams, elemental and molecular class composition plots), and presentation of direct injection high-resolution FT-ICR MS data sets after molecular formula assignment has been performed. When compared to other available FT-ICR MS software, MetaboDirect is superior in that it requires a single line of code to launch a fully automated framework for the generation and visualization of a wide range of plots, with minimal coding experience required. Among the tools evaluated, MetaboDirect is also uniquely able to automatically generate biochemical transformation networks (ab initio) based on mass differences (mass difference network-based approach) that provide an experimental assessment of metabolite connections within a given sample or a complex metabolic system, thereby providing important information about the nature of the samples and the set of microbial reactions or pathways that gave rise to them. Finally, for more experienced users, MetaboDirect allows users to customize plots, outputs, and analyses.
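The mass-difference network idea can be sketched in a few lines: two peaks are connected when their mass difference matches the mass shift of a known transformation within a tolerance. The transformation list, the tolerance, and the function name below are illustrative assumptions and do not reflect MetaboDirect's own transformation database or interface; only the monoisotopic mass shifts themselves are standard values.

```python
from itertools import combinations

# Monoisotopic mass shifts in Da for a few illustrative transformations.
TRANSFORMATIONS = {
    "methylation (+CH2)": 14.01565,
    "hydrogenation (+H2)": 2.01565,
    "hydroxylation (+O)": 15.99491,
    "hydration (+H2O)": 18.01056,
    "carboxylation (+CO2)": 43.98983,
}


def mass_difference_edges(masses, tol=0.001):
    """Return (lighter, heavier, transformation) for every matching pair."""
    edges = []
    for a, b in combinations(sorted(masses), 2):
        delta = b - a
        for name, shift in TRANSFORMATIONS.items():
            if abs(delta - shift) <= tol:
                edges.append((a, b, name))
    return edges


# Toy peak list (hypothetical): a C6H12O6 mass, two derived masses,
# and one unrelated peak that should stay unconnected.
peaks = [180.06339, 194.07904, 198.07395, 250.00000]
edges = mass_difference_edges(peaks)
```

The resulting edge list can be loaded into any graph library; the all-pairs comparison is quadratic in the number of peaks, so a real data set with many thousands of formulas would need a sorted or binned lookup instead.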
Conclusion
Application of MetaboDirect to FT-ICR MS-based metabolomic data sets from a marine phage-bacterial infection experiment and a Sphagnum leachate microbiome incubation experiment showcases the exploration capabilities of the pipeline, which will enable the research community to evaluate and interpret their data in greater depth and in less time. It will further advance our knowledge of how microbial communities influence and are influenced by the chemical makeup of the surrounding system. The source code and user's guide of MetaboDirect are freely available at https://github.com/Coayala/MetaboDirect and https://metabodirect.readthedocs.io/en/latest/, respectively.
Frare, Carla; Jenkins, Mackenzie E.; McClure, Kelsey M.; Drew, Kelly L. (Journal of Neurochemistry)
Abstract
Hibernation is a seasonal phenomenon characterized by a drop in metabolic rate and body temperature. Adenosine A1 receptor agonists promote hibernation in different mammalian species, and the understanding of the mechanism inducing hibernation will inform clinical strategies to manipulate metabolic demand that are fundamental to conditions such as obesity, metabolic syndrome, and therapeutic hypothermia. Adenosine A1 receptor agonist‐induced hibernation in Arctic ground squirrels is regulated by an endogenous circannual (seasonal) rhythm. This study aims to identify the neuronal mechanism underlying the seasonal difference in response to the adenosine A1 receptor agonist. Arctic ground squirrels were implanted with body temperature transmitters and housed at constant ambient temperature (2°C) and light cycle (4L:20D). We administered CHA (N6‐cyclohexyladenosine), an adenosine A1 receptor agonist, in the euthermic‐summer phenotype and the euthermic‐winter phenotype and used cFos and phenotypic immunoreactivity to identify cell groups affected by season and treatment. We observed lower core and subcutaneous temperature in winter animals, and CHA produced a hibernation‐like response in winter, but not in summer. cFos‐ir was greater in the median preoptic nucleus and the raphe pallidus in summer after CHA. CHA administration also resulted in enhanced cFos‐ir in the nucleus tractus solitarius and decreased cFos‐ir in the tuberomammillary nucleus in both seasons. In winter, cFos‐ir was greater in the supraoptic nucleus and lower in the raphe pallidus than in summer. The seasonal decrease in the thermogenic response to CHA and the seasonal increase in vasoconstriction, assessed by subcutaneous temperature, reflect the endogenous seasonal modulation of the thermoregulatory systems necessary for CHA‐induced hibernation.
Cover Image for this issue: doi:10.1111/jnc.14528.
Properties of molecules are indicative of their functions and thus are useful in many applications. With the advances of deep-learning methods, computational approaches for predicting molecular properties are gaining increasing momentum. However, customized, advanced methods and comprehensive tools for this task are currently lacking.
Results
Here, we develop a suite of comprehensive machine-learning methods and tools spanning different computational models, molecular representations and loss functions for molecular property prediction and drug discovery. Specifically, we represent molecules as both graphs and sequences. Built on these representations, we develop novel deep models for learning from molecular graphs and sequences. In order to learn effectively from highly imbalanced datasets, we develop advanced loss functions that optimize areas under precision–recall curves (PRCs) and receiver operating characteristic (ROC) curves. Altogether, our work not only serves as a comprehensive tool, but also contributes toward developing novel and advanced graph and sequence-learning methodologies. Results on both online and offline antibiotics discovery and molecular property prediction tasks show that our methods achieve consistent improvements over prior methods. In particular, our methods achieve #1 ranking in terms of both ROC-AUC (area under curve) and PRC-AUC on the AI Cures open challenge for drug discovery related to COVID-19.
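The link between ROC-AUC and a trainable loss can be made concrete with a small sketch. ROC-AUC equals the fraction of positive-negative pairs in which the positive example receives the higher score, and a common way to optimize it is to replace that step function with a smooth pairwise surrogate. The pairwise squared hinge below is one generic surrogate chosen for illustration; it is not claimed to be the loss implemented in the paper or in MoleculeX, and the function names are hypothetical.

```python
def roc_auc(scores, labels):
    """ROC-AUC as the fraction of correctly ranked positive-negative pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def pairwise_squared_hinge(scores, labels, margin=1.0):
    """Surrogate loss: penalize pairs where positive - negative < margin."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    total = sum(max(0.0, margin - (p - n)) ** 2 for p in pos for n in neg)
    return total / (len(pos) * len(neg))


# Toy predictions (hypothetical).
auc_mixed = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
loss_separated = pairwise_squared_hinge([2.0, 1.5, 0.0, -0.5], [1, 1, 0, 0])
```

Because the loss is defined over pairs rather than individual examples, it is insensitive to the class ratio, which is why pairwise objectives of this kind are attractive for highly imbalanced datasets.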
Availability and implementation
Our source code is released as part of the MoleculeX library (https://github.com/divelab/MoleculeX) under AdvProp.
Supplementary information
Supplementary data are available at Bioinformatics online.
Polycyclic aromatic hydrocarbons (PAHs) are common toxic and carcinogenic pollutants in marine ecosystems. Despite their prevalence in these habitats, relatively little is known about the natural microflora and biochemical pathways that contribute to their degradation. Approaches to investigate marine microbial PAH degraders often heavily rely on genetic biomarkers, which requires prior knowledge of specific degradative enzymes and genes encoding them. As such, these biomarker-reliant approaches cannot efficiently identify novel degradation pathways or degraders. Here, we screen 18 marine bacterial strains representing the Pseudomonadota, Bacillota, and Bacteroidota phyla for degradation of two model PAHs, pyrene (high molecular weight) and phenanthrene (low molecular weight). Using a qualitative PAH plate screening assay, we determined that 16 of 18 strains show some ability to degrade either or both compounds. Degradative ability was subsequently confirmed with a quantitative high-performance liquid chromatography approach, where an additional strain showed some degradation in liquid culture. Several members of the prominent marine Roseobacteraceae family degraded pyrene and phenanthrene with varying efficiency (1.2%–29.6% and 5.2%–52.2%, respectively) over 26 days. Described PAH genetic biomarkers were absent in all PAH degrading strains for which genome sequences are available, suggesting that these strains harbor novel transformation pathways. These results demonstrate the utility of culture-based approaches in expanding the knowledge landscape concerning PAH degradation in marine systems.
IMPORTANCE
Polycyclic aromatic hydrocarbon (PAH) pollution is widespread throughout marine environments and significantly affects native flora and fauna. Investigating microbes responsible for degrading PAHs in these environments provides a greater understanding of natural attenuation in these systems. In addition, the use of culture-based approaches to inform bioinformatic and omics-based approaches is useful in identifying novel mechanisms of PAH degradation that elude genetic biomarker-based investigations. Furthermore, culture-based approaches allow for the study of PAH co-metabolism, which increasingly appears to be a prominent mechanism for PAH degradation in marine microbes.
Chrisman, Brianna S., Paskov, Kelley M., Stockham, Nate, Jung, Jae-Yoon, Varma, Maya, Washington, Peter Y., Tataru, Christine, Iwai, Shoko, DeSantis, Todd Z., David, Maude, and Wall, Dennis P.
"Improved detection of disease-associated gut microbes using 16S sequence-based biomarkers". BMC Bioinformatics 22 (1). Country unknown/Code not available: Springer Science + Business Media. https://doi.org/10.1186/s12859-021-04427-7. https://par.nsf.gov/biblio/10305018.
@article{osti_10305018,
place = {Country unknown/Code not available},
title = {Improved detection of disease-associated gut microbes using 16S sequence-based biomarkers},
url = {https://par.nsf.gov/biblio/10305018},
DOI = {10.1186/s12859-021-04427-7},
abstractNote = {Abstract. Background: Sequencing partial 16S rRNA genes is a cost-effective method for quantifying the microbial composition of an environment, such as the human gut. However, downstream analysis relies on binning reads into microbial groups by either considering each unique sequence as a different microbe, querying a database to get taxonomic labels from sequences, or clustering similar sequences together. These approaches do not fully capture evolutionary relationships between microbes, limiting the ability to identify differentially abundant groups of microbes between a diseased and a control cohort. We present sequence-based biomarkers (SBBs), an aggregation method that groups microbes using single variants and combinations of variants within their 16S sequences. We compare SBBs against existing aggregation methods (OTU clustering and Micropheno or DiTaxa features) in several benchmarking tasks: biomarker discovery via permutation test, biomarker discovery via linear discriminant analysis, and phenotype prediction power. We demonstrate that SBBs perform on par with or better than state-of-the-art methods in biomarker discovery and phenotype prediction. Results: On two independent datasets, SBBs identify differentially abundant groups of microbes with similar or higher statistical significance than existing methods, both in a permutation-test-based analysis and using linear discriminant analysis effect size. By grouping microbes by SBB, we identify several differentially abundant microbial groups (FDR < 0.1) between children with autism and neurotypical controls in a set of 115 discordant siblings. Porphyromonadaceae, Ruminococcaceae, and an unnamed species of Blastocystis were significantly enriched in autism, while Veillonellaceae was significantly depleted. Likewise, aggregating microbes by SBB on a dataset of obese and lean twins, we find several significantly differentially abundant microbial groups (FDR < 0.1). We observed Megasphaera and Sutterellaceae highly enriched in obesity, and Phocaeicola significantly depleted. SBBs also perform on par with or better than existing aggregation methods as features in a phenotype prediction model, predicting the autism phenotype with an ROC-AUC score of 0.64 and the obesity phenotype with an ROC-AUC score of 0.84. Conclusions: SBBs provide a powerful method for aggregating microbes to perform differential abundance analysis as well as phenotype prediction. Our source code can be freely downloaded from http://github.com/briannachrisman/16s_biomarkers.},
journal = {BMC Bioinformatics},
volume = {22},
number = {1},
publisher = {Springer Science + Business Media},
author = {Chrisman, Brianna S. and Paskov, Kelley M. and Stockham, Nate and Jung, Jae-Yoon and Varma, Maya and Washington, Peter Y. and Tataru, Christine and Iwai, Shoko and DeSantis, Todd Z. and David, Maude and Wall, Dennis P.},
}