Title: Microbiome Metadata Standards: Report of the National Microbiome Data Collaborative’s Workshop and Follow-On Activities

Microbiome samples are inherently defined by the environment in which they are found. Therefore, data that provide context and enable interpretation of measurements produced from biological samples, often referred to as metadata, are critical.

 
Award ID(s):
1714276
NSF-PAR ID:
10490403
Author(s) / Creator(s):
Editor(s):
Bucci, Vanni
Publisher / Repository:
American Society for Microbiology
Date Published:
Journal Name:
mSystems
Volume:
6
Issue:
1
ISSN:
2379-5077
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Motivation

    Learning associations of traits with the microbial composition of a set of samples is a fundamental goal in microbiome studies. Recently, machine learning methods have been explored for this goal, with some promise. However, in comparison to other fields, microbiome data are high-dimensional and not abundant, leading to a high-dimensional, low-sample-size, under-determined system. Moreover, microbiome data are often unbalanced and biased. Given such training data, machine learning methods often fail to perform a classification task with sufficient accuracy. Lack of signal is especially problematic when classes are represented in an unbalanced way in the training data, with some classes under-represented. The presence of inter-correlations among subsets of observations further compounds these issues. As a result, machine learning methods have had only limited success in predicting many traits from the microbiome. Data augmentation, which consists of building synthetic samples and adding them to the training data, is a technique that has proved helpful for many machine learning tasks.

    Results

    In this paper, we propose a new data augmentation technique for classifying phenotypes based on the microbiome. Our algorithm, called TADA, uses available data and a statistical generative model to create new samples augmenting existing ones, addressing issues of low-sample-size. In generating new samples, TADA takes into account phylogenetic relationships between microbial species. On two real datasets, we show that adding these synthetic samples to the training set improves the accuracy of downstream classification, especially when the training data have an unbalanced representation of classes.

    Availability and implementation

    TADA is available at https://github.com/tada-alg/TADA.

    Supplementary information

    Supplementary data are available at Bioinformatics online.
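
    As a rough illustration of the augmentation idea described in this abstract, the sketch below balances classes by appending synthetic count vectors. It is not the TADA model: the function name `augment_to_balance`, its parameters, and the plain Dirichlet-multinomial perturbation are assumptions made for illustration, and the sketch ignores the phylogenetic relationships that TADA takes into account.

```python
import random
from collections import Counter


def augment_to_balance(counts, labels, concentration=50.0, pseudocount=0.5, seed=0):
    """Append synthetic samples until every class is as large as the largest one.

    Hypothetical helper for illustration only. Each synthetic sample perturbs
    one real sample of the same class with a Dirichlet-multinomial draw.
    Phylogeny is ignored, so this is NOT the TADA generative model.
    """
    rng = random.Random(seed)
    class_sizes = Counter(labels)
    target = max(class_sizes.values())

    out_counts = [list(row) for row in counts]
    out_labels = list(labels)

    for cls, size in class_sizes.items():
        members = [row for row, lab in zip(counts, labels) if lab == cls]
        for _ in range(target - size):
            template = rng.choice(members)   # real sample to perturb
            depth = sum(template)            # keep its sequencing depth

            # Smooth zeros so every taxon has a positive Dirichlet parameter.
            smoothed = [c + pseudocount for c in template]
            total = sum(smoothed)

            # Dirichlet draw via normalized Gamma variates.
            gammas = [rng.gammavariate(concentration * s / total, 1.0) for s in smoothed]
            gamma_sum = sum(gammas)
            props = [g / gamma_sum for g in gammas]

            # Multinomial draw of `depth` reads over the taxa.
            draws = Counter(rng.choices(range(len(props)), weights=props, k=depth))
            out_counts.append([draws[i] for i in range(len(props))])
            out_labels.append(cls)

    return out_counts, out_labels


if __name__ == "__main__":
    # Toy table: three samples of class "a", one of class "b", three taxa.
    counts = [[10, 0, 5], [8, 2, 5], [9, 1, 5], [0, 12, 3]]
    labels = ["a", "a", "a", "b"]
    new_counts, new_labels = augment_to_balance(counts, labels)
    print(Counter(new_labels))
```

    In this sketch a larger `concentration` keeps synthetic samples closer to their real template, and a smaller one adds more variation. Both values here are arbitrary and would need tuning on real data.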

     
  2. Abstract

    The study of the primate microbiome is critical in understanding the role of the microbial community in the host organism. To be able to isolate the main factors responsible for the differences observed in microbiomes within and between individuals, confounding factors due to technical variations need to be removed. To determine whether alterations due to preservatives outweigh differences due to factors such as host population, host species, body site, and habitat, we tested three methods (no preservative, 96% ethanol, and RNAlater) for preserving wild chimpanzee (fecal), wild lemur (fecal), wild vervet monkey (rectal, oral, nasal, otic, vaginal, and penile), and captive vervet monkey (rectal) samples. All samples were stored below −20°C (short term) at the end of the field day and then at −80°C until DNA extraction. Using 16S rRNA gene sequencing, we show a significant preservative effect on microbiota composition and diversity. Samples stored in ethanol and RNAlater appear to be less different compared with samples not stored in any preservative (none). Our differential analysis revealed significantly higher amounts of Enterococcaceae and Family XI in no-preservative samples, Prevotellaceae and Spirochaetaceae in ethanol- and RNAlater-preserved samples, Oligosphaeraceae in ethanol‐preserved samples, and Defluviitaleaceae in RNAlater-preserved samples. While these preservative effects on the microbiome are not large enough to remove or outweigh the differences arising from biological factors (e.g., host species, body site, and habitat differences), they may promote misleading interpretations if they have large enough effect sizes compared to the biological factors (e.g., host population).

     
  3. Abstract Background

    The Spacecraft Assembly Facility (SAF) at NASA’s Jet Propulsion Laboratory is the primary cleanroom facility used in the construction of some of the planetary protection (PP)-sensitive missions developed by NASA, including the Mars 2020 Perseverance Rover that launched in July 2020. SAF floor samples (n=98) were collected over a 6-month period in 2016, prior to the construction of the Mars rover subsystems, to better understand the temporal and spatial distribution of bacterial populations (total, viable, cultivable, and spore) in this unique cleanroom.

    Results

    Cleanroom samples were examined for total (living and dead) and viable (living only) microbial populations using molecular approaches and cultured isolates employing the traditional NASA standard spore assay (NSA), which predominantly isolated spores. The 130 NSA isolates were represented by 16 bacterial genera, of which 97% were identified as spore-formers via Sanger sequencing. The most spatially abundant isolate was Bacillus subtilis, and the most temporally abundant spore-former was Virgibacillus pantothenticus. The 16S rRNA gene-targeted amplicon sequencing detected 51 additional genera not found in the NSA method. The amplicon sequencing of the samples treated with propidium monoazide (PMA), which would differentiate between viable and dead organisms, revealed a total of 54 genera: 46 viable non-spore-forming genera and 8 viable spore-forming genera in these samples. The microbial diversity generated by the amplicon sequencing corresponded to ~86% non-spore-formers and ~14% spore-formers. The most common spatially distributed genera were Sphingobium, Geobacillus, and Bacillus, whereas temporally distributed common genera were Acinetobacter, Geobacillus, and Bacillus. Single-cell genomics detected 6 genera in the sample analyzed, with the most prominent being Acinetobacter.

    Conclusion

    This study clearly established that detecting spores via NSA does not provide a complete assessment of the cleanliness of spacecraft-associated environments, since it failed to detect several PP-relevant genera that were only recovered via molecular methods. This highlights the importance of a methodological paradigm shift to appropriately monitor bioburden in cleanrooms, not only for the aeronautical industry but also for the pharmaceutical and medical industries, among others, and the need to employ molecular sequencing to complement traditional culture-based assays.

     
  4. Abstract Background

    Empirical field studies allow us to view how ecological and environmental processes shape the biodiversity of our planet, but collecting samples in situ creates inherent challenges. The majority of empirical vertebrate gut microbiome research compares multiple host species against abiotic and biotic factors, increasing the potential for confounding environmental variables. To minimize these confounding factors, we focus on a single species of passerine bird found throughout the geologically complex island of Sulawesi, Indonesia. We assessed the effects of two environmental factors, geographic Areas of Endemism (AOEs) and elevation, as well as host sex on the gut microbiota assemblages of the Sulawesi Babbler, Pellorneum celebense, from three different mountains across the island. Using cloacal swabs, high-throughput amplicon sequencing, and multiple statistical models, we identified the core microbiome and determined the signal of these three factors on microbial composition.

    Results

    The five most prevalent bacterial phyla within the gut microbiome of P. celebense were Proteobacteria (32.6%), Actinobacteria (25.2%), Firmicutes (22.1%), Bacteroidetes (8.7%), and Planctomycetes (2.6%). These results are similar to those identified in prior studies of passeriform microbiomes. Overall, microbiota diversity decreased as elevation increased, irrespective of sex or AOE. A single ASV of Clostridium was enriched in higher elevation samples, while lower elevation samples were enriched with the genera Perlucidibaca (Family Moraxellaceae), Lachnoclostridium (Family Lachnospiraceae), and an unidentified species in the Family Pseudonocardiaceae.

    Conclusions

    While the core microbiota families recovered here are consistent with other passerine studies, the decrease in diversity as elevation increases has only been seen in non-avian hosts. Additionally, the increased abundance of Clostridium at high elevations suggests a potential microbial response to lower oxygen levels. This study emphasizes the importance of incorporating multiple statistical models and abiotic factors such as elevation in empirical microbiome research, and is the first to describe an avian gut microbiome from the island of Sulawesi.

     
  5. Background

    Though the gut microbiome has been associated with efficacy of immunotherapy (ICI) in certain cancers, similar findings have not been identified for microbiomes from other body sites and their correlation to treatment response and immune related adverse events (irAEs) in lung cancer (LC) patients receiving ICIs.

    Methods

    We designed a prospective cohort study conducted from 2018 to 2020 at a single-center academic institution to assess for correlations between the microbiome in various body sites with treatment response and development of irAEs in LC patients treated with ICIs. Patients must have had measurable disease, ECOG 0–2, and good organ function to be included. Data were collected for analysis from January 2019 to October 2020. Patients with histopathologically confirmed, advanced/metastatic LC planned to undergo immunotherapy-based treatment were enrolled between September 2018 and June 2019. Nasal, buccal and gut microbiome samples were obtained prior to initiation of immunotherapy +/− chemotherapy, at development of adverse events (irAEs), and at improvement of irAEs to grade 1 or less.

    Results

    Thirty-seven patients were enrolled, and 34 patients were evaluable for this report. Thirty-two healthy controls (HC) from the same geographic region were included to compare baseline gut microbiota. Compared to HC, LC gut microbiota exhibited significantly lower α-diversity. The gut microbiome of patients who did not suffer irAEs was found to have relative enrichment of Bifidobacterium (p = 0.001) and Desulfovibrio (p = 0.0002). Responders to combined chemoimmunotherapy exhibited increased Clostridiales (p = 0.018) but reduced Rikenellaceae (p = 0.016). In responders to chemoimmunotherapy we also observed enrichment of Finegoldia in nasal microbiome, and increased Megasphaera but reduced Actinobacillus in buccal samples. Longitudinal samples exhibited a trend of α-diversity and certain microbial changes during the development and resolution of irAEs.

    Conclusions

    This pilot study identifies significant differences in the gut microbiome between HC and LC patients, and their correlation to treatment response and irAEs in LC. In addition, it suggests potential predictive utility in nasal and buccal microbiomes, warranting further validation with a larger cohort and mechanistic dissection using preclinical models.