Search for: All records

Award ID contains: 2038509

  1. Abstract

    Throughout the COVID-19 pandemic, massive sequencing and data sharing efforts enabled the real-time surveillance of novel SARS-CoV-2 strains throughout the world, the results of which provided public health officials with actionable information to prevent the spread of the virus. However, with great sequencing comes great computation, and while cloud computing platforms bring high-performance computing directly into the hands of all who seek it, optimal design and configuration of a cloud compute cluster requires significant system administration expertise. We developed ViReflow, a user-friendly viral consensus sequence reconstruction pipeline enabling rapid analysis of viral sequence datasets leveraging Amazon Web Services (AWS) cloud compute resources and the Reflow system. ViReflow was developed specifically in response to the COVID-19 pandemic, but it is general to any viral pathogen. Importantly, when utilized with sufficient compute resources, ViReflow can trim, map, call variants, and call consensus sequences from amplicon sequence data from 1000 SARS-CoV-2 samples at 1000X depth in < 10 min, with no user intervention. ViReflow’s simplicity, flexibility, and scalability make it an ideal tool for viral molecular epidemiological efforts.

  2. Abstract

    Graves’ Disease is the most common organ-specific autoimmune disease and has been linked in small pilot studies to taxonomic markers within the gut microbiome. Important limitations of this work include small sample sizes and low-resolution taxonomic markers. Accordingly, we studied 162 gut microbiomes of mild and severe Graves’ disease (GD) patients and healthy controls. Taxonomic and functional analyses based on metagenome-assembled genomes (MAGs) and MAG-annotated genes, together with predicted metabolic functions and metabolite profiles, revealed a well-defined network of MAGs, genes and clinical indexes separating healthy from GD subjects. A supervised classification model identified a combination of biomarkers including microbial species, MAGs, genes and SNPs, with predictive power superior to models from any single biomarker type (AUC = 0.98). Global, cross-disease multi-cohort analysis of gut microbiomes revealed high specificity of these GD biomarkers, notably discriminating against Parkinson’s Disease, and suggesting that non-invasive stool-based diagnostics will be useful for these diseases.

  3. Greene, Casey S. (Ed.)
    ABSTRACT UniFrac is an important tool in microbiome research that is used for phylogenetically comparing microbiome profiles to one another (beta diversity). Striped UniFrac recently added the ability to split the problem into many independent subproblems, exhibiting nearly linear scaling but suffering from memory contention. Here, we adapt UniFrac to graphics processing units using OpenACC, enabling greater than 1,000× computational improvement, and apply it to 307,237 samples, the largest 16S rRNA V4 uniformly preprocessed microbiome data set analyzed to date. IMPORTANCE UniFrac is an important tool in microbiome research that is used for phylogenetically comparing microbiome profiles to one another. Here, we adapt UniFrac to operate on graphics processing units, enabling a 1,000× computational improvement. To highlight this advance, we perform what may be the largest microbiome analysis to date, applying UniFrac to 307,237 16S rRNA V4 microbiome samples preprocessed with Deblur. These scaling improvements turn UniFrac into a real-time tool for common data sets and unlock new research questions as more microbiome data are collected.
    Free, publicly-accessible full text available June 28, 2023
  4. Free, publicly-accessible full text available April 1, 2023
  5. Monitoring wastewater samples at building-level resolution screens large populations for SARS-CoV-2, prioritizing testing and isolation efforts. Here we perform untargeted metatranscriptomics on virally-enriched wastewater samples from 10 locations on the UC San Diego campus, demonstrating that resulting bacterial taxonomic and functional profiles discriminate SARS-CoV-2 status even without direct detection of viral transcripts. Our proof-of-principle reveals emergent threats through changes in the human microbiome, suggesting new approaches for untargeted wastewater-based epidemiology.
    Free, publicly-accessible full text available February 24, 2023
  6. Microbiome studies have recently transitioned from experimental designs with a few hundred samples to designs spanning tens of thousands of samples. Modern studies such as the Earth Microbiome Project (EMP) afford the statistics crucial for untangling the many factors that influence microbial community composition. Analyzing those data used to require access to a compute cluster, making it both expensive and inconvenient. We show that recent improvements in both hardware and software now make it possible to run key bioinformatics tasks on EMP-sized data in minutes using a gaming-class laptop, enabling much faster and broader microbiome science insights.
  7. We introduce Operational Genomic Unit (OGU), a metagenome analysis strategy that directly exploits sequence alignment hits to individual reference genomes as the minimum unit for assessing the diversity of microbial communities and their relevance to environmental factors. This approach is independent from taxonomic classification, granting the possibility of maximal resolution of community composition, and organizes features into an accurate hierarchy using a phylogenomic tree. The outputs are suitable for contemporary analytical protocols for community ecology, differential abundance and supervised learning while supporting phylogenetic methods, such as UniFrac and phylofactorization, that are seldom applied to shotgun metagenomics despite being prevalent in 16S rRNA gene amplicon studies. As demonstrated in one synthetic and two real-world case studies, the OGU method produces biologically meaningful patterns from microbiome datasets. Such patterns further remain detectable at very low metagenomic sequencing depths. Compared with taxonomic unit-based analyses implemented in currently adopted metagenomics tools, and the analysis of 16S rRNA gene amplicon sequence variants, this method shows superiority in informing biologically relevant insights, including stronger correlation with body environment and host sex on the Human Microbiome Project dataset, and more accurate prediction of human age by the gut microbiomes in the Finnish population. We provide Woltka, a bioinformatics tool to implement this method, with full integration with the QIIME 2 package and the Qiita web platform, to facilitate OGU adoption in future metagenomics studies. Importance Shotgun metagenomics is a powerful, yet computationally challenging, technique compared to 16S rRNA gene amplicon sequencing for decoding the composition and structure of microbial communities. However, current analyses of metagenomic data are primarily based on taxonomic classification, which is limited in feature resolution compared to 16S rRNA amplicon sequence variant analysis. To solve these challenges, we introduce Operational Genomic Units (OGUs), which are the individual reference genomes derived from sequence alignment results, without further assigning them taxonomy. The OGU method advances current read-based metagenomics in two dimensions: (i) providing maximal resolution of community composition while (ii) permitting use of phylogeny-aware tools. Our analysis of real-world datasets shows several advantages over currently adopted metagenomic analysis methods and the finest-grained 16S rRNA analysis methods in predicting biological traits. We thus propose the adoption of OGU as standard practice in metagenomic studies.
  8. The human microbiota has a close relationship with human disease and it remodels components of the glycocalyx including heparan sulfate (HS). Studies of the severe acute respiratory syndrome coronavirus (SARS-CoV-2) spike protein receptor binding domain suggest that infection requires binding to HS and angiotensin converting enzyme 2 (ACE2) in a codependent manner. Here, we show that commensal host bacterial communities can modify HS and thereby modulate SARS-CoV-2 spike protein binding and that these communities change with host age and sex. Common human-associated commensal bacteria whose genomes encode HS-modifying enzymes were identified. The prevalence of these bacteria and the expression of key microbial glycosidases in bronchoalveolar lavage fluid (BALF) was lower in adult COVID-19 patients than in healthy controls. The presence of HS-modifying bacteria decreased with age in two large survey datasets, FINRISK 2002 and American Gut, revealing one possible mechanism for the observed increase in COVID-19 susceptibility with age. In vitro, bacterial glycosidases from unpurified culture media supernatants fully blocked SARS-CoV-2 spike binding to human H1299 protein lung adenocarcinoma cells. HS-modifying bacteria in human microbial communities may regulate viral adhesion, and loss of these commensals could predispose individuals to infection. Understanding the impact of shifts in microbial community composition and bacterial lyases on SARS-CoV-2 infection may lead to new therapeutics and diagnosis of susceptibility.