Title: Metagenomic DNA sequencing to quantify Mycobacterium tuberculosis DNA and diagnose tuberculosis
Abstract

Tuberculosis (TB) remains a significant cause of mortality worldwide. Metagenomic next-generation sequencing has the potential to reveal biomarkers of active disease, identify coinfection, and improve detection for sputum-scarce or culture-negative cases. We conducted a large-scale comparative study of 428 plasma, urine, and oral swab samples from 334 individuals from TB endemic and non-endemic regions to evaluate the utility of a shotgun metagenomic DNA sequencing assay for tuberculosis diagnosis. We found that the composition of the control population had a strong impact on the measured performance of the diagnostic test: the use of a control population composed of individuals from a TB non-endemic region led to a test with nearly 100% specificity and sensitivity, whereas a control group composed of individuals from TB endemic regions exhibited a high background of nontuberculous mycobacterial DNA, limiting the diagnostic performance of the test. Using mathematical modeling and quantitative comparisons to matched qPCR data, we found that the burden of Mycobacterium tuberculosis DNA constitutes a very small fraction (0.04 or less) of the total abundance of DNA originating from mycobacteria in samples from TB endemic regions. Our findings suggest that the utility of a minimally invasive metagenomic sequencing assay for pulmonary tuberculosis diagnostics is limited by the low burden of M. tuberculosis and an overwhelming biological background of nontuberculous mycobacterial DNA.

 
NSF-PAR ID:
10373714
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ; ;
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Scientific Reports
Volume:
12
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Background

    Blood-based biomarkers for diagnosing active tuberculosis (TB), monitoring treatment response, and predicting risk of progression to TB disease have been reported. However, validation of the biomarkers across multiple independent cohorts is scarce. A robust platform to validate TB biomarkers in different populations with clinical end points is essential to the development of a point-of-care clinical test. NanoString nCounter technology is an amplification-free digital detection platform that directly measures mRNA transcripts with high specificity. Here, we determined whether NanoString could serve as a platform for extensive validation of candidate TB biomarkers.

    Methods

    The NanoString platform was used for performance evaluation of existing TB gene signatures in a cohort in which signatures were previously evaluated on an RNA-seq dataset. A NanoString codeset that probes 107 genes comprising 12 TB signatures and 6 housekeeping genes (NS-TB107) was developed and applied to total RNA derived from whole blood samples of TB patients and individuals with latent TB infection (LTBI) from South India. The TBSignatureProfiler tool was used to score samples for each signature. An ensemble of machine learning algorithms was used to derive a parsimonious biomarker.

    Results

    Gene signatures present in NS-TB107 had statistically significant discriminative power for segregating TB from LTBI. Further analysis of the data yielded a NanoString 6-gene set (NANO6) that when tested on 10 published datasets was highly diagnostic for active TB.

    Conclusions

    The NanoString nCounter system provides a robust platform for validating existing TB biomarkers and deriving a parsimonious gene signature with enhanced diagnostic performance.

     
  2. Abstract

    Increasing the speed, specificity, sensitivity, and accessibility of mycobacteria detection tools are important challenges for tuberculosis (TB) research and diagnosis. In this regard, previously reported fluorogenic trehalose analogues have shown potential, but their green‐emitting dyes may limit sensitivity and applications in complex settings. Here, we describe a trehalose‐based fluorogenic probe featuring a molecular rotor turn‐on fluorophore with bright far‐red emission (RMR‐Tre). RMR‐Tre, which exploits the unique biosynthetic enzymes and environment of the mycobacterial outer membrane to achieve fluorescence activation, enables fast, no‐wash, low‐background fluorescence detection of live mycobacteria. Aided by the red‐shifted molecular rotor fluorophore, RMR‐Tre exhibited up to a 100‐fold enhancement in M. tuberculosis labeling compared to existing fluorogenic trehalose probes. We show that RMR‐Tre reports on M. tuberculosis drug resistance in a facile assay, demonstrating its potential as a TB diagnostic tool.

     
  3. Abstract

    Whole‐genome sequencing data allow survey of variation from across the genome, reducing the constraint of balancing genome sub‐sampling with estimating recombination rates and linkage between sampled markers and target loci. As sequencing costs decrease, low‐coverage whole‐genome sequencing of pooled or indexed‐individual samples is commonly utilized to identify loci associated with phenotypes or environmental axes in non‐model organisms. There are, however, relatively few publicly available bioinformatic pipelines designed explicitly to analyse these types of data, and fewer still that process the raw sequencing data, provide useful metrics of quality control and then execute analyses. Here, we present an updated version of a bioinformatics pipeline called PoolParty2 that can effectively handle either pooled or indexed DNA samples and includes new features to improve computational efficiency. Using simulated data, we demonstrate the ability of our pipeline to recover segregating variants, estimate their allele frequencies accurately, and identify genomic regions harbouring loci under selection. Based on the simulated data set, we benchmark the efficacy of our pipeline with another bioinformatic suite, angsd, and illustrate the compatibility and complementarity of these suites using angsd to generate genotype likelihoods as input for identifying linkage outlier regions using alignment files and variants provided by PoolParty2. Finally, we apply our updated pipeline to an empirical dataset of low‐coverage whole genomic data from population samples of Columbia River steelhead trout (Oncorhynchus mykiss), results from which demonstrate the genomic impacts of decades of artificial selection in a prominent hatchery stock. Thus, we not only demonstrate the utility of PoolParty2 for genomic studies that combine sequencing data from multiple individuals, but also illustrate how it complements other bioinformatics resources such as angsd.

  4. Abstract

    Wild specimens are often collected in challenging field conditions, where samples may be contaminated with the DNA of conspecific individuals. This contamination can result in false genotype calls, which are difficult to detect, but may also cause inaccurate estimates of heterozygosity, allele frequencies and genetic differentiation. Marine broadcast spawners are especially problematic, because population genetic differentiation is low and samples are often collected in bulk and sometimes from active spawning aggregations. Here, we used contaminated and clean Pacific herring (Clupea pallasi) samples to test (a) the efficacy of bleach decontamination, (b) the effect of decontamination on RAD genotypes and (c) the consequences of contaminated samples on population genetic analyses. We collected fin tissue samples from actively spawning (and thus contaminated) wild herring and nonspawning (uncontaminated) herring. Samples were soaked for 10 min in bleach or left untreated, and extracted DNA was used to prepare DNA libraries using a restriction site‐associated DNA (RAD) approach. Our results demonstrate that intraspecific DNA contamination affects patterns of individual and population variability, causes an excess of heterozygotes and biases estimates of population structure. Bleach decontamination was effective at removing intraspecific DNA contamination and compatible with RAD sequencing, producing high‐quality sequences, reproducible genotypes and low levels of missing data. Although sperm contamination may be specific to broadcast spawners, intraspecific contamination of samples may be common and difficult to detect from high‐throughput sequencing data and can impact downstream analyses.

     