Title: Single-cell massively-parallel multiplexed microbial sequencing (M3-seq) identifies rare bacterial populations and profiles phage infection
Abstract: Bacterial populations are highly adaptive. They can respond to stress and survive in shifting environments. How the behaviours of individual bacteria vary during stress, however, is poorly understood. To identify and characterize rare bacterial subpopulations, technologies for single-cell transcriptional profiling have been developed. Existing approaches are limited, for example, in the number of cells or transcripts that can be profiled. Due in part to these limitations, few conditions have been studied with these tools. Here we develop massively-parallel, multiplexed, microbial sequencing (M3-seq), a single-cell RNA-sequencing platform for bacteria that pairs combinatorial cell indexing with post hoc rRNA depletion. We show that M3-seq can profile bacterial cells from different species under a range of conditions in single experiments. We then apply M3-seq to hundreds of thousands of cells, revealing rare populations and insights into bet-hedging associated with stress responses and characterizing phage infection.
Award ID(s):
1734030
PAR ID:
10451583
Publisher / Repository:
Nature Publishing Group
Journal Name:
Nature Microbiology
Volume:
8
Issue:
10
ISSN:
2058-5276
Page Range / eLocation ID:
p. 1846-1862
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Single-cell RNA sequencing (scRNA-seq) is a widely used technique for characterizing individual cells and studying gene expression at the single-cell level. Clustering plays a vital role in grouping similar cells together for various downstream analyses. However, the high sparsity and dimensionality of large scRNA-seq data pose challenges to clustering performance. Although several deep learning-based clustering algorithms have been proposed, most existing clustering methods have limitations in capturing the precise distribution types of the data or fully utilizing the relationships between cells, leaving a considerable scope for improving the clustering performance, particularly in detecting rare cell populations from large scRNA-seq data. We introduce DeepScena, a novel single-cell hierarchical clustering tool that fully incorporates nonlinear dimension reduction, negative binomial-based convolutional autoencoder for data fitting, and a self-supervision model for cell similarity enhancement. In comprehensive evaluation using multiple large-scale scRNA-seq datasets, DeepScena consistently outperformed seven popular clustering tools in terms of accuracy. Notably, DeepScena exhibits high proficiency in identifying rare cell populations within large datasets that contain large numbers of clusters. When applied to scRNA-seq data of multiple myeloma cells, DeepScena successfully identified not only previously labeled large cell types but also subpopulations in CD14 monocytes, T cells and natural killer cells, respectively. 
  2. Abstract Motivation: Integrative analysis of large-scale single-cell data collected from diverse cell populations promises an improved understanding of complex biological systems. While several algorithms have been developed for single-cell RNA-sequencing data integration, many lack the scalability to handle large numbers of datasets and/or millions of cells due to their memory and run time requirements. The few tools that can handle large data do so by reducing the computational burden through strategies such as subsampling of the data or selecting a reference dataset to improve computational efficiency and scalability. Such shortcuts, however, hamper the accuracy of downstream analyses, especially those requiring quantitative gene expression information. Results: We present SCEMENT, a SCalablE and Memory-Efficient iNTegration method, to overcome these limitations. Our new parallel algorithm builds upon and extends the linear regression model previously applied in ComBat to an unsupervised sparse matrix setting to enable accurate integration of diverse and large collections of single-cell RNA-sequencing data. Using tens to hundreds of real single-cell RNA-seq datasets, we show that SCEMENT outperforms ComBat as well as FastIntegration and Scanorama in runtime (up to 214× faster) and memory usage (up to 17.5× less). It not only performs batch correction and integration of millions of cells in under 25 min, but also facilitates the discovery of new rare cell types and more robust reconstruction of gene regulatory networks with full quantitative gene expression information. Availability and implementation: Source code freely available for download at https://github.com/AluruLab/scement, implemented in C++ and supported on Linux.
  3. Abstract Background: RNA sequencing is a powerful approach to quantify the genome-wide distribution of mRNA molecules in a population to gain deeper understanding of cellular functions and phenotypes. However, unlike eukaryotic cells, mRNA sequencing of bacterial samples is more challenging due to the absence of a poly-A tail that typically enables efficient capture and enrichment of mRNA from the abundant rRNA molecules in a cell. Moreover, bacterial cells frequently contain 100-fold lower quantities of RNA compared to mammalian cells, which further complicates mRNA sequencing from non-cultivable and non-model bacterial species. To overcome these limitations, we report EMBR-seq (Enrichment of mRNA by Blocked rRNA), a method that efficiently depletes 5S, 16S and 23S rRNA using blocking primers to prevent their amplification. Results: EMBR-seq results in 90% of the sequenced RNA molecules from an E. coli culture deriving from mRNA. We demonstrate that this increased efficiency provides a deeper view of the transcriptome without introducing technical amplification-induced biases. Moreover, compared to recent methods that employ a large array of oligonucleotides to deplete rRNA, EMBR-seq uses a single or a few oligonucleotides per rRNA, thereby making this new technology significantly more cost-effective, especially when applied to varied bacterial species. Finally, compared to existing commercial kits for bacterial rRNA depletion, we show that EMBR-seq can be used to successfully quantify the transcriptome from more than 500-fold lower starting total RNA. Conclusions: EMBR-seq provides an efficient and cost-effective approach to quantify global gene expression profiles from low input bacterial samples.
  4. Abstract Single-cell RNA sequencing (scRNA-seq) enables dissecting cellular heterogeneity in tissues, resulting in numerous biological discoveries. Various computational methods have been devised to delineate cell types by clustering scRNA-seq data, where clusters are often annotated using prior knowledge of marker genes. In addition to identifying pure cell types, several methods have been developed to identify cells undergoing state transitions, which often rely on prior clustering results. The present computational approaches predominantly investigate the local and first-order structures of scRNA-seq data using graph representations, while scRNA-seq data frequently display complex high-dimensional structures. Here, we introduce scGeom, a tool that exploits the multiscale and multidimensional structures in scRNA-seq data by analyzing the geometry and topology through curvature and persistent homology of both cell and gene networks. We demonstrate the utility of these structural features to reflect biological properties and functions in several applications, where we show that curvatures and topological signatures of cell and gene networks can help indicate transition cells and the differentiation potential of cells. We also illustrate that structural characteristics can improve the classification of cell types. 
  5. Pagliara, Stefano (Ed.)
    ABSTRACT: Non-genetic factors can cause significant fluctuations in gene expression levels. Even in a stable environment, this fluctuation leads to cell-to-cell variability in an isogenic population. This phenotypic heterogeneity allows a tiny subset of bacterial cells in a population, called persister cells, to tolerate long-term lethal antibiotic effects by entering into a non-dividing, metabolically repressed state. We occasionally noticed a high variation in persister levels, and to explore this, we tested clonal populations starting from a single cell using a modified Luria-Delbrück fluctuation test. Although we kept the conditions the same, the diversity in persistence level among clones was relatively consistent, varying from ~60- to 100- and ~40- to 70-fold for ampicillin and apramycin, respectively. Then, we divided and diluted each clone to observe whether the same clone had comparable persister levels for more than one generation. Replicates had similar persister levels even when clones were divided, diluted by 1:20, and allowed to grow for approximately five generations. This result explicitly shows a cellular memory passed on for generations and eventually lost when cells are diluted 1:100 and regrown (>seven generations). Our result demonstrates (1) the existence of a small population prepared for stress (“primed cells”) resulting in higher persister numbers; (2) the primed memory state is reproducible and transient, passed down for generations but eventually lost; and (3) a heterogeneous persister population is a result of a transiently primed reversible cell state and not due to a pre-existing genetic mutation.
    IMPORTANCE: Antibiotics have been highly effective in treating lethal infectious diseases for almost a century. However, the increasing threat of antibiotic resistance is again causing these diseases to become life-threatening. The longer a bacterium can survive antibiotics, the more likely it is to develop resistance. Complicating matters is that non-genetic factors can allow bacterial cells with identical DNA to gain transient resistance (also known as persistence). Here, we show that a small fraction of the bacterial population called primed cells can pass down non-genetic information (“memory”) to their offspring, enabling them to survive lethal antibiotics for a long time. However, this memory is eventually lost. These results demonstrate how bacteria can leverage differences among genetically identical cells formed through non-genetic factors to form primed cells with a selective advantage to survive antibiotics.