Title: Simultaneous Profiling of Gene Expression and Chromatin Accessibility in Single Cells
Abstract

Profiling multiple omic layers in a single cell enables the discovery and analysis of biological phenomena that are not apparent from analysis of mono‐omic data. While methods for multiomic profiling have been reported, their adoption has been limited due to high cost and complex workflows. Here, a simple method for joint profiling of gene expression and chromatin accessibility in tens to hundreds of single cells is presented. The quality of the resulting single cell ATAC‐ and RNA‐seq data is assessed across three cell types, the link between accessibility and expression is examined at the CD3G and FTH1 loci in human primary T cells and monocytes, and the accuracy of clustering solutions is compared for mono‐omic and combined data. The new method allows biological laboratories to perform simultaneous profiling of gene expression and chromatin accessibility using standard reagents and instrumentation. This technique, in conjunction with other advances in multiomic profiling, will enable highly resolved cell state classification and more specific mechanistic hypothesis generation than is possible with mono‐omic analysis.

 
NSF-PAR ID:
10459357
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Advanced Biosystems
Volume:
3
Issue:
11
ISSN:
2366-7478
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Motivation

    Gene regulatory networks define regulatory relationships between transcription factors and target genes within a biological system, and reconstructing them is essential for understanding cellular growth and function. Methods for inferring and reconstructing networks from genomics data have evolved rapidly over the last decade in response to advances in sequencing technology and machine learning. The scale of data collection has increased dramatically; the largest genome-wide gene expression datasets have grown from thousands of measurements to millions of single cells, and new technologies are on the horizon to increase to tens of millions of cells and above.

    Results

    In this work, we present the Inferelator 3.0, which has been significantly updated to integrate data from distinct cell types to learn context-specific regulatory networks and aggregate them into a shared regulatory network, while retaining the functionality of the previous versions. The Inferelator is able to integrate the largest single-cell datasets and learn cell-type-specific gene regulatory networks. Compared to other network inference methods, the Inferelator learns new and informative Saccharomyces cerevisiae networks from single-cell gene expression data, measured by recovery of a known gold standard. We demonstrate its scaling capabilities by learning networks for multiple distinct neuronal and glial cell types in the developing Mus musculus brain at E18 from a large (1.3 million) single-cell gene expression dataset with paired single-cell chromatin accessibility data.

    Availability and implementation

    The Inferelator software is available on GitHub (https://github.com/flatironinstitute/inferelator) under the MIT license and has been released as Python packages with associated documentation (https://inferelator.readthedocs.io/).

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     
  2. Dubrovsky, Joseph (Ed.)
    Abstract

    A fundamental question in developmental biology is how the progeny of stem cells become differentiated tissues. The Arabidopsis root is a tractable model to address this question due to its simple organization and defined cell lineages. In particular, the zone of dividing cells at the root tip—the root apical meristem—presents an opportunity to map the gene regulatory networks underlying stem cell niche maintenance, tissue patterning, and cell identity acquisition. To identify molecular regulators of these processes, studies over the last 20 years employed global profiling of gene expression patterns. However, these technologies are prone to information loss due to averaging gene expression signatures over multiple cell types and/or developmental stages. Recently developed high-throughput methods to profile gene expression at single-cell resolution have been successfully applied to plants. Here, we review insights from the first published single-cell mRNA sequencing and chromatin accessibility datasets generated from Arabidopsis roots. These studies successfully reconstruct developmental trajectories, phenotype cell identity mutants at unprecedented resolution, and reveal cell type-specific responses to environmental stimuli. The experimental insight gained from Arabidopsis paves the way to profile roots from additional species.

     
  3. Abstract

    Single-cell technologies characterize complex cell populations across multiple data modalities at unprecedented scale and resolution. Multi-omic data for single cell gene expression, in situ hybridization, or single cell chromatin states are increasingly available across diverse tissue types. When isolating specific cell types from a sample of dissociated cells or performing in situ sequencing in collections of heterogeneous cells, one challenging task is to select a small set of informative markers that robustly enable the identification and discrimination of specific cell types or cell states as precisely as possible. Given single cell RNA-seq data and a set of cellular labels to discriminate, scGeneFit selects gene markers that jointly optimize cell label recovery using label-aware compressive classification methods. This results in a substantially more robust and less redundant set of markers than existing methods, most of which identify markers that separate each cell label from the rest. When applied to a data set given a hierarchy of cell types as labels, the markers found by our method improve the recovery of the cell type hierarchy with fewer markers than existing methods using a computationally efficient and principled optimization.

     
  4. Abstract

    Genome-wide profiling of chromatin accessibility by DNase-seq or ATAC-seq has been widely used to identify regulatory DNA elements and transcription factor binding sites. However, enzymatic DNA cleavage exhibits intrinsic sequence biases that confound chromatin accessibility profiling data analysis. Existing computational tools are limited in their ability to account for such intrinsic biases and are not designed for analyzing single-cell data. Here, we present Simplex Encoded Linear Model for Accessible Chromatin (SELMA), a computational method for systematic estimation of intrinsic cleavage biases from genomic chromatin accessibility profiling data. We demonstrate that SELMA yields accurate and robust bias estimation from both bulk and single-cell DNase-seq and ATAC-seq data. SELMA can utilize internal mitochondrial DNA data to improve bias estimation. We show that transcription factor binding inference from DNase footprints can be improved by incorporating estimated biases using SELMA. Furthermore, we show strong effects of intrinsic biases in single-cell ATAC-seq data, and develop the first single-cell ATAC-seq intrinsic bias correction model to improve cell clustering. SELMA can enhance the performance of existing bioinformatics tools and improve the analysis of both bulk and single-cell chromatin accessibility sequencing data.

     
  5. Abstract

    Insulin-producing β cells created from human pluripotent stem cells have potential as a therapy for insulin-dependent diabetes, but human pluripotent stem cell-derived islets (SC-islets) still differ from their in vivo counterparts. To better understand the state of cell types within SC-islets and identify lineage specification deficiencies, we used single-nucleus multi-omic sequencing to analyse chromatin accessibility and transcriptional profiles of SC-islets and primary human islets. Here we provide an analysis that enabled the derivation of gene lists and activity for identifying each SC-islet cell type compared with primary islets. Within SC-islets, we found that the difference between β cells and awry enterochromaffin-like cells is a gradient of cell states rather than a stark difference in identity. Furthermore, transplantation of SC-islets in vivo improved cellular identities over time, while long-term in vitro culture did not. Collectively, our results highlight the importance of chromatin and transcriptional landscapes during islet cell specification and maturation.

     