Title: Simultaneous Profiling of Gene Expression and Chromatin Accessibility in Single Cells
Abstract

Profiling multiple omic layers in a single cell enables the discovery and analysis of biological phenomena that are not apparent from analysis of mono‐omic data. While methods for multiomic profiling have been reported, their adoption has been limited due to high cost and complex workflows. Here, a simple method for joint profiling of gene expression and chromatin accessibility in tens to hundreds of single cells is presented. Assessed herein is the quality of resulting single cell ATAC‐ and RNA‐seq data across three cell types, examining the link between accessibility and expression at the CD3G and FTH1 loci in human primary T cells and monocytes, and comparing the accuracy of clustering solutions for mono‐omic and combined data. The new method allows biological laboratories to perform simultaneous profiling of gene expression and chromatin accessibility using standard reagents and instrumentation. This technique, in conjunction with other advances in multiomic profiling, will enable highly resolved cell state classification and more specific mechanistic hypothesis generation than is possible with mono‐omic analysis.

 
PAR ID:
10459357
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Advanced Biosystems
Volume:
3
Issue:
11
ISSN:
2366-7478
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Genome-wide profiling of chromatin accessibility by DNase-seq or ATAC-seq has been widely used to identify regulatory DNA elements and transcription factor binding sites. However, enzymatic DNA cleavage exhibits intrinsic sequence biases that confound chromatin accessibility profiling data analysis. Existing computational tools are limited in their ability to account for such intrinsic biases and not designed for analyzing single-cell data. Here, we present Simplex Encoded Linear Model for Accessible Chromatin (SELMA), a computational method for systematic estimation of intrinsic cleavage biases from genomic chromatin accessibility profiling data. We demonstrate that SELMA yields accurate and robust bias estimation from both bulk and single-cell DNase-seq and ATAC-seq data. SELMA can utilize internal mitochondrial DNA data to improve bias estimation. We show that transcription factor binding inference from DNase footprints can be improved by incorporating estimated biases using SELMA. Furthermore, we show strong effects of intrinsic biases in single-cell ATAC-seq data, and develop the first single-cell ATAC-seq intrinsic bias correction model to improve cell clustering. SELMA can enhance the performance of existing bioinformatics tools and improve the analysis of both bulk and single-cell chromatin accessibility sequencing data.

     
  2. Abstract

    Single-cell technologies enable researchers to investigate cell functions at an individual cell level and study cellular processes with higher resolution. Several multi-omics single-cell sequencing techniques have been developed to explore various aspects of cellular behavior. NEAT-seq, for example, simultaneously obtains three kinds of omics data for each cell: gene expression, chromatin accessibility, and protein expression of transcription factors (TFs). Consequently, NEAT-seq offers a more comprehensive understanding of cellular activities in multiple modalities. However, there is a lack of tools available for effectively integrating the three types of omics data. To address this gap, we propose a novel pipeline called MultiSC for the analysis of MULTIomic Single-Cell data. Our pipeline leverages a multimodal constraint autoencoder (single-cell hierarchical constraint autoencoder) to integrate the multi-omics data during the clustering process and a matrix factorization–based model (scMF) to predict target genes regulated by a TF. Moreover, we utilize multivariate linear regression models to predict gene regulatory networks from the multi-omics data. Additional functionalities, including differential expression, mediation analysis, and causal inference, are also incorporated into the MultiSC pipeline. Extensive experiments were conducted to evaluate the performance of MultiSC. The results demonstrate that our pipeline enables researchers to gain a comprehensive view of cell activities and gene regulatory networks by fully leveraging the potential of multiomics single-cell data. By employing MultiSC, researchers can effectively integrate and analyze diverse omics data types, enhancing their understanding of cellular processes.

     
  3. Abstract

    Single cell profiling techniques including multi-omics and spatial-omics technologies allow researchers to study cell-cell variation within a cell population. These variations extend to biological networks within cells, in particular, the gene regulatory networks (GRNs). GRNs rewire as the cells evolve, and different cells can have different governing GRNs. However, existing GRN inference methods usually infer a single GRN for a population of cells, without exploring the cell-cell variation in terms of their regulatory mechanisms. Recently, jointly profiled single cell transcriptomics and chromatin accessibility data have been used to infer GRNs. Although methods based on such multi-omics data were shown to improve over the accuracy of methods using only single cell RNA-seq (scRNA-seq) data, they do not take full advantage of the single cell resolution chromatin accessibility data.

    We propose CeSpGRN (Cell Specific Gene Regulatory Network inference), which infers cell-specific GRNs from scRNA-seq, single cell multi-omics, or single cell spatial-omics data. CeSpGRN uses a Gaussian weighted kernel that allows the GRN of a given cell to be learned from the sequencing profile of itself and its neighboring cells in the developmental process. The kernel is constructed from the similarity of gene expressions or spatial locations between cells. When the chromatin accessibility data is available, CeSpGRN constructs cell-specific prior networks which are used to further improve the inference accuracy.

    We applied CeSpGRN to various types of real-world datasets and inferred various regulation changes that were shown to be important in cell development. We also quantitatively measured the performance of CeSpGRN on simulated datasets and compared it with baseline methods. The results show that CeSpGRN has a superior performance in reconstructing the GRN for each cell, as well as in detecting the regulatory interactions that differ between cells. CeSpGRN is available at https://github.com/PeterZZQ/CeSpGRN.

     
  4. Dubrovsky, Joseph (Ed.)
    Abstract

    A fundamental question in developmental biology is how the progeny of stem cells become differentiated tissues. The Arabidopsis root is a tractable model to address this question due to its simple organization and defined cell lineages. In particular, the zone of dividing cells at the root tip—the root apical meristem—presents an opportunity to map the gene regulatory networks underlying stem cell niche maintenance, tissue patterning, and cell identity acquisition. To identify molecular regulators of these processes, studies over the last 20 years employed global profiling of gene expression patterns. However, these technologies are prone to information loss due to averaging gene expression signatures over multiple cell types and/or developmental stages. Recently developed high-throughput methods to profile gene expression at single-cell resolution have been successfully applied to plants. Here, we review insights from the first published single-cell mRNA sequencing and chromatin accessibility datasets generated from Arabidopsis roots. These studies successfully reconstruct developmental trajectories, phenotype cell identity mutants at unprecedented resolution, and reveal cell type-specific responses to environmental stimuli. The experimental insight gained from Arabidopsis paves the way to profile roots from additional species.

     
  5. Abstract

    Insulin-producing β cells created from human pluripotent stem cells have potential as a therapy for insulin-dependent diabetes, but human pluripotent stem cell-derived islets (SC-islets) still differ from their in vivo counterparts. To better understand the state of cell types within SC-islets and identify lineage specification deficiencies, we used single-nucleus multi-omic sequencing to analyse chromatin accessibility and transcriptional profiles of SC-islets and primary human islets. Here we provide an analysis that enabled the derivation of gene lists and activity for identifying each SC-islet cell type compared with primary islets. Within SC-islets, we found that the difference between β cells and awry enterochromaffin-like cells is a gradient of cell states rather than a stark difference in identity. Furthermore, transplantation of SC-islets in vivo improved cellular identities over time, while long-term in vitro culture did not. Collectively, our results highlight the importance of chromatin and transcriptional landscapes during islet cell specification and maturation.

     