
Title: scifi-ATAC-seq: massive-scale single-cell chromatin accessibility sequencing using combinatorial fluidic indexing

Single-cell ATAC-seq has emerged as a powerful approach for revealing candidate cis-regulatory elements genome-wide at cell-type resolution. However, current single-cell methods suffer from limited throughput and high costs. Here, we present a novel technique called scifi-ATAC-seq, single-cell combinatorial fluidic indexing ATAC-sequencing, which combines a barcoded Tn5 pre-indexing step with droplet-based single-cell ATAC-seq using the 10X Genomics platform. With scifi-ATAC-seq, up to 200,000 nuclei across multiple samples can be indexed in a single emulsion reaction, representing an approximately 20-fold increase in throughput compared to the standard 10X Genomics workflow.
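The reason a Tn5 pre-indexing step raises throughput can be illustrated with a small back-of-the-envelope calculation. The sketch below is not taken from the paper. It assumes nuclei fall into droplets uniformly at random and receive Tn5 barcodes uniformly at random, so two nuclei are confounded only when they share both the droplet barcode and the Tn5 barcode. The droplet count (100,000) and the number of Tn5 barcodes (96) are hypothetical values chosen for illustration, and the 10,000-nuclei baseline is simply 200,000 divided by the roughly 20-fold gain stated above.

```python
def collision_rate(n_nuclei: int, n_droplets: int, n_tn5_barcodes: int = 1) -> float:
    """Expected fraction of nuclei that share a (droplet, Tn5) barcode pair
    with at least one other nucleus, under uniform random assignment."""
    # Every (droplet barcode, Tn5 barcode) pair is one distinguishable "slot".
    n_combinations = n_droplets * n_tn5_barcodes
    # Probability that none of the other n-1 nuclei lands in the same slot.
    p_alone = (1.0 - 1.0 / n_combinations) ** (n_nuclei - 1)
    return 1.0 - p_alone


# Hypothetical parameters, not values reported in the abstract.
ASSUMED_DROPLETS = 100_000
ASSUMED_TN5_BARCODES = 96

# Baseline: about 10,000 nuclei, droplet barcode only.
standard_rate = collision_rate(10_000, ASSUMED_DROPLETS)

# Pre-indexed: 200,000 nuclei, droplet barcode combined with Tn5 barcode.
scifi_rate = collision_rate(200_000, ASSUMED_DROPLETS, ASSUMED_TN5_BARCODES)

print(f"droplet barcode only, 10,000 nuclei: {standard_rate:.3f}")
print(f"droplet + Tn5 barcode, 200,000 nuclei: {scifi_rate:.3f}")
```

Under these assumptions, loading 20 times more nuclei with pre-indexing gives a lower expected collision rate than the baseline, because the number of distinguishable barcode combinations grows by the number of Tn5 barcodes. Real droplet loading is not perfectly uniform, so the figures are indicative only.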

Award ID(s):
1856627 2026554
Publisher / Repository:
Springer Science + Business Media
Journal Name:
Genome Biology
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Genome-wide profiling of chromatin accessibility by DNase-seq or ATAC-seq has been widely used to identify regulatory DNA elements and transcription factor binding sites. However, enzymatic DNA cleavage exhibits intrinsic sequence biases that confound chromatin accessibility profiling data analysis. Existing computational tools are limited in their ability to account for such intrinsic biases and are not designed for analyzing single-cell data. Here, we present Simplex Encoded Linear Model for Accessible Chromatin (SELMA), a computational method for systematic estimation of intrinsic cleavage biases from genomic chromatin accessibility profiling data. We demonstrate that SELMA yields accurate and robust bias estimation from both bulk and single-cell DNase-seq and ATAC-seq data. SELMA can utilize internal mitochondrial DNA data to improve bias estimation. We show that transcription factor binding inference from DNase footprints can be improved by incorporating estimated biases using SELMA. Furthermore, we show strong effects of intrinsic biases in single-cell ATAC-seq data, and develop the first single-cell ATAC-seq intrinsic bias correction model to improve cell clustering. SELMA can enhance the performance of existing bioinformatics tools and improve the analysis of both bulk and single-cell chromatin accessibility sequencing data.

  2. Abstract

    Distinguishing biological from technical variation is crucial when integrating and comparing single-cell genomics datasets across different experiments. Existing methods cannot explicitly distinguish these two sources of variation, which often leads to the removal of both. Here, we present scMC, an integration method that removes technical variation while preserving intrinsic biological variation. scMC learns biological variation via variance analysis to subtract technical variation inferred in an unsupervised manner. Application of scMC to both simulated and real datasets from single-cell RNA-seq and ATAC-seq experiments demonstrates its capability of detecting context-shared and context-specific biological signals via accurate alignment.
  3. Abstract

    Benchmarking single-cell RNA-seq (scRNA-seq) and single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq) computational tools demands simulators to generate realistic sequencing reads. However, none of the few read simulators aim to mimic real data. To fill this gap, we introduce scReadSim, a single-cell RNA-seq and ATAC-seq read simulator that allows user-specified ground truths and generates synthetic sequencing reads (in a FASTQ or BAM file) by mimicking real data. At both read-sequence and read-count levels, scReadSim mimics real scRNA-seq and scATAC-seq data. Moreover, scReadSim provides ground truths, including unique molecular identifier (UMI) counts for scRNA-seq and open chromatin regions for scATAC-seq. In particular, scReadSim allows users to design cell-type-specific ground-truth open chromatin regions for scATAC-seq data generation. In benchmark applications of scReadSim, we show that UMI-tools achieves the top accuracy in scRNA-seq UMI deduplication, and HMMRATAC and MACS3 achieve the top performance in scATAC-seq peak calling.

  4. Abstract Background

    The genetic information contained in the genome of an organism is organized in genes and regulatory elements that control gene expression. The genomes of multiple plant species have already been sequenced and their gene repertoires annotated; however, cis-regulatory elements remain less characterized, limiting our understanding of genome functionality. These elements act as open platforms for recruiting both positive- and negative-acting transcription factors, and as such, chromatin accessibility is an important signature for their identification.


    In this work, we developed a transgenic INTACT [isolation of nuclei tagged in specific cell types] system in tetraploid wheat for nuclei purification. We then combined the INTACT system with the assay for transposase-accessible chromatin with sequencing [ATAC-seq] to identify open chromatin regions in wheat root tip samples. Our ATAC-seq results showed a large enrichment of open chromatin regions in intergenic and promoter regions, which is expected for regulatory elements and is similar to ATAC-seq results obtained in other plant species. In addition, root ATAC-seq peaks showed a significant overlap with previously published ATAC-seq data from wheat leaf protoplasts, indicating high reproducibility between the two experiments and a large overlap between open chromatin regions in root and leaf tissues. Importantly, we observed overlap between ATAC-seq peaks and cis-regulatory elements that have been functionally validated in wheat, and a good correlation between normalized accessibility and gene expression levels.


    We have developed and validated an INTACT system in tetraploid wheat that allows rapid and high-quality nuclei purification from root tips. Those nuclei were successfully used to perform ATAC-seq experiments that revealed open chromatin regions in the wheat genome, which will be useful for identifying cis-regulatory elements. The INTACT system presented here will facilitate the development of ATAC-seq datasets in other tissues, growth stages, and under different growing conditions to generate a more complete landscape of the accessible DNA regions in the wheat genome.

  5. Abstract

    Droplet‐based single cell sequencing technologies, such as inDrop, Drop‐seq, and 10X Genomics, are catalyzing a revolution in the understanding of biology. Barcoding beads are key components for these technologies. What remains limiting is the availability of barcoding beads that are easy to fabricate and can efficiently deliver primers into drops, thereby achieving high detection efficiency. Here, this work reports an approach to fabricate dissolvable polyacrylamide beads by crosslinking acrylamide with disulfide bridges that can be cleaved with dithiothreitol. The beads can be rapidly dissolved in drops and release DNA barcode primers. The dissolvable beads are easy to synthesize, and the primer cost for the beads is significantly lower than that for the previous barcoding beads. Furthermore, the dissolvable beads can be loaded into drops with >95% loading efficiency of a single bead per drop, and the dissolution of beads does not influence reverse transcription or the polymerase chain reaction (PCR) in drops. Based on this approach, the dissolvable beads are used for single cell RNA and protein analysis.
