skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


This content will become publicly available on December 1, 2026

Title: Worm Perturb-Seq: massively parallel whole-animal RNAi and RNA-seq
Award ID(s):
1846216
PAR ID:
10601292
Author(s) / Creator(s):
; ; ; ; ; ; ; ;
Publisher / Repository:
Springer Nature
Date Published:
Journal Name:
Nature Communications
Volume:
16
Issue:
1
ISSN:
2041-1723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    The benefit of integrating batches of genomic data to increase statistical power is often hindered by batch effects, or unwanted variation in data caused by differences in technical factors across batches. It is therefore critical to effectively address batch effects in genomic data to overcome these challenges. Many existing methods for batch effects adjustment assume the data follow a continuous, bell-shaped Gaussian distribution. However in RNA-seq studies the data are typically skewed, over-dispersed counts, so this assumption is not appropriate and may lead to erroneous results. Negative binomial regression models have been used previously to better capture the properties of counts. We developed a batch correction method, ComBat-seq, using a negative binomial regression model that retains the integer nature of count data in RNA-seq studies, making the batch adjusted data compatible with common differential expression software packages that require integer counts. We show in realistic simulations that the ComBat-seq adjusted data results in better statistical power and control of false positives in differential expression compared to data adjusted by the other available methods. We further demonstrated in a real data example that ComBat-seq successfully removes batch effects and recovers the biological signal in the data. 
    more » « less
  2. Abstract Benchmarking single-cell RNA-seq (scRNA-seq) and single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq) computational tools demands simulators to generate realistic sequencing reads. However, none of the few read simulators aim to mimic real data. To fill this gap, we introduce scReadSim, a single-cell RNA-seq and ATAC-seq read simulator that allows user-specified ground truths and generates synthetic sequencing reads (in a FASTQ or BAM file) by mimicking real data. At both read-sequence and read-count levels, scReadSim mimics real scRNA-seq and scATAC-seq data. Moreover, scReadSim provides ground truths, including unique molecular identifier (UMI) counts for scRNA-seq and open chromatin regions for scATAC-seq. In particular, scReadSim allows users to design cell-type-specific ground-truth open chromatin regions for scATAC-seq data generation. In benchmark applications of scReadSim, we show that UMI-tools achieves the top accuracy in scRNA-seq UMI deduplication, and HMMRATAC and MACS3 achieve the top performance in scATAC-seq peak calling. 
    more » « less
  3. Abstract The ability to profile transcriptomes and characterize global gene expression changes has been greatly enabled by the development of RNA sequencing technologies (RNA-seq). However, the process of generating sequencing-compatible cDNA libraries from RNA samples can be time-consuming and expensive, especially for bacterial mRNAs which lack poly(A)-tails that are often used to streamline this process for eukaryotic samples. Compared to the increasing throughput and decreasing cost of sequencing, library preparation has had limited advances. Here, we describe bacterial-multiplexed-seq (BaM-seq), an approach that enables simple barcoding of many bacterial RNA samples that decreases the time and cost of library preparation. We also present targeted-bacterial-multiplexed-seq (TBaM-seq) that allows for differential expression analysis of specific gene panels with over 100-fold enrichment in read coverage. In addition, we introduce the concept of transcriptome redistribution based on TBaM-seq that dramatically reduces the required sequencing depth while still allowing for quantification of both highly and lowly abundant transcripts. These methods accurately measure gene expression changes with high technical reproducibility and agreement with gold standard, lower throughput approaches. Together, use of these library preparation protocols allows for fast, affordable generation of sequencing libraries. 
    more » « less
  4. Abstract It is a challenging task to integrate scRNA-seq and scATAC-seq data obtained from different batches. Existing methods tend to use a pre-defined gene activity matrix to convert the scATAC-seq data into scRNA-seq data. The pre-defined gene activity matrix is often of low quality and does not reflect the dataset-specific relationship between the two data modalities. We propose scDART, a deep learning framework that integrates scRNA-seq and scATAC-seq data and learns cross-modalities relationships simultaneously. Specifically, the design of scDART allows it to preserve cell trajectories in continuous cell populations and can be applied to trajectory inference on integrated data. 
    more » « less
  5. null (Ed.)