Title: Exponential family measurement error models for single-cell CRISPR screens
Summary: CRISPR genome engineering and single-cell RNA sequencing have accelerated biological discovery. Single-cell CRISPR screens unite these two technologies, linking genetic perturbations in individual cells to changes in gene expression and illuminating regulatory networks underlying diseases. Despite their promise, single-cell CRISPR screens present considerable statistical challenges. We demonstrate through theoretical and real data analyses that a standard method for estimation and inference in single-cell CRISPR screens, “thresholded regression”, exhibits attenuation bias and a bias-variance tradeoff as a function of an intrinsic, challenging-to-select tuning parameter. To overcome these difficulties, we introduce GLM-EIV (“GLM-based errors-in-variables”), a new method for single-cell CRISPR screen analysis. GLM-EIV extends the classical errors-in-variables model to responses and noisy predictors that are exponential family-distributed and potentially impacted by the same set of confounding variables. We develop a computational infrastructure to deploy GLM-EIV across hundreds of processors on clouds (e.g. Microsoft Azure) and high-performance clusters. Leveraging this infrastructure, we apply GLM-EIV to analyze two recent, large-scale, single-cell CRISPR screen datasets, yielding several new insights.
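The attenuation bias attributed to thresholded regression in the summary can be illustrated with a toy simulation. The sketch below is not the authors' GLM-EIV method or their analysis code. It assumes a simplified setting with no confounders, Poisson-distributed gRNA and gene expression counts, and hypothetical parameter values (10% of cells perturbed, gRNA means of 1 and 5, baseline expression of 10, a true log fold change of log 0.5). Under these assumptions, the thresholded estimate reduces to the log ratio of mean expression between cells called perturbed and cells called unperturbed, which is the Poisson GLM coefficient for a single binary covariate.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_screen(n_cells, beta1, rng, prob_perturbed=0.1,
                    grna_background=1.0, grna_signal=5.0, expr_baseline=10.0):
    """Simulate (gRNA count, expression count) pairs; all defaults are hypothetical."""
    cells = []
    for _ in range(n_cells):
        perturbed = rng.random() < prob_perturbed  # latent, unobserved in practice
        grna = sample_poisson(grna_signal if perturbed else grna_background, rng)
        expr = sample_poisson(expr_baseline * math.exp(beta1 * perturbed), rng)
        cells.append((grna, expr))
    return cells


def thresholded_estimate(cells, threshold):
    """Call a cell perturbed if its gRNA count >= threshold, then return the
    log ratio of group means (the Poisson GLM slope for a binary covariate)."""
    called = [expr for grna, expr in cells if grna >= threshold]
    uncalled = [expr for grna, expr in cells if grna < threshold]
    return math.log((sum(called) / len(called)) / (sum(uncalled) / len(uncalled)))


rng = random.Random(0)
true_beta1 = math.log(0.5)  # assumed true effect: expression halves when perturbed
cells = simulate_screen(50_000, true_beta1, rng)
for threshold in (1, 3):
    estimate = thresholded_estimate(cells, threshold)
    print(f"threshold={threshold}: estimate={estimate:.3f}, truth={true_beta1:.3f}")
```

In this setup, misclassified cells dilute both groups, so the estimate is pulled toward zero relative to the true effect. A higher threshold misclassifies fewer unperturbed cells but leaves fewer cells in the "perturbed" group, which is one way to picture the bias-variance tradeoff the summary describes.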
Award ID(s):
2113072
PAR ID:
10593649
Publisher / Repository:
Oxford University Press
Journal Name:
Biostatistics
Volume:
25
Issue:
4
ISSN:
1465-4644
Size(s):
p. 1254-1272
Sponsoring Org:
National Science Foundation
More Like this
  1. The ENCODE Consortium’s efforts to annotate noncoding cis-regulatory elements (CREs) have advanced our understanding of gene regulatory landscapes. Pooled, noncoding CRISPR screens offer a systematic approach to investigate cis-regulatory mechanisms. The ENCODE4 Functional Characterization Centers conducted 108 screens in human cell lines, comprising >540,000 perturbations across 24.85 megabases of the genome. Using 332 functionally confirmed CRE–gene links in K562 cells, we established guidelines for screening endogenous noncoding elements with CRISPR interference (CRISPRi), including accurate detection of CREs that exhibit variable, often low, transcriptional effects. Benchmarking five screen analysis tools, we find that CASA produces the most conservative CRE calls and is robust to artifacts of low-specificity single guide RNAs. We uncover a subtle DNA strand bias for CRISPRi in transcribed regions with implications for screen design and analysis. Together, we provide an accessible data resource, predesigned single guide RNAs for targeting 3,275,697 ENCODE SCREEN candidate CREs with CRISPRi and screening guidelines to accelerate functional characterization of the noncoding genome. 
  2. CRISPR screens are used extensively to systematically interrogate the phenotype-to-genotype problem. In contrast to early CRISPR screens, which defined core cell fitness genes, most current efforts now aim to identify context-specific phenotypes that differentiate a cell line, genetic background or condition of interest, such as a drug treatment. While CRISPR-related technologies have shown great promise and a fast pace of innovation, a better understanding of standards and methods for quality assessment of CRISPR screen results is crucial to guide technology development and application. Specifically, many commonly used metrics for quantifying screen quality do not accurately measure the reproducibility of context-specific hits. We highlight the importance of reporting reproducibility statistics that directly relate to the purpose of the screen and suggest the use of metrics that are sensitive to context-specific signal. 
  3. In less than a decade, CRISPR screening has revolutionized forward genetics and cell and molecular biology. Advances in screening technologies, including sgRNA libraries, Cas9‐expressing cell lines, and streamlined sequencing pipelines, have democratized pooled CRISPR screens at genome‐wide scale. Initially, many such screens were survival‐based, identifying essential genes in physiological or perturbed processes. With the application of new chemical biology tools to CRISPR screening, the phenotypic space is no longer limited to live/dead selection or screening for levels of conventional fluorescent protein reporters. Further, the resolution has been increased from cell populations to single cells or even the subcellular level. We highlight advances in pooled CRISPR screening, powered by chemical biology, that have expanded phenotypic space, resolution, scope, and scalability as well as strengthened the CRISPR/Cas enzyme toolkit to enable biological hypothesis generation and discovery.
  4. CRISPR‐Cas9 screens facilitate the discovery of gene functional relationships and phenotype‐specific dependencies. The Cancer Dependency Map (DepMap) is the largest compendium of whole‐genome CRISPR screens aimed at identifying cancer‐specific genetic dependencies across human cell lines. A mitochondria‐associated bias has been previously reported to mask signals for genes involved in other functions, and thus, methods for normalizing this dominant signal to improve co‐essentiality networks are of interest. In this study, we explore three unsupervised dimensionality reduction methods, namely autoencoders and robust and classical principal component analyses (PCA), for normalizing the DepMap to improve functional networks extracted from these data. We propose a novel “onion” normalization technique to combine several normalized data layers into a single network. Benchmarking analyses reveal that robust PCA combined with onion normalization outperforms existing methods for normalizing the DepMap. Our work demonstrates the value of removing low‐dimensional signals from the DepMap before constructing functional gene networks and provides generalizable dimensionality reduction‐based normalization tools.
  5. Clustered regularly interspaced short palindromic repeats (CRISPR)-associated nuclease (Cas) technologies facilitate routine genome engineering of one or a few genes at a time. However, large-scale CRISPR screens with guide RNA libraries remain challenging in plants. Here, we have developed a comprehensive all-in-one CRISPR toolbox for Cas9-based genome editing, cytosine base editing, adenine base editing (ABE), Cas12a-based genome editing and ABE, and CRISPR-Act3.0-based gene activation in both monocot and dicot plants. We evaluated all-in-one T-DNA expression vectors in rice (Oryza sativa, monocot) and tomato (Solanum lycopersicum, dicot) protoplasts, demonstrating their broad and reliable applicability. To showcase the applications of these vectors in CRISPR screens, we constructed guide RNA (gRNA) pools for testing in rice protoplasts, establishing a high-throughput approach to select high-activity gRNAs. Additionally, we demonstrated the efficacy of sgRNA library screening for targeted mutagenesis of ACETOLACTATE SYNTHASE in rice, recovering novel candidate alleles for herbicide resistance. Furthermore, we carried out a CRISPR activation screen in Arabidopsis thaliana, rapidly identifying potent gRNAs for FLOWERING LOCUS T activation that confer an early-flowering phenotype. This toolbox contains 61 versatile all-in-one vectors encompassing nearly all commonly used CRISPR technologies. It will facilitate large-scale genetic screens for loss-of-function or gain-of-function studies, presenting numerous promising applications in plants.