Title: Exponential family measurement error models for single-cell CRISPR screens
Summary: CRISPR genome engineering and single-cell RNA sequencing have accelerated biological discovery. Single-cell CRISPR screens unite these two technologies, linking genetic perturbations in individual cells to changes in gene expression and illuminating regulatory networks underlying diseases. Despite their promise, single-cell CRISPR screens present considerable statistical challenges. We demonstrate through theoretical and real data analyses that a standard method for estimation and inference in single-cell CRISPR screens, “thresholded regression”, exhibits attenuation bias and a bias-variance tradeoff as a function of an intrinsic, challenging-to-select tuning parameter. To overcome these difficulties, we introduce GLM-EIV (“GLM-based errors-in-variables”), a new method for single-cell CRISPR screen analysis. GLM-EIV extends the classical errors-in-variables model to responses and noisy predictors that are exponential family-distributed and potentially impacted by the same set of confounding variables. We develop a computational infrastructure to deploy GLM-EIV across hundreds of processors on clouds (e.g., Microsoft Azure) and high-performance clusters. Leveraging this infrastructure, we apply GLM-EIV to analyze two recent, large-scale, single-cell CRISPR screen datasets, yielding several new insights.
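The attenuation bias described in the summary can be illustrated with a small simulation. The sketch below is not the authors' code or the GLM-EIV method: it assumes a toy setup in which the true perturbation status is latent, the gRNA count is Poisson with a higher mean in perturbed cells, and the expression response is Gaussian (chosen for simplicity; the paper treats exponential-family responses). The function name `thresholded_estimate` and all parameter values are hypothetical.

```python
import numpy as np


def thresholded_estimate(threshold, n_cells=200_000, true_effect=-1.0, seed=0):
    """Difference in mean expression between cells called perturbed and unperturbed.

    Toy model (all values are illustrative assumptions, not from the paper):
    10% of cells are truly perturbed, gRNA counts are Poisson with mean 5 in
    perturbed cells and mean 1 (background) otherwise, and expression is
    Gaussian with a shift of `true_effect` in perturbed cells.
    """
    rng = np.random.default_rng(seed)
    perturbed = rng.random(n_cells) < 0.1                    # latent true status
    grna_count = rng.poisson(np.where(perturbed, 5.0, 1.0))  # noisy proxy
    expression = true_effect * perturbed + rng.normal(size=n_cells)

    # Thresholding step: call a cell perturbed if its gRNA count reaches the threshold.
    called = grna_count >= threshold
    return expression[called].mean() - expression[~called].mean()


if __name__ == "__main__":
    for c in (1, 3, 5):
        print(f"threshold={c}: estimate={thresholded_estimate(c):.3f} (true effect -1.0)")
```

In this toy setup, misclassified cells pull the estimate toward zero at every threshold. Raising the threshold reduces that bias but leaves fewer cells called as perturbed, which is one way to see the bias-variance tradeoff the summary mentions.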
Award ID(s):
2113072
PAR ID:
10536122
Author(s) / Creator(s):
; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Biostatistics
ISSN:
1465-4644
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The ENCODE Consortium’s efforts to annotate noncoding cis-regulatory elements (CREs) have advanced our understanding of gene regulatory landscapes. Pooled, noncoding CRISPR screens offer a systematic approach to investigate cis-regulatory mechanisms. The ENCODE4 Functional Characterization Centers conducted 108 screens in human cell lines, comprising >540,000 perturbations across 24.85 megabases of the genome. Using 332 functionally confirmed CRE–gene links in K562 cells, we established guidelines for screening endogenous noncoding elements with CRISPR interference (CRISPRi), including accurate detection of CREs that exhibit variable, often low, transcriptional effects. Benchmarking five screen analysis tools, we find that CASA produces the most conservative CRE calls and is robust to artifacts of low-specificity single guide RNAs. We uncover a subtle DNA strand bias for CRISPRi in transcribed regions with implications for screen design and analysis. Together, we provide an accessible data resource, predesigned single guide RNAs for targeting 3,275,697 ENCODE SCREEN candidate CREs with CRISPRi and screening guidelines to accelerate functional characterization of the noncoding genome. 
  2. CRISPR‐Cas9 screens facilitate the discovery of gene functional relationships and phenotype‐specific dependencies. The Cancer Dependency Map (DepMap) is the largest compendium of whole‐genome CRISPR screens aimed at identifying cancer‐specific genetic dependencies across human cell lines. A mitochondria‐associated bias has been previously reported to mask signals for genes involved in other functions, and thus, methods for normalizing this dominant signal to improve co‐essentiality networks are of interest. In this study, we explore three unsupervised dimensionality reduction methods (autoencoders, robust principal component analysis (PCA), and classical PCA) for normalizing the DepMap to improve functional networks extracted from these data. We propose a novel “onion” normalization technique to combine several normalized data layers into a single network. Benchmarking analyses reveal that robust PCA combined with onion normalization outperforms existing methods for normalizing the DepMap. Our work demonstrates the value of removing low‐dimensional signals from the DepMap before constructing functional gene networks and provides generalizable dimensionality reduction‐based normalization tools. 
  3. CRISPR screens are used extensively to systematically interrogate the phenotype-to-genotype problem. In contrast to early CRISPR screens, which defined core cell fitness genes, most current efforts now aim to identify context-specific phenotypes that differentiate a cell line, genetic background or condition of interest, such as a drug treatment. While CRISPR-related technologies have shown great promise and a fast pace of innovation, a better understanding of standards and methods for quality assessment of CRISPR screen results is crucial to guide technology development and application. Specifically, many commonly used metrics for quantifying screen quality do not accurately measure the reproducibility of context-specific hits. We highlight the importance of reporting reproducibility statistics that directly relate to the purpose of the screen and suggest the use of metrics that are sensitive to context-specific signal. 
  4. Immune dysfunction in cancer is enacted by multiple programs, including tumor cell-intrinsic responses to distinct immune subpopulations. A subset of these immune evasion programs can be systematically recapitulated through direct tumor-immune interactions in vitro. Here, we present an integrated, high-throughput single-cell CRISPR screening framework focused on the protein kinome for mapping the tumor-intrinsic regulation of T cell-driven immune pressure in glioblastoma (GBM). We combine pooled CRISPR interference and activation (CRISPRi/a) with immune-matched NY-ESO-1 antigen-specific allogeneic GBM-T cell co-culture and massively multiplexed single-cell transcriptomics to systematically quantify how genetic perturbation reshapes baseline tumor state and adaptive responses across graded effector-to-target ratios. We further leverage deep generative models for analyzing pooled CRISPR screens to decipher the effects of genetic perturbations on the mechanisms of tumor resistance. This framework resolves distinct modules of immune evasion and survival, including the regulation of the antigen-presentation machinery, interferon/NF-κB signaling, oxidative stress resilience, and checkpoint/cytokine programs, while identifying perturbations that reroute the continuous tumor transcriptional trajectory induced by T cell engagement. A secondary chemical screen in patient-derived GBM cultures identified putative kinase targets of immune evasion phenotypes (e.g., EPHA2 and PDGFRA), whose inhibition leads to the blockade of evasive programs and enhances T cell-mediated GBM killing. Together, this workflow provides a scalable blueprint for comprehensive charting of the genetic control of tumor-immune interactions. 
  5. Clustered regularly interspaced short palindromic repeats (CRISPR)-associated nuclease (Cas) technologies facilitate routine genome engineering of one or a few genes at a time. However, large-scale CRISPR screens with guide RNA libraries remain challenging in plants. Here, we have developed a comprehensive all-in-one CRISPR toolbox for Cas9-based genome editing, cytosine base editing, adenine base editing (ABE), Cas12a-based genome editing and ABE, and CRISPR-Act3.0-based gene activation in both monocot and dicot plants. We evaluated all-in-one T-DNA expression vectors in rice (Oryza sativa, monocot) and tomato (Solanum lycopersicum, dicot) protoplasts, demonstrating their broad and reliable applicability. To showcase the applications of these vectors in CRISPR screens, we constructed guide RNA (gRNA) pools for testing in rice protoplasts, establishing a high-throughput approach to select high-activity gRNAs. Additionally, we demonstrated the efficacy of sgRNA library screening for targeted mutagenesis of ACETOLACTATE SYNTHASE in rice, recovering novel candidate alleles for herbicide resistance. Furthermore, we carried out a CRISPR activation screen in Arabidopsis thaliana, rapidly identifying potent gRNAs for FLOWERING LOCUS T activation that confer an early-flowering phenotype. This toolbox contains 61 versatile all-in-one vectors encompassing nearly all commonly used CRISPR technologies. It will facilitate large-scale genetic screens for loss-of-function or gain-of-function studies, presenting numerous promising applications in plants. 