Title: Genome-wide functional screens enable the prediction of high activity CRISPR-Cas9 and -Cas12a guides in Yarrowia lipolytica
Abstract

Genome-wide functional genetic screens have been successful in discovering genotype-phenotype relationships and in engineering new phenotypes. While broadly applied in mammalian cell lines and in E. coli, use in non-conventional microorganisms has been limited, in part, due to the inability to accurately design high activity CRISPR guides in such species. Here, we develop an experimental-computational approach to sgRNA design that is specific to an organism of choice, in this case the oleaginous yeast Yarrowia lipolytica. A negative selection screen in the absence of non-homologous end-joining, the dominant DNA repair mechanism, was used to generate single guide RNA (sgRNA) activity profiles for both SpCas9 and LbCas12a. This genome-wide data served as input to a deep learning algorithm, DeepGuide, that is able to accurately predict guide activity. DeepGuide uses unsupervised learning to obtain a compressed representation of the genome, followed by supervised learning to map sgRNA sequence, genomic context, and epigenetic features to guide activity. Experimental validation, both genome-wide and with a subset of selected genes, confirms DeepGuide’s ability to accurately predict high activity sgRNAs. DeepGuide provides an organism-specific predictor of CRISPR guide activity that, with retraining, could be applied to other fungal species, prokaryotes, and other non-conventional organisms.

 
NSF-PAR ID:
10363050
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Nature Communications
Volume:
13
Issue:
1
ISSN:
2041-1723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    High throughput CRISPR screens are revolutionizing the way scientists unravel the genetic underpinnings of engineered and evolved phenotypes. One of the critical challenges in accurately assessing screening outcomes is accounting for the variability in sgRNA cutting efficiency. Poorly active guides targeting genes essential to screening conditions obscure the growth defects that are expected from disrupting them. Here, we develop acCRISPR, an end-to-end pipeline that identifies essential genes in pooled CRISPR screens using sgRNA read counts obtained from next-generation sequencing. acCRISPR uses experimentally determined cutting efficiencies for each guide in the library to provide an activity correction to the screening outcomes via calculation of an optimization metric, thus determining the fitness effect of disrupted genes. CRISPR-Cas9 and -Cas12a screens were carried out in the non-conventional oleaginous yeast Yarrowia lipolytica and acCRISPR was used to determine a high-confidence set of essential genes for growth under glucose, a common carbon source used for the industrial production of oleochemicals. acCRISPR was also used in screens quantifying relative cellular fitness under high salt conditions to identify genes that were related to salt tolerance. Collectively, this work presents an experimental-computational framework for CRISPR-based functional genomics studies that may be expanded to other non-conventional organisms of interest.

     
  2. High throughput CRISPR screens are revolutionizing the way scientists unravel the genetic underpinnings of novel and evolved phenotypes. One of the critical challenges in accurately assessing screening outcomes is accounting for the variability in sgRNA cutting efficiency. Poorly active guides targeting genes essential to screening conditions obscure the growth defects that are expected from disrupting them. Here, we develop acCRISPR, an end-to-end pipeline that identifies essential genes in pooled CRISPR screens using sgRNA read counts obtained from next-generation sequencing. acCRISPR uses experimentally determined cutting efficiencies for each guide in the library to provide an activity correction to the screening outcomes, thus determining the fitness effect of disrupted genes. This is accomplished by calculating an optimization metric that quantifies the tradeoff between guide activity and library coverage, which is maximized to accurately classify genes essential to screening conditions. CRISPR-Cas9 and -Cas12a screens were carried out in the non-conventional oleaginous yeast Yarrowia lipolytica to determine a high-confidence set of essential genes for growth under glucose, a common carbon source used for the industrial production of oleochemicals. acCRISPR was also used in gain- and loss-of-function screens under high salt and low pH conditions to identify known and novel genes that were related to stress tolerance. Collectively, this work presents an experimental-computational framework for CRISPR-based functional genomics studies that may be expanded to other non-conventional organisms of interest.
  3. Abstract

    CRISPR‐Cas9‐based technologies have revolutionized experimental manipulation of mammalian genomes. However, limitations regarding the delivery and efficacy of these technologies restrict their application in primary cells. This article describes a protocol for penetrant, reproducible, and fast CRISPR‐Cas9 genome editing in cell cultures derived from primary cells. The protocol employs transient nucleofection of ribonucleoprotein complexes composed of chemically synthesized 2′‐O‐methyl‐3′‐phosphorothioate‐modified single guide RNAs (sgRNAs) and purified Cas9 protein. It can be used both for targeted insertion‐deletion mutation (indel) formation at up to >90% efficiency (via use of a single sgRNA) and for targeted deletion of genomic regions (via combined use of multiple sgRNAs). This article provides examples of the nucleofection buffer and programs that are optimal for patient‐derived glioblastoma (GBM) stem‐like cells (GSCs) and human neural stem/progenitor cells (NSCs), but the protocol can be readily applied to other primary cell cultures by modifying the nucleofection conditions. In summary, this is a relatively simple method that can be used for highly efficient and fast gene knockout, as well as for targeted genomic deletions, even in hyperdiploid cells such as many cancer stem‐like cells. © 2020 Wiley Periodicals LLC

    Basic Protocol: Cas9:sgRNA ribonucleoprotein nucleofection for insertion‐deletion (indel) mutation and genomic deletion generation in primary cell cultures

     
  4. Abstract

    Background: In the CRISPR-Cas9 system, the efficiency of genetic modifications has been found to vary depending on the single guide RNA (sgRNA) used. A variety of sgRNA properties have been found to be predictive of CRISPR cleavage efficiency, including the position-specific sequence composition of sgRNAs, global sgRNA sequence properties, and thermodynamic features. While prevalent existing deep learning-based approaches provide competitive prediction accuracy, a more interpretable model is desirable to help understand how different features may contribute to CRISPR-Cas9 cleavage efficiency.

    Results: We propose a gradient boosting approach, utilizing LightGBM to develop an integrated tool, BoostMEC (Boosting Model for Efficient CRISPR), for the prediction of wild-type CRISPR-Cas9 editing efficiency. We benchmark BoostMEC against 10 popular models on 13 external datasets and show its competitive performance.

    Conclusions: BoostMEC can provide state-of-the-art predictions of CRISPR-Cas9 cleavage efficiency for sgRNA design and selection. Relying on direct and derived sequence features of sgRNA sequences and based on conventional machine learning, BoostMEC maintains an advantage over other state-of-the-art CRISPR efficiency prediction models that are based on deep learning through its ability to produce more interpretable feature insights and predictions.
  5. Summary

    Cytosine base editors (CBEs) are great additions to the expanding genome editing toolbox. To improve C‐to‐T base editing in plants, we first compared seven cytidine deaminases in the BE3‐like configuration in rice. We found A3A/Y130F‐CBE_V01 resulted in the highest C‐to‐T base editing efficiency in both rice and Arabidopsis. Furthermore, we demonstrated this A3A/Y130F cytidine deaminase could be used to improve iSpyMacCas9‐mediated C‐to‐T base editing at A‐rich PAMs. To showcase its applications, we first applied A3A/Y130F‐CBE_V01 for multiplexed editing to generate microRNA‐resistant mRNA transcripts as well as pre‐mature stop codons in multiple seed trait genes. In addition, we harnessed A3A/Y130F‐CBE_V01 for efficient artificial evolution of novel ALS and EPSPS alleles which conferred herbicide resistance in rice. To further improve C‐to‐T base editing, multiple CBE_V02, CBE_V03 and CBE_V04 systems were developed and tested in rice protoplasts. The CBE_V04 systems were found to have improved editing activity and purity with focal recruitment of more uracil DNA glycosylase inhibitors (UGIs) by the engineered single guide RNA 2.0 scaffold. Finally, we used whole‐genome sequencing (WGS) to compare six CBE_V01 systems and four CBE_V04 systems for genome‐wide off‐target effects in rice. Different levels of cytidine deaminase‐dependent and sgRNA‐independent off‐target effects were indeed revealed by WGS among edited lines by these CBE systems. We also investigated genome‐wide sgRNA‐dependent off‐target effects by different CBEs in rice. This comprehensive study compared 21 different CBE systems, and benchmarked PmCDA1‐CBE_V04 and A3A/Y130F‐CBE_V04 as next‐generation plant CBEs with high editing efficiency, purity, and specificity.

     