Abstract Robust control over gene translation at arbitrary mRNA targets is an outstanding challenge in microbial synthetic biology. The development of tools that can regulate translation will greatly expand our ability to precisely control genes across the genome. In Escherichia coli, most genes are contained in multi-gene operons, which are subject to polar effects where targeting one gene for repression leads to silencing of other genes in the same operon. These effects pose a challenge for independently regulating individual genes in multi-gene operons. Here, we use CRISPR-dCas13 to address this challenge. We find dCas13-mediated repression exhibits up to 6-fold lower polar effects compared to dCas9. We then show that we can selectively activate single genes in a synthetic multi-gene operon by coupling dCas9 transcriptional activation of an operon with dCas13 translational repression of individual genes within the operon. We also show that dCas13 and dCas9 can be multiplexed for improved biosynthesis of a medically relevant human milk oligosaccharide. Taken together, our findings suggest that combining transcriptional and translational control can access effects that are difficult to achieve with either mode independently. These combined tools for gene regulation will expand our abilities to precisely engineer bacteria for biotechnology and perform systematic genetic screens.
A pan-CRISPR analysis of mammalian cell specificity identifies ultra-compact sgRNA subsets for genome-scale experiments
Abstract A genetic knockout can be lethal to one human cell type while increasing growth rate in another. This context specificity confounds genetic analysis and prevents reproducible genome engineering. Genome-wide CRISPR compendia across most common human cell lines offer the largest opportunity to understand the biology of cell specificity. The prevailing viewpoint, synthetic lethality, occurs when a genetic alteration creates a unique CRISPR dependency. Here, we use machine learning for an unbiased investigation of cell type specificity. Quantifying model accuracy, we find that most cell type specific phenotypes are predicted by the function of related genes of wild-type sequence, not synthetic lethal relationships. These models then identify unexpected sets of 100-300 genes where reduced CRISPR measurements can produce genome-scale loss-of-function predictions across >18,000 genes. Thus, it is possible to reduce in vitro CRISPR libraries by orders of magnitude—with some information loss—when we remove redundant genes and not redundant sgRNAs.
- Award ID(s):
- 1759860
- PAR ID:
- 10362431
- Publisher / Repository:
- Nature Publishing Group
- Date Published:
- Journal Name:
- Nature Communications
- Volume:
- 13
- Issue:
- 1
- ISSN:
- 2041-1723
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract Cis-regulatory elements (CREs) control gene expression, orchestrating tissue identity, developmental timing and stimulus responses, which collectively define the thousands of unique cell types in the body [1–3]. While there is great potential for strategically incorporating CREs in therapeutic or biotechnology applications that require tissue specificity, there is no guarantee that an optimal CRE for these intended purposes has arisen naturally. Here we present a platform to engineer and validate synthetic CREs capable of driving gene expression with programmed cell-type specificity. We take advantage of innovations in deep neural network modelling of CRE activity across three cell types, efficient in silico optimization and massively parallel reporter assays to design and empirically test thousands of CREs [4–8]. Through large-scale in vitro validation, we show that synthetic sequences are more effective at driving cell-type-specific expression in three cell lines compared with natural sequences from the human genome and achieve specificity in analogous tissues when tested in vivo. Synthetic sequences exhibit distinct motif vocabulary associated with activity in the on-target cell type and a simultaneous reduction in the activity of off-target cells. Together, we provide a generalizable framework to prospectively engineer CREs from massively parallel reporter assay models and demonstrate the required literacy to write fit-for-purpose regulatory code.
-
ABSTRACT Phage-plasmids are unique mobile genetic elements that function as plasmids and temperate phages. While it has been observed that such elements often encode antibiotic resistance genes and defense system genes, little else is known about other functional traits they encode. Further, no study to date has documented their environmental distribution and prevalence. Here, we performed genome sequence mining of public databases of phages and plasmids utilizing a random forest classifier to identify phage-plasmids. We recovered 5,742 unique phage-plasmid genomes from a remarkable array of disparate environments, including human, animal, plant, fungi, soil, sediment, freshwater, wastewater, and saltwater environments. The resulting genomes were used in a comparative sequence analysis, revealing functional traits/accessory genes associated with specific environments. Host-associated elements contained the most defense systems (including CRISPR and anti-CRISPR systems) as well as antibiotic resistance genes, while other environments, such as freshwater and saltwater systems, tended to encode components of various biosynthetic pathways. Interestingly, we identified genes encoding certain functional traits, including anti-CRISPR systems and specific antibiotic resistance genes, that were enriched in phage-plasmids relative to both plasmids and phages. Our results highlight that phage-plasmids are found across a wide array of environments and likely play a role in shaping microbial ecology in a multitude of niches.
IMPORTANCE Phage-plasmids are a novel, hybrid class of mobile genetic element which retain aspects of both phages and plasmids. However, whether phage-plasmids represent merely a rarity or are instead important players in horizontal gene transfer and other important ecological processes has remained a mystery. Here, we document that these hybrids are encountered across a broad range of distinct environments and encode niche-specific functional traits, including the carriage of antibiotic biosynthesis genes and both CRISPR and anti-CRISPR defense systems. These findings highlight phage-plasmids as an important class of mobile genetic element with diverse roles in multiple distinct ecological niches.
-
Abstract Clustered regularly interspaced short palindromic repeats (CRISPR) screening coupled with single-cell RNA sequencing has emerged as a powerful tool to characterize the effects of genetic perturbations on the whole transcriptome at a single-cell level. However, due to its sparsity and complex structure, analysis of single-cell CRISPR screening data is challenging. In particular, standard differential expression analysis methods are often underpowered to detect genes affected by CRISPR perturbations. We developed a statistical method for such data, called guided sparse factor analysis (GSFA). GSFA infers latent factors that represent coregulated genes or gene modules; by borrowing information from these factors, it infers the effects of genetic perturbations on individual genes. We demonstrated through extensive simulation studies that GSFA detects perturbation effects with much higher power than state-of-the-art methods. Using single-cell CRISPR data from human CD8+ T cells and neural progenitor cells, we showed that GSFA identified biologically relevant gene modules and specific genes affected by CRISPR perturbations, many of which were missed by existing methods, providing new insights into the functions of genes involved in T cell activation and neurodevelopment.
-
INTRODUCTION Genome-wide association studies (GWASs) have identified thousands of human genetic variants associated with diverse diseases and traits, and most of these variants map to noncoding loci with unknown target genes and function. Current approaches to understand which GWAS loci harbor causal variants and to map these noncoding regulators to target genes suffer from low throughput. With newer multiancestry GWASs from individuals of diverse ancestries, there is a pressing and growing need to scale experimental assays to connect GWAS variants with molecular mechanisms. Here, we combined biobank-scale GWASs, massively parallel CRISPR screens, and single-cell sequencing to discover target genes of noncoding variants for blood trait loci with systematic targeting and inhibition of noncoding GWAS loci with single-cell sequencing (STING-seq).
RATIONALE Blood traits are highly polygenic, and GWASs have identified thousands of noncoding loci that map to candidate cis-regulatory elements (CREs). By combining CRE-silencing CRISPR perturbations and single-cell readouts, we targeted hundreds of GWAS loci in a single assay, revealing target genes in cis and in trans. For select CREs that regulate target genes, we performed direct variant insertion. Although silencing the CRE can identify the target gene, direct variant insertion can identify magnitude and direction of effect on gene expression for the GWAS variant. In select cases in which the target gene was a transcription factor or microRNA, we also investigated the gene-regulatory networks altered upon CRE perturbation and how these networks differ across blood cell types.
RESULTS We inhibited candidate CREs from fine-mapped blood trait GWAS variants (from ~750,000 individuals of diverse ancestries) in human erythroid progenitors. In total, we targeted 543 variants (254 loci) mapping to candidate CREs, generating multimodal single-cell data including transcriptome, direct CRISPR gRNA capture, and cell surface proteins. We identified target genes in cis (within 500 kb) for 134 CREs. In most cases, we found that the target gene was the closest gene and that specific enhancer-associated biochemical hallmarks (H3K27ac and accessible chromatin) are essential for CRE function. Using multiple perturbations at the same locus, we were able to distinguish causal variants from noncausal variants in linkage disequilibrium. For a subset of validated CREs, we also inserted specific GWAS variants using base-editing STING-seq (beeSTING-seq) and quantified the effect size and direction of GWAS variants on gene expression. Given our transcriptome-wide data, we examined dosage effects in cis and trans in cases in which the cis target is a transcription factor or microRNA. We found that trans target genes are also enriched for GWAS loci, and identified gene clusters within trans gene networks with distinct biological functions and expression patterns in primary human blood cells.
CONCLUSION In this work, we investigated noncoding GWAS variants at scale, identifying target genes in single cells. These methods can help to address the variant-to-function challenges that are a barrier for translation of GWAS findings (e.g., drug targets for diseases with a genetic basis) and greatly expand our ability to understand mechanisms underlying GWAS loci.
Identifying causal variants and their target genes with STING-seq. Uncovering causal variants and their target genes or function is a major challenge for GWASs. STING-seq combines perturbation of noncoding loci with multimodal single-cell sequencing to profile hundreds of GWAS loci in parallel. This approach can identify target genes in cis and trans, measure dosage effects, and decipher gene-regulatory networks.