
Title: Achieving single nucleotide sensitivity in direct hybridization genome imaging

Direct visualization of point mutations in situ can be informative for studying genetic diseases and nuclear biology. We describe a direct hybridization genome imaging method with single-nucleotide sensitivity, single guide genome oligopaint via local denaturation fluorescence in situ hybridization (sgGOLDFISH), which leverages the high cleavage specificity of the eSpCas9(1.1) variant combined with a rationally designed guide RNA to load a superhelicase and reveal probe binding sites through local denaturation. The guide RNA carries an intentionally introduced mismatch so that, while the wild-type target DNA sequence can be efficiently cleaved, a mutant sequence with an additional mismatch (e.g., caused by a point mutation) cannot be cleaved. Because sgGOLDFISH relies on genomic DNA being cleaved by Cas9 to reveal probe binding sites, the probes label only the wild-type sequence and not the mutant sequence. sgGOLDFISH can therefore differentiate wild-type and mutant sequences that differ by only a single base pair. Using sgGOLDFISH, we identify base-editor-modified and unmodified progeroid fibroblasts from a heterogeneous population, validate the identification through progerin immunofluorescence, and demonstrate accurate sub-nuclear localization of point mutations.
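The discrimination logic described in the abstract can be illustrated with a toy model. The sketch below is not the authors' method or software: it assumes, purely for illustration, that cleavage happens whenever the guide has at most one mismatch against the target, and it writes the guide in the DNA alphabet of the target strand rather than as RNA. The sequences, the function names (`substitute`, `count_mismatches`, `is_labeled`), and the `max_tolerated` threshold are all hypothetical; real Cas9 mismatch tolerance depends on position and sequence context.

```python
def substitute(seq: str, index: int, base: str) -> str:
    """Return seq with the base at index replaced; the new base must differ."""
    if seq[index] == base:
        raise ValueError("substitution must change the base")
    return seq[:index] + base + seq[index + 1:]


def count_mismatches(guide: str, target: str) -> int:
    """Count positions where guide and target differ (equal lengths required)."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have equal length")
    return sum(g != t for g, t in zip(guide, target))


def is_labeled(guide: str, target: str, max_tolerated: int = 1) -> bool:
    """Toy rule (assumption): the target is cleaved, and therefore labeled by
    probes, only if the guide has at most max_tolerated mismatches against it."""
    return count_mismatches(guide, target) <= max_tolerated


# Hypothetical 20-nt target; not a sequence from the paper.
wild_type = "GATTACAGATTACAGATTAC"
# Guide carries one intentionally introduced mismatch against the wild type.
guide = substitute(wild_type, 5, "G")
# Mutant differs from the wild type by a single point mutation elsewhere.
mutant = substitute(wild_type, 14, "T")

print(is_labeled(guide, wild_type))  # one mismatch in total
print(is_labeled(guide, mutant))     # two mismatches in total
```

Under this simplified rule, the intentional mismatch uses up the single tolerated mismatch, so any additional difference introduced by a point mutation pushes the target past the threshold and it stays unlabeled.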

Journal Name: Nature Communications
Publisher: Nature Publishing Group
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    DNA mismatch repair (MMR), an evolutionarily conserved repair pathway shared by prokaryotic and eukaryotic species alike, influences molecular evolution by detecting and correcting mismatches, thereby protecting genetic fidelity, reducing the mutational load, and preventing lethality. Herein we conduct the first genome-wide evaluation of the alterations to the mutation rate and spectrum under impaired activity of the MutSα homolog, msh-2, in Caenorhabditis elegans male–female fog-2(lf) lines. We performed mutation accumulation (MA) under RNAi-induced knockdown of msh-2 for up to 50 generations, followed by next-generation sequencing of 19 MA lines and the ancestral control. msh-2 impairment in the male–female background substantially increased the frequency of nuclear base substitutions (∼23×) and small indels (∼328×) relative to wildtype hermaphrodites. However, we observed no increase in the mutation rates of mtDNA or in copy-number changes of single-copy genes. There was a marked increase in copy-number variation of rDNA genes under MMR impairment. In C. elegans, msh-2 repairs transitions more efficiently than transversions and increases the AT mutational bias relative to wildtype. The local sequence context, including sequence complexity, G + C-content, and flanking bases, influenced the mutation rate. The X chromosome exhibited lower substitution and higher indel rates than autosomes, which can result either from sex-specific mutation rates or from a nonrandom distribution of mutable sites between chromosomes. Provided the observed difference in mutational pattern is mostly due to MMR impairment, our results indicate that the specificity of MMR varies between taxa, and is more efficient in detecting and repairing small indels in eukaryotes relative to prokaryotes.

  2. Abstract

    CRISPR-Cas12a is an RNA-guided, programmable genome editing enzyme found within bacterial adaptive immune pathways. Unlike CRISPR-Cas9, Cas12a uses only a single catalytic site to both cleave target double-stranded DNA (dsDNA) (cis-activity) and indiscriminately degrade single-stranded DNA (ssDNA) (trans-activity). To investigate how the relative potency of cis- versus trans-DNase activity affects Cas12a-mediated genome editing, we first used structure-guided engineering to generate variants of Lachnospiraceae bacterium Cas12a that selectively disrupt trans-activity. The resulting engineered mutant with the biggest differential between cis- and trans-DNase activity in vitro showed minimal genome editing activity in human cells, motivating a second set of experiments using directed evolution to generate additional mutants with robust genome editing activity. Notably, these engineered and evolved mutants had enhanced ability to induce homology-directed repair (HDR) editing by 2–18-fold compared to wild-type Cas12a when using HDR donors containing mismatches with crRNA at the PAM-distal region. Finally, a site-specific reversion mutation produced improved Cas12a (iCas12a) variants with superior genome editing efficiency at genomic sites that are difficult to edit using wild-type Cas12a. This strategy establishes a pipeline for creating improved genome editing tools by combining structural insights with randomization and selection. The available structures of other CRISPR-Cas enzymes will enable this strategy to be applied to improve the efficacy of other genome-editing proteins.

  3. INTRODUCTION Transposable elements (TEs), repeat expansions, and repeat-mediated structural rearrangements play key roles in chromosome structure and species evolution, contribute to human genetic variation, and substantially influence human health through copy number variants, structural variants, insertions, deletions, and alterations to gene transcription and splicing. Despite their formative role in genome stability, repetitive regions have been relegated to gaps and collapsed regions in human genome reference GRCh38 owing to the technological limitations during its development. The lack of linear sequence in these regions, particularly in centromeres, resulted in the inability to fully explore the repeat content of the human genome in the context of both local and regional chromosomal environments. RATIONALE Long-read sequencing supported the complete, telomere-to-telomere (T2T) assembly of the pseudo-haploid human cell line CHM13. This resource affords a genome-scale assessment of all human repetitive sequences, including TEs and previously unknown repeats and satellites, both within and outside of gaps and collapsed regions. Additionally, a complete genome enables the opportunity to explore the epigenetic and transcriptional profiles of these elements that are fundamental to our understanding of chromosome structure, function, and evolution. Comparative analyses reveal modes of repeat divergence, evolution, and expansion or contraction with locus-level resolution. RESULTS We implemented a comprehensive repeat annotation workflow using previously known human repeats and de novo repeat modeling followed by manual curation, including assessing overlaps with gene annotations, segmental duplications, tandem repeats, and annotated repeats. Using this method, we developed an updated catalog of human repetitive sequences and refined previous repeat annotations.
We discovered 43 previously unknown repeats and repeat variants and characterized 19 complex, composite repetitive structures, which often carry genes, across T2T-CHM13. Using precision nuclear run-on sequencing (PRO-seq) and CpG methylated sites generated from Oxford Nanopore Technologies long-read sequencing data, we assessed RNA polymerase engagement across retroelements genome-wide, revealing correlations between nascent transcription, sequence divergence, CpG density, and methylation. These analyses were extended to evaluate RNA polymerase occupancy for all repeats, including high-density satellite repeats that reside in previously inaccessible centromeric regions of all human chromosomes. Moreover, using both mapping-dependent and mapping-independent approaches across early developmental stages and a complete cell cycle time series, we found that engaged RNA polymerase across satellites is low; in contrast, TE transcription is abundant and serves as a boundary for changes in CpG methylation and centromere substructure. Together, these data reveal the dynamic relationship between transcriptionally active retroelement subclasses and DNA methylation, as well as potential mechanisms for the derivation and evolution of new repeat families and composite elements. Focusing on the emerging T2T-level assembly of the HG002 X chromosome, we reveal that a high level of repeat variation likely exists across the human population, including composite element copy numbers that affect gene copy number. Additionally, we highlight the impact of repeats on the structural diversity of the genome, revealing repeat expansions with extreme copy number differences between humans and primates while also providing high-confidence annotations of retroelement transduction events. 
CONCLUSION The comprehensive repeat annotations and updated repeat models described herein serve as a resource for expanding the compendium of human genome sequences and reveal the impact of specific repeats on the human genome. In developing this resource, we provide a methodological framework for assessing repeat variation within and between human genomes. The exhaustive assessment of the transcriptional landscape of repeats, at both the genome scale and locally, such as within centromeres, sets the stage for functional studies to disentangle the role transcription plays in the mechanisms essential for genome stability and chromosome segregation. Finally, our work demonstrates the need to increase efforts toward achieving T2T-level assemblies for nonhuman primates and other species to fully understand the complexity and impact of repeat-derived genomic innovations that define primate lineages, including humans. Telomere-to-telomere assembly of CHM13 supports repeat annotations and discoveries. The human reference T2T-CHM13 filled gaps and corrected collapsed regions (triangles) in GRCh38. Combining long read–based methylation calls, PRO-seq, and multilevel computational methods, we provide a compendium of human repeats, define retroelement expression and methylation profiles, and delineate locus-specific sites of nascent transcription genome-wide, including previously inaccessible centromeres. SINE, short interspersed element; SVA, SINE–variable number tandem repeat–Alu; LINE, long interspersed element; LTR, long terminal repeat; TSS, transcription start site; pA, xxxxxxxxxxxxxxxx.
  4. Slotte, Tanja (Ed.)
    Abstract Intracellular transfers of mitochondrial DNA continue to shape nuclear genomes. Chromosome 2 of the model plant Arabidopsis thaliana contains one of the largest known nuclear insertions of mitochondrial DNA (numts). Estimated at over 600 kb in size, this numt is larger than the entire Arabidopsis mitochondrial genome. The primary Arabidopsis nuclear reference genome contains less than half of the numt because of its structural complexity and repetitiveness. Recent data sets generated with improved long-read sequencing technologies (PacBio HiFi) provide an opportunity to finally determine the accurate sequence and structure of this numt. We performed a de novo assembly using sequencing data from recent initiatives to span the Arabidopsis centromeres, producing a gap-free sequence of the Chromosome 2 numt, which is 641 kb in length and has 99.933% nucleotide sequence identity with the actual mitochondrial genome. The numt assembly is consistent with the repetitive structure previously predicted from fiber-based fluorescent in situ hybridization. Nanopore sequencing data indicate that the numt has high levels of cytosine methylation, helping to explain its biased spectrum of nucleotide sequence divergence and supporting previous inferences that it is transcriptionally inactive. The original numt insertion appears to have involved multiple mitochondrial DNA copies with alternative structures that subsequently underwent an additional duplication event within the nuclear genome. This work provides insights into numt evolution, addresses one of the last unresolved regions of the Arabidopsis reference genome, and represents a resource for distinguishing between highly similar numt and mitochondrial sequences in studies of transcription, epigenetic modifications, and de novo mutations.
  5. Abstract BACKGROUND

    Despite widespread interest in next-generation sequencing (NGS), the adoption of personalized clinical genomics and mutation profiling of cancer specimens is lagging, in part because of technical limitations. Tumors are genetically heterogeneous and often contain normal/stromal cells, features that lead to low-abundance somatic mutations that generate ambiguous results or reside below NGS detection limits, thus hindering the clinical sensitivity/specificity standards of mutation calling. We applied COLD-PCR (coamplification at lower denaturation temperature PCR), a PCR methodology that selectively enriches variants, to improve the detection of unknown mutations before NGS-based amplicon resequencing.


    We used both COLD-PCR and conventional PCR (for comparison) to amplify serially diluted mutation-containing cell-line DNA diluted into wild-type DNA, as well as DNA from lung adenocarcinoma and colorectal cancer samples. After amplification of TP53 (tumor protein p53), KRAS (v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog), IDH1 [isocitrate dehydrogenase 1 (NADP+), soluble], and EGFR (epidermal growth factor receptor) gene regions, PCR products were pooled for library preparation, bar-coded, and sequenced on the Illumina HiSeq 2000.


    In agreement with recent findings, sequencing errors by conventional targeted-amplicon approaches dictated a mutation-detection limit of approximately 1%–2%. Conversely, COLD-PCR amplicons enriched mutations above the error-related noise, enabling reliable identification of mutation abundances of approximately 0.04%. Sequencing depth was not a large factor in the identification of COLD-PCR–enriched mutations. For the clinical samples, several missense mutations were not called with conventional amplicons, yet they were clearly detectable with COLD-PCR amplicons. Tumor heterogeneity for the TP53 gene was apparent.


    As cancer care shifts toward personalized intervention based on each patient's unique genetic abnormalities and tumor genome, we anticipate that COLD-PCR combined with NGS will elucidate the role of mutations in tumor progression, enabling NGS-based analysis of diverse clinical specimens within clinical practice.
