Title: In situ genome sequencing resolves DNA sequence and structure in intact biological samples
Understanding genome organization requires integration of DNA sequence and three-dimensional spatial context; however, existing genome-wide methods lack either base pair sequence resolution or direct spatial localization. Here, we describe in situ genome sequencing (IGS), a method for simultaneously sequencing and imaging genomes within intact biological samples. We applied IGS to human fibroblasts and early mouse embryos, spatially localizing thousands of genomic loci in individual nuclei. Using these data, we characterized parent-specific changes in genome structure across embryonic stages, revealed single-cell chromatin domains in zygotes, and uncovered epigenetic memory of global chromosome positioning within individual embryos. These results demonstrate how IGS can directly connect sequence and structure across length scales from single base pairs to whole organisms.
Award ID(s):
1764269
PAR ID:
10328932
Journal Name:
Science
Volume:
371
Issue:
6532
ISSN:
0036-8075
Sponsoring Org:
National Science Foundation
More Like this
  1. Characterizing genome-wide binding profiles of transcription factors (TFs) is essential for understanding biological processes. Although techniques have been developed to assess binding profiles within a population of cells, determining them at a single-cell level remains elusive. Here, we report scFAN (single-cell factor analysis network), a deep learning model that predicts genome-wide TF binding profiles in individual cells. scFAN is pretrained on genome-wide bulk assay for transposase-accessible chromatin sequencing (ATAC-seq), DNA sequence, and chromatin immunoprecipitation sequencing (ChIP-seq) data and uses single-cell ATAC-seq to predict TF binding in individual cells. We demonstrate the efficacy of scFAN by both studying sequence motifs enriched within predicted binding peaks and using predicted TFs for discovering cell types. We develop a new metric “TF activity score” to characterize each cell and show that activity scores can reliably capture cell identities. scFAN allows us to discover and study cellular identities and heterogeneity based on chromatin accessibility profiles.
  2. Non‐random mating among individuals can lead to spatial clustering of genetically similar individuals and population stratification. This deviation from panmixia is commonly observed in natural populations. Consequently, individuals can have parentage in single populations or involving hybridization between differentiated populations. Accounting for this mixture and structure is important when mapping the genetics of traits and learning about the formative evolutionary processes that shape genetic variation among individuals and populations. Stratified genetic relatedness among individuals is commonly quantified using estimates of ancestry that are derived from a statistical model. Development of these models for polyploid and mixed‐ploidy individuals and populations has lagged behind those for diploids. Here, we extend and test a hierarchical Bayesian model, called entropy, which can use low‐depth sequence data to estimate genotype and ancestry parameters in autopolyploid and mixed‐ploidy individuals (including sex chromosomes and autosomes within individuals). Our analysis of simulated data illustrated the trade‐off between sequencing depth and genome coverage and found lower error associated with low‐depth sequencing across a larger fraction of the genome than with high‐depth sequencing across a smaller fraction of the genome. The model has high accuracy and sensitivity as verified with simulated data and through analysis of admixture among populations of diploid and tetraploid Arabidopsis arenosa.
  3. Nanopores are increasingly powerful tools for single molecule sensing, in particular, for sequencing DNA, RNA and peptides. This success has spurred efforts to sequence non-canonical nucleic acid bases and amino acids. While canonical DNA and RNA bases have pKas far from neutral, certain non-canonical bases, natural RNA modifications, and amino acids are known to have pKas near neutral pHs at which nanopore sequencing is typically performed. Previous reports have suggested that the nanopore signal may be sensitive to the protonation state of an individual moiety. We sequenced ion currents with the MspA nanopore using a single stranded DNA containing a single non-canonical DNA base (Z) at various pH conditions. The Z-base has a near-neutral pKa ∼ 7.8. We find that the measured ion current is remarkably sensitive to the protonation state of the Z-base. We demonstrate how nanopores can be used to localize and determine the pKa of individual moieties along a polymer. More broadly, these experiments provide a path to mapping different protonation sites along polymers and give insight into how to optimize sequencing of polymers that contain moieties with near-neutral pKas.
  4. Pouched lamprey (Geotria australis) or kanakana/piharau is a culturally and ecologically significant jawless fish that is distributed throughout Aotearoa New Zealand. Despite its importance, much remains unknown about historical relationships and gene flow between populations of this enigmatic species within New Zealand. To help inform management, we assembled a draft Geotria australis genome and completed the first comprehensive population genomics analysis of pouched lamprey within New Zealand using targeted gene sequencing (Cyt-b and COI) and restriction site-associated DNA sequencing (RADSeq) methods. Employing 16,000 genome-wide single nucleotide polymorphisms (SNPs) derived from RADSeq (n=186) and sequence data from Cyt-b (766 bp, n=94) and COI (589 bp, n=20), we reveal low levels of structure across 10 sampling locations spanning the species range within New Zealand. F-statistics, outlier analyses, and STRUCTURE suggest a single panmictic population, and Mantel and EEMS tests reveal no significant isolation by distance. This implies either ongoing gene flow among populations or recent shared ancestry among New Zealand pouched lamprey. We can now use the information gained from these genetic tools to assist managers with monitoring effective population size, managing potential diseases, and conservation measures such as artificial propagation programs. We further demonstrate the general utility of these genetic tools for acquiring information about elusive species.
  5. Insights from conservation genomics have dramatically improved recovery plans for numerous endangered species. However, most taxa have yet to benefit from the full application of genomic technologies. The mountain yellow-legged frog species complex, Rana muscosa and Rana sierrae, inhabits the Sierra Nevada mountains and Transverse/Peninsular Ranges of California and Nevada. Both species have declined precipitously throughout their historical distributions. Conservation management plans outline extensive ongoing recovery efforts but are still based on the genetic structure determined primarily using a single mitochondrial sequence. Our study used two different sequencing strategies – amplicon sequencing and exome capture – to refine our understanding of the population genetics of these imperiled amphibians. We used buccal swabs, museum tissue samples, and archived skin swabs to genotype frog populations across their range. Using the amplicon sequencing and exome capture datasets separately and combined, we document five major genetic clusters. Notably, we found evidence supporting previous species boundaries within Kings Canyon National Park with some exceptions at individual sites. Though we see evidence of genetic clustering, especially in the R. muscosa clade, we also found evidence of some admixture across cluster boundaries in the R. sierrae clade, suggesting a stepping-stone model of population structure. We also find that the southern R. muscosa cluster had large runs of homozygosity and the lowest overall heterozygosity of any of the clusters, consistent with previous reports of marked declines in this area. Overall, our results clarify management unit designations across the range of an endangered species and highlight the importance of sampling the entire range of a species, even when collecting genome-scale data. 