Title: Cost-effective solutions for high-throughput enzymatic DNA methylation sequencing
ABSTRACT Characterizing DNA methylation patterns is important for addressing key questions in evolutionary biology, geroscience, and medical genomics. While costs are decreasing, whole-genome DNA methylation profiling remains prohibitively expensive for most population-scale studies, creating a need for cost-effective, reduced representation approaches (i.e., assays that rely on microarrays, enzyme digests, or sequence capture to target a subset of the genome). Most common whole genome and reduced representation techniques rely on bisulfite conversion, which can damage DNA, resulting in DNA loss and sequencing biases. Enzymatic methyl sequencing (EM-seq) was recently proposed to overcome these issues, but thorough benchmarking of EM-seq combined with cost-effective, reduced representation strategies has not yet been performed. To address this gap, we optimized the Targeted Methylation Sequencing protocol (TMS), which profiles ∼4 million CpG sites, for miniaturization, flexibility, and multispecies use at a cost of ∼$80. First, we tested modifications to increase throughput and reduce cost, including increasing multiplexing, decreasing DNA input, and using enzymatic rather than mechanical fragmentation to prepare DNA. Second, we compared our optimized TMS protocol to commonly used techniques, specifically the Infinium MethylationEPIC BeadChip (n = 55 paired samples) and whole genome bisulfite sequencing (n = 6 paired samples). In both cases, we found strong agreement between technologies (R² = 0.97 and 0.99, respectively). Third, we tested the optimized TMS protocol in three non-human primate species (rhesus macaques, geladas, and capuchins). We captured a high percentage (mean = 77.1%) of targeted CpG sites and produced methylation level estimates that agreed with those generated from reduced representation bisulfite sequencing (R² = 0.98). 
Finally, we applied our protocol to profile age-associated DNA methylation variation in two subsistence-level populations (the Tsimane of lowland Bolivia and the Orang Asli of Peninsular Malaysia) and found age-methylation patterns that were strikingly similar to those reported in high-income cohorts, despite known differences in age-health relationships between lifestyle contexts. Altogether, our optimized TMS protocol will enable cost-effective, population-scale studies of genome-wide DNA methylation levels across human and non-human primate species.
Award ID(s):
2235565
PAR ID:
10578937
Publisher / Repository:
bioRxiv
Format(s):
Medium: X
Institution:
bioRxiv
Sponsoring Org:
National Science Foundation
More Like this
  1. Sproul, Duncan (Ed.)
    Characterizing DNA methylation patterns is important for addressing key questions in evolutionary biology, development, geroscience, and medical genomics. While costs are decreasing, whole-genome DNA methylation profiling remains prohibitively expensive for most population-scale studies, creating a need for cost-effective, reduced representation approaches (i.e., assays that rely on microarrays, enzyme digests, or sequence capture to target a subset of the genome). Most common whole genome and reduced representation techniques rely on bisulfite conversion, which can damage DNA, resulting in DNA loss and sequencing biases. Enzymatic methyl sequencing (EM-seq) was recently proposed to overcome these issues, but thorough benchmarking of EM-seq combined with cost-effective, reduced representation strategies is currently lacking. To address this gap, we optimized the Targeted Methylation Sequencing protocol (TMS), which profiles ~4 million CpG sites, for miniaturization, flexibility, and multispecies use. First, we tested modifications to increase throughput and reduce cost, including increasing multiplexing, decreasing DNA input, and using enzymatic rather than mechanical fragmentation to prepare DNA. Second, we compared our optimized TMS protocol to commonly used techniques, specifically the Infinium MethylationEPIC BeadChip (n = 55 paired samples) and whole genome bisulfite sequencing (n = 6 paired samples). In both cases, we found strong agreement between technologies (R² = 0.97 and 0.99, respectively). Third, we tested the optimized TMS protocol in three non-human primate species (rhesus macaques, geladas, and capuchins). We captured a high percentage (mean = 77.1%) of targeted CpG sites and produced methylation level estimates that agreed with those generated from reduced representation bisulfite sequencing (R² = 0.98). 
Finally, we confirmed that estimates of 1) epigenetic age and 2) tissue-specific DNA methylation patterns are strongly recapitulated using data generated from TMS versus other technologies. Altogether, our optimized TMS protocol will enable cost-effective, population-scale studies of genome-wide DNA methylation levels across human and non-human primate species. 
  2. Abstract There is a growing focus on the role of DNA methylation in the ability of marine invertebrates to rapidly respond to changing environmental factors and anthropogenic impacts. However, genome‐wide DNA methylation studies in nonmodel organisms are currently hampered by a limited understanding of methodological biases. Here, we compare three methods for quantifying DNA methylation at single base‐pair resolution (whole genome bisulfite sequencing (WGBS), reduced representation bisulfite sequencing (RRBS), and methyl‐CpG binding domain bisulfite sequencing (MBDBS)) using multiple individuals from two reef‐building coral species with contrasting environmental sensitivity. All methods reveal substantially greater methylation in Montipora capitata (11.4%) than the more sensitive Pocillopora acuta (2.9%). The majority of CpG methylation in both species occurs in gene bodies and flanking regions. In both species, MBDBS has the greatest capacity for detecting CpGs in coding regions at our sequencing depth, but MBDBS may be influenced by intrasample methylation heterogeneity. RRBS yields robust information for specific loci albeit without enrichment of any particular genome feature and with significantly reduced genome coverage. Relative genome size strongly influences the number and location of CpGs detected by each method when sequencing depth is limited, illuminating nuances in cross‐species comparisons. As genome‐wide methylation differences, supported by data across bisulfite sequencing methods, may contribute to environmental sensitivity phenotypes in critical marine invertebrate taxa, these data provide a genomic resource for investigating the functional role of DNA methylation in environmental tolerance. 
  3. Abstract Interrogation of chromatin modifications, such as DNA methylation, has the potential to improve forecasting and conservation of marine ecosystems. The standard method for assaying DNA methylation (whole genome bisulphite sequencing), however, is currently too costly to apply at the scales required for ecological research. Here, we evaluate different methods for measuring DNA methylation for ecological epigenetics. We compare whole genome bisulphite sequencing (WGBS) with methylated CpG binding domain sequencing (MBD‐seq), and a modified version of MethylRAD we term methylation‐dependent restriction site‐associated DNA sequencing (mdRAD). We evaluate these three assays in measuring variation in methylation across the genome, between genotypes, and between polyp types in the reef‐building coral Acropora millepora. We find that all three assays measure absolute methylation levels similarly for gene bodies (gbM), as well as exons and 1 Kb windows, with a minimum Pearson correlation of 0.66. Differential gbM estimates were less correlated, but still concurrent across assays. We conclude that MBD‐seq and mdRAD are reliable and cost‐effective alternatives to WGBS. The considerably lower sequencing effort required for mdRAD to produce comparable methylation estimates makes it particularly useful for ecological epigenetics. 
  4. Abstract DNA methylation plays an important role in many biological processes. The mechanisms underlying the establishment and maintenance of DNA methylation are well understood thanks to decades of research using DNA methylation mutants, primarily in Arabidopsis (Arabidopsis thaliana) accession Col-0. Recent genome-wide association studies (GWASs) using the methylomes of natural accessions have uncovered a complex and distinct genetic basis of variation in DNA methylation at the population level. Sequencing following bisulfite treatment has served as an excellent method for quantifying DNA methylation. Unlike studies focusing on specific accessions with reference genomes, population-scale methylome research often requires an additional round of sequencing beyond obtaining genome assemblies or genetic variations from whole-genome sequencing data, which can be cost prohibitive. Here, we provide an overview of recently developed bisulfite-free methods for quantifying methylation and cost-effective approaches for the simultaneous detection of genetic and epigenetic information. We also discuss the plasticity of DNA methylation in a specific Arabidopsis accession, the contribution of DNA methylation to plant adaptation, and the genetic determinants of variation in DNA methylation in natural populations. The recently developed technology and knowledge will greatly benefit future studies in population epigenomes. 
  5. Abstract In heterozygous genomes, allele-specific measurements can reveal biologically significant differences in DNA methylation between homologous alleles associated with local changes in genetic sequence. Current approaches for detecting such events from whole-genome bisulfite sequencing (WGBS) data either perform statistically independent marginal analysis at individual cytosine-phosphate-guanine (CpG) sites, thus ignoring correlations in the methylation state, or carry out a joint statistical analysis of methylation patterns at four CpG sites, producing unreliable statistical evidence. Here, we employ the one-dimensional Ising model of statistical physics and develop a method for detecting allele-specific methylation (ASM) events within segments of DNA containing clusters of linked single-nucleotide polymorphisms (SNPs), called haplotypes. Comparisons with existing approaches using simulated and real WGBS data show that our method provides an improved fit to data, especially when considering large haplotypes. Importantly, the method employs robust hypothesis testing for detecting statistically significant imbalances in mean methylation level and methylation entropy, as well as for identifying haplotypes for which the genetic variant carries significant information about the methylation state. As such, our ASM analysis approach can potentially lead to biological discoveries with important implications for the genetics of complex human diseases. 