Title: Association of CNVs with methylation variation
Abstract

Germline copy number variants (CNVs) and single-nucleotide polymorphisms (SNPs) form the basis of inter-individual genetic variation. Although the phenotypic effects of SNPs have been extensively investigated, the effects of CNVs are less well understood. To better characterize the mechanisms by which CNVs affect cellular phenotype, we tested their association with variable CpG methylation genome-wide. Using paired CNV and methylation data from the 1000 Genomes and HapMap projects, we identified genome-wide associations by methylation quantitative trait locus (mQTL) analysis. We found that individual CNVs were associated with methylation of multiple CpGs and vice versa. CNV-associated methylation changes were correlated with gene expression. CNV-mQTLs were enriched for regulatory regions and transcription factor-binding sites (TFBSs) and were involved in long-range physical interactions with their associated CpGs. Some CNV-mQTLs were associated with methylation of imprinted genes. Several CNV-mQTLs and/or associated genes were among those previously reported by genome-wide association studies (GWASs). We demonstrate that germline CNVs in the genome are associated with CpG methylation. Our findings suggest that structural variation together with methylation may affect cellular phenotype.
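
As an illustration of what a single CNV-mQTL test might look like, the sketch below regresses one CpG's methylation level on one CNV's copy number and estimates significance by permutation. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `cnv_mqtl_association`, the choice of a simple linear model, and the permutation scheme are all hypothetical, and the abstract does not specify which statistical model or covariates were used.

```python
import numpy as np


def cnv_mqtl_association(copy_number, methylation, n_perm=1000, seed=0):
    """Test one CNV against one CpG (hypothetical helper, illustration only).

    copy_number: integer copy-number calls, one per individual.
    methylation: methylation levels (e.g. beta values), one per individual.
    Returns (slope, pearson_r, permutation_p).
    """
    x = np.asarray(copy_number, dtype=float)
    y = np.asarray(methylation, dtype=float)

    # Least-squares line: methylation ~ intercept + slope * copy_number.
    slope, _intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]

    # Permutation null: shuffle copy number across individuals and
    # recompute the correlation each time.
    rng = np.random.default_rng(seed)
    perm_r = np.array(
        [np.corrcoef(rng.permutation(x), y)[0, 1] for _ in range(n_perm)]
    )
    # Two-sided empirical p-value with the usual +1 correction.
    p = (1 + np.sum(np.abs(perm_r) >= abs(r))) / (n_perm + 1)
    return slope, r, p


if __name__ == "__main__":
    # Simulated data (assumption): 60 individuals, methylation rises with copy number.
    rng = np.random.default_rng(1)
    cn = rng.integers(0, 5, size=60)
    beta = 0.3 + 0.05 * cn + rng.normal(0, 0.05, size=60)
    print(cnv_mqtl_association(cn, beta))
```

In a genome-wide setting, a test like this would be repeated across many CNV-CpG pairs, so the resulting p-values would need multiple-testing correction before any pair is called an mQTL.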

 
NSF-PAR ID:
10193951
Publisher / Repository:
Nature Publishing Group
Journal Name:
npj Genomic Medicine
Volume:
5
Issue:
1
ISSN:
2056-7944
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Low socioeconomic status (SES) and living in a disadvantaged neighborhood are associated with poor cardiovascular health. Multiple lines of evidence have linked DNA methylation to both cardiovascular risk factors and social disadvantage indicators. However, limited research has investigated the role of DNA methylation in mediating the associations of individual- and neighborhood-level disadvantage with multiple cardiovascular risk factors in large, multi-ethnic, population-based cohorts. We examined whether disadvantage at the individual level (childhood and adult SES) and neighborhood level (summary neighborhood SES as assessed by Census data and social environment as assessed by perceptions of aesthetic quality, safety, and social cohesion) was associated with 11 cardiovascular risk factors, including measures of obesity, diabetes, lipids, and hypertension, in 1,154 participants from the Multi-Ethnic Study of Atherosclerosis (MESA). For significant associations, we conducted epigenome-wide mediation analysis to identify methylation sites mediating the relationship between individual/neighborhood disadvantage and cardiovascular risk factors using the JT-Comp method, which assesses sparse mediation effects under a composite null hypothesis. In models adjusting for age, sex, race/ethnicity, smoking, medication use, and genetic principal components of ancestry, epigenetic mediation was detected for the associations of adult SES with body mass index (BMI), insulin, and high-density lipoprotein cholesterol (HDL-C), as well as for the association between neighborhood socioeconomic disadvantage and HDL-C, at FDR q < 0.05. The 410 CpG mediators identified for the SES-BMI association were enriched for CpGs associated with gene expression (expression quantitative trait methylation loci, or eQTMs), and corresponding genes were enriched in antigen processing and presentation pathways.
For cardiovascular risk factors other than BMI, most of the epigenetic mediators lost significance after controlling for BMI. However, 43 methylation sites showed evidence of mediating the neighborhood socioeconomic disadvantage and HDL-C association after BMI adjustment. The identified mediators were enriched for eQTMs, and corresponding genes were enriched in inflammatory and apoptotic pathways. Our findings support the hypothesis that DNA methylation acts as a mediator between individual- and neighborhood-level disadvantage and cardiovascular risk factors, and shed light on the potential underlying epigenetic pathways. Future studies are needed to fully elucidate the biological mechanisms that link social disadvantage to poor cardiovascular health.

     
  2. Copy number variants (CNVs) are established risk factors for neurodevelopmental disorders with seizures or epilepsy. With the hypothesis that seizure disorders share genetic risk factors, we pooled CNV data from 10,590 individuals with seizure disorders, 16,109 individuals with clinically validated epilepsy, and 492,324 population controls and identified 25 genome-wide significant loci, 22 of which are novel for seizure disorders, such as deletions at 1p36.33, 1q44, 2p21-p16.3, 3q29, 8p23.3-p23.2, 9p24.3, 10q26.3, 15q11.2, 15q12-q13.1, 16p12.2, and 17q21.31; duplications at 2q13, 9q34.3, 16p13.3, 17q12, 19p13.3, and 20q13.33; and reciprocal CNVs at 16p11.2 and 22q11.21. Using genetic data from an additional 248,751 individuals with 23 neuropsychiatric phenotypes, we explored the pleiotropy of these 25 loci. Finally, in a subset of individuals with epilepsy and detailed clinical data available, we performed phenome-wide association analyses between individual CNVs and clinical annotations categorized through the Human Phenotype Ontology (HPO). For six CNVs, we identified 19 significant associations with specific HPO terms and generated, for all CNVs, phenotype signatures across 17 clinical categories relevant for epileptologists. This is the most comprehensive investigation of CNVs in epilepsy and related seizure disorders, with potential implications for clinical practice.

     
  3. DNA methylation occurs predominantly on cytosine-phosphate-guanine (CpG) dinucleotides in the mammalian genome, and the methylation landscape is maintained over mitotic cell division. It has been posited that coupling of maintenance methylation activity among neighbouring CpGs is critical to stability over cellular generations; however, the mechanism is unclear. We used mathematical models and stochastic simulation to analyse data from experiments that probe genome-wide methylation of nascent DNA post-replication in cells. We find that DNA methylation maintenance rates on individual CpGs are locally correlated, and the degree of this correlation varies by genomic regional context. Using the theory of protein diffusion along DNA, we show that exponential decay of methylation rate correlation with genomic distance is consistent with enzyme processivity. Our results provide quantitative evidence of genome-wide methyltransferase processivity in vivo. We further developed a method to disentangle different mechanistic sources of kinetic correlations. From the experimental data, we estimate that an individual methyltransferase methylates neighbouring CpGs processively if they are 36 basepairs apart, on average, whereas other mechanisms of coupling dominate for longer inter-CpG distances. Our study demonstrates that quantitative insights into enzymatic mechanisms can be obtained from replication-associated, cell-based genome-wide measurements by combining data-driven statistical analyses with hypothesis-driven mathematical modelling.
  4. Stress is known to affect health throughout life and into future generations, but the underlying molecular mechanisms are unknown. We tested the hypothesis that maternal psychosocial stress influences DNA methylation (DNAm), which in turn impacts newborn health outcomes. Specifically, we analyzed DNAm at individual, regional, and genome-wide levels to test for associations with maternal stress and newborn birth weight. Maternal venous blood and newborn cord blood (n = 24 and 22, respectively) were assayed for methylation at ∼450,000 CpG sites. Methylation was analyzed by examining CpG sites individually in an epigenome-wide association study (EWAS), as regional groups using variably methylated region (VMR) analysis in maternal blood only, and through epigenome-wide measures using genome-wide mean methylation (GMM), Horvath's epigenetic clock, and mitotic age. These methylation measures were tested for association with three measures of maternal stress (maternal war trauma, chronic stress, and experience of sexual violence) and one health outcome (newborn birth weight). We observed that maternal experiences of war trauma, chronic stress, and sexual assault were each associated with decreased newborn birth weight (p < 1.95 × 10⁻⁷ in all cases). Testing individual CpG sites using EWAS, we observed no associations between DNAm and any measure of maternal stress or newborn birth weight in either maternal or cord blood after Bonferroni multiple testing correction. However, the top-ranked CpG site in maternal blood that was associated with maternal chronic stress and sexual violence before multiple testing correction is located near the SPON1 gene. Testing at a regional level, we found increased methylation of a VMR in maternal blood near SPON1 that was associated with chronic stress and sexual violence after Bonferroni multiple testing correction (p = 1.95 × 10⁻⁷ and 8.3 × 10⁻⁶, respectively).
At the epigenomic level, cord blood GMM was associated with significantly higher levels of war trauma (p = 0.025) and was suggestively associated with sexual violence (p = 0.053). The other two epigenome-wide measures were not associated with maternal stress or newborn birth weight in either tissue type. Despite our small sample size, we identified associations even after conservative multiple testing correction. We found associations between DNAm and the three measures of maternal stress across both tissues: a VMR in maternal blood and GMM in cord blood were each associated with different measures of maternal stress. The association of cord blood GMM, but not maternal blood GMM, with maternal stress may suggest different responses to stress in mother and newborn. It is noteworthy that we found associations only when CpG sites were analyzed in aggregate, either as VMRs or as a broad summary measure of GMM.
  5. ABSTRACT Genome-wide association studies (GWAS) can identify genetic variants responsible for naturally occurring and quantitative phenotypic variation. Association studies therefore provide a powerful complement to approaches that rely on de novo mutations for characterizing gene function. Although bacteria should be amenable to GWAS, few GWAS have been conducted on bacteria, and the extent to which nonindependence among genomic variants (e.g., linkage disequilibrium [LD]) and the genetic architecture of phenotypic traits will affect GWAS performance is unclear. We apply association analyses to identify candidate genes underlying variation in 20 biochemical, growth, and symbiotic phenotypes among 153 strains of Ensifer meliloti. For 11 traits, we find genotype-phenotype associations that are stronger than expected by chance, with the candidates in relatively small linkage groups, indicating that LD does not preclude resolving association candidates to relatively small genomic regions. The significant candidates show an enrichment for nucleotide polymorphisms (SNPs) over gene presence-absence variation (PAV), and for five traits, candidates are enriched in large linkage groups, a possible signature of epistasis. Many of the variants most strongly associated with symbiosis phenotypes were in genes previously identified as being involved in nitrogen fixation or nodulation. For other traits, apparently strong associations were not stronger than the range of associations detected in permuted data. In sum, our data show that GWAS in bacteria may be a powerful tool for characterizing genetic architecture and identifying genes responsible for phenotypic variation. However, careful evaluation of candidates is necessary to avoid false signals of association. IMPORTANCE Genome-wide association analyses are a powerful approach for identifying gene function.
These analyses are becoming commonplace in studies of humans, domesticated animals, and crop plants but have rarely been conducted in bacteria. We applied association analyses to 20 traits measured in Ensifer meliloti, an agriculturally and ecologically important bacterium because it fixes nitrogen when in symbiosis with leguminous plants. We identified candidate alleles and gene presence-absence variants underlying variation in symbiosis traits, antibiotic resistance, and use of various carbon sources; some of these candidates are in genes previously known to affect these traits, whereas others are in genes that have not been well characterized. Our results point to the potential power of association analyses in bacteria, but also to the need to carefully evaluate the potential for false associations.