Title: Comprehensive analysis of TCGA data reveals correlation between DNA methylation and alternative splicing
Abstract

The effect of DNA methylation on the regulation of gene expression has been extensively discussed in the literature. However, the potential association between DNA methylation and alternative splicing is not well understood. In this study, we integrated multiple omics data types from The Cancer Genome Atlas (TCGA) and systematically examined the relationship between DNA methylation and alternative splicing. Using the methylation data and exon expression data, we identified many CpG sites significantly associated with exon expression in various types of cancer. We further observed that the direction and strength of significant CpG-exon correlations tended to be consistent across different cancer contexts, indicating that some CpG-exon correlation patterns reflect fundamental biological mechanisms that transcend tissue and cancer types. We also discovered that CpG sites correlated with exon expression were more likely to be associated with patient survival outcomes than CpG sites that did not correlate with exon expression. Furthermore, we found that CpG sites were more strongly correlated with exon expression than with the expression of isoforms harboring the corresponding exons. This observation suggests that a major effect of CpG methylation on alternative splicing may be related to the inclusion or exclusion of exons, which subsequently impacts the relative usage of various isoforms. Overall, our study revealed correlation patterns between DNA methylation and alternative splicing, which provide new insights into the role of methylation in the transcriptional process.

 
Award ID(s):
2007029 1552784
NSF-PAR ID:
10380901
Author(s) / Creator(s):
; ;
Publisher / Repository:
Springer Science + Business Media
Date Published:
Journal Name:
BMC Genomics
Volume:
23
Issue:
1
ISSN:
1471-2164
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Chemical modifications of proteins, DNA, and RNA moieties play critical roles in regulating gene expression. Emerging evidence suggests that RNA modifications (epitranscriptomics) have substantive roles in basic biological processes. One of the most common modifications in mRNA and noncoding RNAs is N6-methyladenosine (m6A). In a subset of mRNAs, m6A sites are preferentially enriched near stop codons, in 3′ UTRs, and within exons, suggesting an important role in the regulation of mRNA processing and function, including alternative splicing and gene expression. Very little is known about the effect of environmental chemical exposure on m6A modifications. As many commonly occurring environmental contaminants alter gene expression profiles and have detrimental effects on physiological processes, it is important to understand the effects of exposure on this layer of gene regulation. Hence, the objective of this study was to characterize the acute effects of developmental exposure to PCB126, an environmentally relevant dioxin-like PCB, on m6A methylation patterns. We exposed zebrafish embryos to PCB126 for 6 h starting from 72 h post fertilization and profiled m6A RNA using methylated RNA immunoprecipitation followed by sequencing (MeRIP-seq). Our analysis revealed 117 and 217 m6A peaks in the DMSO and PCB126 samples (false discovery rate 5%), respectively. The majority of the peaks were preferentially located around the 3′ UTR and stop codons. Statistical analysis revealed 15 m6A-marked transcripts to be differentially methylated by PCB126 exposure. These include transcripts that are known to be activated by AHR agonists (e.g., ahrra, tiparp, nfe2l2b) as well as others that are important for normal development (vgf, cebpd, sned1). These results suggest that environmental chemicals such as dioxin-like PCBs could affect developmental gene expression patterns by altering m6A levels. Further studies are necessary to understand the functional consequences of exposure-associated alterations in m6A levels.
  2. Abstract Background

    DNA methylation is an epigenetic event involving the addition of a methyl group to the cytosine of a cytosine-guanine dinucleotide (i.e., a CpG site). It is associated with different cancers. Our research focuses on studying non-small cell lung cancer hemimethylation, which refers to methylation occurring on only one of the two DNA strands. Many studies assume that methylation occurs on both DNA strands at a CpG site. However, recent publications show the existence of hemimethylation and its significant impact. Therefore, it is important to identify cancer hemimethylation patterns.

    Methods

    In this paper, we use the Wilcoxon signed rank test to identify hemimethylated CpG sites based on publicly available non-small cell lung cancer methylation sequencing data. We then identify two types of hemimethylated CpG clusters, regular and polarity clusters, and genes with large numbers of hemimethylated sites. Highly hemimethylated genes are then studied for their biological interactions using available bioinformatics tools.

    Results

    In this paper, we have conducted the first-ever investigation of hemimethylation in lung cancer. Our results show that hemimethylation does exist in lung cells, either as singletons or as clusters. Most clusters contain only two or three CpG sites. Polarity clusters are much shorter than regular clusters and appear less frequently. The majority of clusters found in tumor samples have no overlap with clusters found in normal samples, and vice versa. Several genes that are known to be associated with cancer are hemimethylated differently between the cancerous and normal samples. Furthermore, highly hemimethylated genes exhibit many different interactions with other genes that may be associated with cancer. Hemimethylation has diverse patterns and frequencies that are comparable between normal and tumorous cells. Therefore, hemimethylation may be related to both normal and tumor cell development.

    Conclusions

    Our research has identified CpG clusters and genes that are hemimethylated in normal and lung tumor samples. Due to the potential impact of hemimethylation on gene expression and cell function, these clusters and genes may be important to advance our understanding of the development and progression of non-small cell lung cancer.
  3. Abstract Background

    Alternative RNA splicing is widely dysregulated in cancers including lung adenocarcinoma, where aberrant splicing events are frequently caused by somatic splice site mutations or somatic mutations of splicing factor genes. However, the majority of mis-splicing in cancers is unexplained by these known mechanisms. We hypothesize that the aberrant Ras signaling characteristic of lung cancers plays a role in promoting the alternative splicing observed in tumors.

    Methods

    We recently performed transcriptome and proteome profiling of human lung epithelial cells ectopically expressing oncogenic KRAS and another cancer-associated Ras GTPase, RIT1. Unbiased analysis of phosphoproteome data identified altered splicing factor phosphorylation in KRAS-mutant cells, so we performed differential alternative splicing analysis using rMATS to identify significantly altered isoforms in lung epithelial cells. To determine whether these isoforms were uniquely regulated by KRAS, we performed a large-scale splicing screen in which we generated over 300 unique RNA sequencing profiles of isogenic A549 lung adenocarcinoma cells ectopically expressing 75 different wild-type or variant alleles across 28 genes implicated in lung cancer.

    Results

    Mass spectrometry data showed widespread downregulation of splicing factor phosphorylation in lung epithelial cells expressing mutant KRAS compared to cells expressing wild-type KRAS. We observed alternative splicing in the same cells, with 2196 and 2416 skipped exon events in KRAS G12V and KRAS Q61H cells, respectively, 997 of which were shared (p < 0.001 by hypergeometric test). In the high-throughput splicing screen, mutant KRAS induced more differential alternative splicing events than any other gene tested except the RNA-binding protein RBM45 and its variant RBM45 M126I. We identified ten high-confidence cassette exon events across multiple KRAS variants and cell lines. These included differential splicing of the Myc Associated Zinc Finger (MAZ). As MAZ regulates expression of KRAS, this splice variant may be a mechanism for the cell to modulate wild-type KRAS levels in the presence of oncogenic KRAS.

    Conclusion

    Proteomic and transcriptomic profiling of lung epithelial cells uncovered splicing factor phosphorylation and mRNA splicing events regulated by oncogenic KRAS. These data suggest that in addition to widespread transcriptional changes, the Ras signaling pathway can promote post-transcriptional splicing changes that may contribute to oncogenic processes.

     
  4. Abstract

    Understanding the molecular profile of every human cell type is essential for understanding its role in normal physiology and disease. Technological advancements in DNA sequencing, mass spectrometry, and computational methods allow us to carry out multiomics analyses, although such approaches are not yet routine. Human umbilical vein endothelial cells (HUVECs) are a widely used model system to study pathological and physiological processes associated with the cardiovascular system. In this study, next‐generation sequencing and high‐resolution mass spectrometry are employed to profile the transcriptome and proteome of primary HUVECs. Analysis of 145 million paired‐end reads from next‐generation sequencing confirmed expression of 12 186 protein‐coding genes (FPKM ≥0.1) and 439 novel long non‐coding RNAs, and revealed 6089 novel isoforms that were not annotated in GENCODE. Proteomics analysis identifies 6477 proteins, including confirmation of N‐termini for 1091 proteins, isoforms for 149 proteins, and 1034 phosphosites. A database search to specifically identify other post‐translational modifications provides evidence for a number of modification sites on 117 proteins, which include ubiquitylation, lysine acetylation, and mono‐, di‐ and tri‐methylation events. Evidence for 11 “missing proteins,” which are proteins for which there was insufficient or no protein‐level evidence, is provided. Peptides supporting missing proteins and novel events are validated by comparison of MS/MS fragmentation patterns with synthetic peptides. Finally, 245 variant peptides derived from 207 expressed proteins, alternate translational start sites for seven proteins, and evidence for novel proteoforms of five proteins resulting from alternative splicing are identified. Overall, it is believed that the integrated approach employed in this study is widely applicable to the deeper molecular characterization of any primary cell type.

     
  5. Abstract

    Interrogation of chromatin modifications, such as DNA methylation, has the potential to improve forecasting and conservation of marine ecosystems. The standard method for assaying DNA methylation (whole genome bisulphite sequencing), however, is currently too costly to apply at the scales required for ecological research. Here, we evaluate different methods for measuring DNA methylation for ecological epigenetics. We compare whole genome bisulphite sequencing (WGBS) with methylated CpG binding domain sequencing (MBD‐seq) and a modified version of MethylRAD that we term methylation‐dependent restriction site‐associated DNA sequencing (mdRAD). We evaluate these three assays in measuring variation in methylation across the genome, between genotypes, and between polyp types in the reef‐building coral Acropora millepora. We find that all three assays measure absolute methylation levels similarly for gene bodies (gbM), as well as for exons and 1 Kb windows, with a minimum Pearson correlation of 0.66. Differential gbM estimates were less correlated, but still concurrent across assays. We conclude that MBD‐seq and mdRAD are reliable and cost‐effective alternatives to WGBS. The considerably lower sequencing effort required for mdRAD to produce comparable methylation estimates makes it particularly useful for ecological epigenetics.

     