Title: Statistical and bioinformatic analysis of hemimethylation patterns in non-small cell lung cancer
Abstract
Background: DNA methylation is an epigenetic event involving the addition of a methyl-group to a cytosine-guanine base pair (i.e., CpG site). It is associated with different cancers. Our research focuses on studying non-small cell lung cancer hemimethylation, which refers to methylation occurring on only one of the two DNA strands. Many studies often assume that methylation occurs on both DNA strands at a CpG site. However, recent publications show the existence of hemimethylation and its significant impact. Therefore, it is important to identify cancer hemimethylation patterns.
Methods: In this paper, we use the Wilcoxon signed rank test to identify hemimethylated CpG sites based on publicly available non-small cell lung cancer methylation sequencing data. We then identify two types of hemimethylated CpG clusters, regular and polarity clusters, and genes with large numbers of hemimethylated sites. Highly hemimethylated genes are then studied for their biological interactions using available bioinformatics tools.
Results: In this paper, we have conducted the first-ever investigation of hemimethylation in lung cancer. Our results show that hemimethylation does exist in lung cells either as singletons or clusters. Most clusters contain only two or three CpG sites. Polarity clusters are much shorter than regular clusters and appear less frequently. The majority of clusters found in tumor samples have no overlap with clusters found in normal samples, and vice versa. Several genes that are known to be associated with cancer are hemimethylated differently between the cancerous and normal samples. Furthermore, highly hemimethylated genes exhibit many different interactions with other genes that may be associated with cancer. Hemimethylation has diverse patterns and frequencies that are comparable between normal and tumorous cells. Therefore, hemimethylation may be related to both normal and tumor cell development.
Conclusions: Our research has identified CpG clusters and genes that are hemimethylated in normal and lung tumor samples. Due to the potential impact of hemimethylation on gene expression and cell function, these clusters and genes may be important to advance our understanding of the development and progression of non-small cell lung cancer.
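The Methods above name the Wilcoxon signed rank test as the tool for flagging hemimethylated CpG sites. The following is a minimal sketch of that test, not the authors' pipeline. It assumes, purely for illustration, that each CpG site has paired forward-strand and reverse-strand methylation percentages measured across the same samples; the function name `signed_rank_exact_p`, the example values, and the exact-enumeration approach (practical only for small sample counts) are all hypothetical choices made here.

```python
from itertools import product


def signed_rank_exact_p(forward, reverse):
    """Two-sided exact Wilcoxon signed-rank p-value for paired values.

    forward, reverse: methylation levels (e.g. integer percentages) for the
    two strands at one CpG site, one pair per sample. This pairing is an
    assumption made for illustration.
    """
    # Zero differences carry no sign information and are dropped.
    diffs = [f - r for f, r in zip(forward, reverse) if f != r]
    n = len(diffs)
    if n == 0:
        return 1.0

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Test statistic: sum of ranks attached to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    center = sum(ranks) / 2
    observed = abs(w_plus - center)

    # Under the null hypothesis every sign pattern is equally likely, so
    # enumerate all 2^n patterns and count those at least as extreme.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - center) >= observed - 1e-12:
            extreme += 1
    return extreme / 2 ** n


# Hypothetical site: forward strand highly methylated, reverse strand not.
forward_pct = [90, 85, 95, 80, 88, 92]
reverse_pct = [10, 20, 5, 15, 12, 8]
p_value = signed_rank_exact_p(forward_pct, reverse_pct)
print(f"p = {p_value:.5f}")
```

With six samples that all differ in the same direction, only the two most extreme of the 64 sign patterns match the observed statistic, so the smallest attainable two-sided p-value is 2/64. For realistic sample sizes a library routine such as `scipy.stats.wilcoxon` would be the more practical choice, and a genome-wide scan would also need a multiple-testing correction, which this sketch omits.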
Award ID(s):
1757233
NSF-PAR ID:
10290093
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
BMC Cancer
Volume:
21
Issue:
1
ISSN:
1471-2407
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Background: Though the development of targeted cancer drugs continues to accelerate, doctors still lack reliable methods for predicting patient response to standard-of-care therapies for most cancers. DNA methylation has been implicated in tumor drug response and is a promising source of predictive biomarkers of drug efficacy, yet the relationship between drug efficacy and DNA methylation remains largely unexplored. Method: In this analysis, we performed log-rank survival analyses on patients grouped by cancer and drug exposure to find CpG sites where binary methylation status is associated with differential survival in patients treated with a specific drug but not in patients with the same cancer who were not exposed to that drug. We also clustered these drug-specific CpG sites based on co-methylation among patients to identify broader methylation patterns that may be related to drug efficacy, which we investigated for transcription factor binding site enrichment using gene set enrichment analysis. Results: We identified CpG sites that were drug-specific predictors of survival in 38 cancer-drug patient groups across 15 cancers and 20 drugs. These included 11 CpG sites with similar drug-specific survival effects in multiple cancers. We also identified 76 clusters of CpG sites with stronger associations with patient drug response, many of which contained CpG sites in gene promoters containing transcription factor binding sites. Conclusion: These findings are promising biomarkers of drug response for a variety of drugs and contribute to our understanding of drug-methylation interactions in cancer. Investigation and validation of these results could lead to the development of targeted co-therapies aimed at manipulating methylation in order to improve efficacy of commonly used therapies and could improve patient survival and quality of life by furthering the effort toward drug response prediction. 
  2. Over the past several decades in the United States, incidence of pancreatic cancer (PCa) has increased, with the 5-year survival rate remaining extremely low at 10.8%. Typically, PCa is diagnosed at an advanced stage, with the consequence that there is more tumor heterogeneity and increased probability that more cells are resistant to treatments. Risk factors for PCa can serve as a way to select a high-risk population and develop biomarkers to improve early detection and treatment. We focus on blood-based methylation as an approach to identify a marker set that can be obtained in a minimally invasive way (through peripheral blood) and could be applied to a high-risk subpopulation [those with recent onset type 2 diabetes (DM)]. Blood samples were collected from 30 patients: 15 had been diagnosed with PCa and 15 had been diagnosed with recent onset DM. The HumanMethylationEPIC Beadchip (Illumina, CA, United States) was used to quantify methylation of approximately 850,000 methylation sites across the genome and to analyze methylation markers associated with PCa or DM or both. An exploratory analysis was conducted to propose the importance of genes associated with the top CpG (5′—C—phosphate—G—3′) methylation sites, and the results were visualized using boxplots. A methylation-based age predictor was also investigated for its ability to distinguish disease groups from controls. No methylation markers were observed to be significantly associated with PCa or new onset diabetes compared with the respective control groups. In our exploratory analysis, one methylation marker, CpG04969764, found in the Laminin Subunit Alpha 5 (LAMA5) gene region, was observed in both the PCa and DM Top 100 methylation marker sets. Modification of LAMA5 methylation or LAMA5 gene function may be a way to distinguish recent DM cases with and without PCa; however, additional studies with larger sample sizes and different study types (e.g., cohort) will be needed to test this hypothesis. 
  3. Background: DNA methylation dynamics in the brain are associated with normal development and neuropsychiatric disease and differ across functionally distinct brain regions. Previous studies of genome-wide methylation differences among human brain regions focus on limited numbers of individuals and one to two brain regions. Results: Using GTEx samples, we generate a resource of DNA methylation in purified neuronal nuclei from 8 brain regions as well as lung and thyroid tissues from 12 to 23 donors. We identify differentially methylated regions between brain regions among neuronal nuclei in both CpG (181,146) and non-CpG (264,868) contexts, few of which were unique to a single pairwise comparison. This significantly expands the knowledge of differential methylation across the brain by 10-fold. In addition, we present the first differential methylation analysis among neuronal nuclei from basal ganglia tissues and identify unique CpG differentially methylated regions, many associated with ion transport. We also identify 81,130 variably methylated CpG regions, i.e., regions with variable methylation among individuals in the same brain region, which are enriched in regulatory regions and in CpG differentially methylated regions. Many variably methylated regions are unique to a specific brain region, with only 202 common across all brain regions, as well as lung and thyroid. Variably methylated regions identified in the amygdala, anterior cingulate cortex, and hippocampus are enriched for heritability of schizophrenia. Conclusions: These data suggest that epigenetic variation in these particular human brain regions could be associated with the risk for this neuropsychiatric disorder. 
  4. Tumor subtype and menopausal status are strong predictors of breast cancer (BC) prognosis. We aimed to find and validate subtype- or menopausal-status-specific changes in tumor DNA methylation (DNAm) associated with all-cause mortality or BC progression. Associations between site-specific tumor DNAm and BC prognosis were estimated among The Cancer Genome Atlas participants (n = 692) with Illumina Infinium HumanMethylation450 BeadChip array data. All-cause mortality and BC progression were modeled using Cox proportional hazards models stratified by tumor subtypes, adjusting for age, race, stage, menopausal status, tumor purity, and cell type proportion. Effect measure modification by subtype and menopausal status was evaluated by incorporating a product term with DNAm. Site-specific inference was used to identify subtype- or menopausal-status-specific differentially methylated regions (DMRs) and functional pathways. The validation of the results was carried out on an independent dataset (GSE72308; n = 180). We identified a total of fifteen unique CpG probes that were significantly associated (P ≤ 1 × 10⁻⁷) with survival outcomes in a subtype- or menopausal-status-specific manner. Seven probes were associated with overall survival (OS) or progression-free interval (PFI) for women with luminal A subtype, and four probes were associated with PFI for women with luminal B subtype. Five probes were associated with PFI for post-menopausal women. A majority of significant probes showed a lower risk of OS or BC progression with higher DNAm. We identified subtype- or menopausal-status-specific DMRs and functional pathways, of which the top associated pathways differed across subtypes or menopausal status. None of the significant probes from the site-specific analyses met the genome-wide significance level in the validation analyses, although the directions and magnitudes of the coefficients showed a consistent pattern. 
We have identified subtype- or menopausal-status-specific DNAm biomarkers, DMRs and functional pathways associated with all-cause mortality or BC progression, albeit with limited validation. Future studies with a larger independent cohort of non-post-menopausal women with non-luminal A subtypes are warranted for identifying subtype- and menopausal-status-specific DNAm biomarkers for BC prognosis. 
  5. Finding genes that are directly or indirectly related to lung cancer has drawn much attention, and many genes directly related to lung cancer have been reported. However, it has not been confirmed whether those published 'key' genes are truly critical to lung cancer formation, i.e., they may carry very limited useful information. As a result, finding essential genes remains a challenging lung cancer research problem. Using a recently developed competing linear factor analysis method for detecting differentially expressed genes, we advance the detection of critical lung cancer genes to a uniformly informative level. A common set of four genes and their functional effects are detected to be differentially expressed in tumor and non-tumor samples with 100% sensitivity and 100% specificity in one study of lung adenocarcinoma (LUAD) and one study of squamous cell lung cancers (LUSC) (two North American cohorts with 20429 genes, 576 and 552 samples respectively). Two additional analyses achieve 97.8% sensitivity and 100% specificity in one study of non-small cell lung carcinomas (NSCLC, a European cohort with 20356 genes and 156 samples), and 100% sensitivity and 95% specificity (1 out of 20 non-tumor samples) in one study of ALK-positive and EGFR/KRAS/ALK-negative lung adenocarcinomas (LUAD, a Japanese cohort with 20356 genes and 224 samples). There are some common genes, but different functional effects, within each set of four genes among the two North American cohorts and the European cohort and among the North American cohorts and the Japanese cohort. These results show that the four-gene-based classifiers are accurate and robust across different types of lung cancer and cohorts of different races. The functional effects of the four genes reveal significantly different mechanisms (mysteries) between LUAD and LUSC. These sets of four genes and their functional effects are considered to be essential for lung cancer studies and practice. 
These genes' functional effects naturally classify patients into different groups (more than seven subtypes). Subtype information is useful for personalized therapies. The new findings can motivate new lung cancer research in more focused and targeted directions to save lives, protect people, and reduce enormous economic costs in research and lung cancer treatments. 