Background: Though the development of targeted cancer drugs continues to accelerate, doctors still lack reliable methods for predicting patient response to standard-of-care therapies for most cancers. DNA methylation has been implicated in tumor drug response and is a promising source of predictive biomarkers of drug efficacy, yet the relationship between drug efficacy and DNA methylation remains largely unexplored. Method: In this analysis, we performed log-rank survival analyses on patients grouped by cancer and drug exposure to find CpG sites where binary methylation status is associated with differential survival in patients treated with a specific drug but not in patients with the same cancer who were not exposed to that drug. We also clustered these drug-specific CpG sites based on co-methylation among patients to identify broader methylation patterns that may be related to drug efficacy, which we investigated for transcription factor binding site enrichment using gene set enrichment analysis. Results: We identified CpG sites that were drug-specific predictors of survival in 38 cancer-drug patient groups across 15 cancers and 20 drugs. These included 11 CpG sites with similar drug-specific survival effects in multiple cancers. We also identified 76 clusters of CpG sites with stronger associations with patient drug response, many of which contained CpG sites in gene promoters containing transcription factor binding sites. Conclusion: These findings are promising biomarkers of drug response for a variety of drugs and contribute to our understanding of drug-methylation interactions in cancer. Investigation and validation of these results could lead to the development of targeted co-therapies aimed at manipulating methylation in order to improve efficacy of commonly used therapies and could improve patient survival and quality of life by furthering the effort toward drug response prediction.
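The drug-specific screen described in this abstract can be sketched as a two-group log-rank test run once in treated patients and once in untreated patients for the same CpG site. The following is a minimal illustration only, not the authors' implementation: the function names `logrank_test` and `is_drug_specific`, and the 0.05 significance cutoff, are assumptions introduced here and are not stated in the abstract.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi2, p_value) with 1 degree of freedom.

    times_*  : follow-up times for each patient in the group
    events_* : 1 if death was observed at that time, 0 if censored
    """
    # Pool both groups, tagging each record with its group (0 = a, 1 = b).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    obs_minus_exp = 0.0  # sum over event times of (observed - expected) deaths in group a
    variance = 0.0       # hypergeometric variance of that sum

    for t in sorted({t for t, e, _ in data if e}):
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk in a
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk in b
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def is_drug_specific(p_treated, p_untreated, alpha=0.05):
    """Hypothetical rule: significant in treated patients, not in untreated ones."""
    return p_treated < alpha and p_untreated >= alpha
```

In this sketch, group a and group b would be the methylated and unmethylated patients at one CpG site. A real analysis over many sites would also need multiple-testing correction, which is omitted here.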
Characterization of Expression-Based Gene Clusters Gives Insights into Variation in Patient Response to Cancer Therapies
Background: Transcriptomics can reveal much about cellular activity, and cancer transcriptomics have been useful in investigating tumor cell behaviors. Patterns in transcriptome-wide gene expression can be used to investigate biological mechanisms and pathways that can explain the variability in patient response to cancer therapies. Methods: We identified gene expression patterns related to patient drug response by clustering tumor gene expression data and selecting from the resulting gene clusters those where expression of cluster genes was related to patient survival on specific drugs. We then investigated these gene clusters for biological meaning using several approaches, including identifying common genomic locations and transcription factors whose targets were enriched in these clusters and performing survival analyses to support these candidate transcription factor-drug relationships. Results: We identified gene clusters related to drug-specific survival, and through these, we were able to associate observed variations in patient drug response to specific known biological phenomena. Specifically, our analysis implicated 2 stem cell-related transcription factors, HOXB4 and SALL4, in poor response to temozolomide in brain cancers. In addition, expression of SNRNP70 and its targets was implicated in cetuximab response by 3 different analyses, although the mechanism remains unclear. We also found evidence that 2 cancer-related chromosomal structural changes may impact drug efficacy. Conclusion: In this study, we present the gene clusters identified and the results of our systematic analysis linking drug efficacy to specific transcription factors, which are rich sources of potential mechanistic relationships impacting patient outcomes. We also highlight the most promising of these results, which were supported by multiple analyses and by previous research. We report these findings as promising avenues for independent validation and further research into cancer treatments and patient response.
- Award ID(s): 2007029
- PAR ID: 10546862
- Publisher / Repository: Sage
- Date Published:
- Journal Name: Cancer Informatics
- Volume: 23
- ISSN: 1176-9351
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract: The ability to predict the efficacy of cancer treatments is a longstanding goal of precision medicine that requires improved understanding of molecular interactions with drugs and the discovery of biomarkers of drug response. Identifying genes whose expression influences drug sensitivity can help address both of these needs, elucidating the molecular pathways involved in drug efficacy and providing potential ways to predict new patients’ response to available therapies. In this study, we integrated cancer type, drug treatment, and survival data with RNA-seq gene expression data from The Cancer Genome Atlas to identify genes and gene sets whose expression levels in patient tumor biopsies are associated with drug-specific patient survival using a log-rank test comparing survival of patients with low vs. high expression for each gene. This analysis was successful in identifying thousands of such gene–drug relationships across 20 drugs in 14 cancers, several of which have been previously implicated in the respective drug’s efficacy. We then clustered significant genes based on their expression patterns across patients and defined gene sets that are more robust predictors of patient outcome, many of which were significantly enriched for target genes of one or more transcription factors, indicating several upstream regulatory mechanisms that may be involved in drug efficacy. We identified a large number of genes and gene sets that were potentially useful as transcript-level biomarkers for predicting drug-specific patient survival outcome. Our gene sets were robust predictors of drug-specific survival and our results included both novel and previously reported findings, suggesting that the drug-specific survival marker genes reported herein warrant further investigation for insights into drug mechanisms and for validation as biomarkers to aid cancer therapy decisions.
-
Abstract: Spatial transcriptomics data play a crucial role in cancer research, providing a nuanced understanding of the spatial organization of gene expression within tumor tissues. Unraveling the spatial dynamics of gene expression can unveil key insights into tumor heterogeneity and aid in identifying potential therapeutic targets. However, in many large-scale cancer studies, spatial transcriptomics data are limited, with bulk RNA-seq and corresponding Whole Slide Image (WSI) data being more common (e.g. TCGA project). To address this gap, there is a critical need to develop methodologies that can estimate gene expression at near-cell (spot) level resolution from existing WSI and bulk RNA-seq data. This approach is essential for reanalyzing expansive cohort studies and uncovering novel biomarkers that have been overlooked in the initial assessments. In this study, we present STGAT (Spatial Transcriptomics Graph Attention Network), a novel approach leveraging Graph Attention Networks (GAT) to discern spatial dependencies among spots. Trained on spatial transcriptomics data, STGAT is designed to estimate gene expression profiles at spot-level resolution and predict whether each spot represents tumor or non-tumor tissue, especially in patient samples where only WSI and bulk RNA-seq data are available. Comprehensive tests on two breast cancer spatial transcriptomics datasets demonstrated that STGAT outperformed existing methods in accurately predicting gene expression. Further analyses using the TCGA breast cancer dataset revealed that gene expression estimated from tumor-only spots (predicted by STGAT) provides more accurate molecular signatures for breast cancer sub-type and tumor stage prediction, and also leads to improved patient survival and disease-free analysis. Availability: Code is available at https://github.com/compbiolabucf/STGAT.
-
Multiple myeloma is the second most common hematological cancer. RUVBL1 and RUVBL2 form a subcomplex of many chromatin remodeling complexes implicated in cancer progression. As an inhibitor specific to the RUVBL1/2 complex, CB-6644 exhibits remarkable anti-tumor activity in xenograft models of Burkitt’s lymphoma and multiple myeloma (MM). In this work, we defined transcriptional signatures corresponding to CB-6644 treatment in MM cells and determined underlying epigenetic changes in terms of chromatin accessibility. CB-6644 upregulated biological processes related to interferon response and downregulated those linked to cell proliferation in MM cells. Transcriptional regulator inference identified E2Fs as regulators for downregulated genes and MED1 and MYC as regulators for upregulated genes. CB-6644-induced changes in chromatin accessibility occurred mostly in non-promoter regions. Footprinting analysis identified transcription factors implied in modulating chromatin accessibility in response to CB-6644 treatment, including ATF4/CEBP and IRF4. Lastly, integrative analysis of transcription responses to various chemical compounds of the molecular signature genes from public gene expression data identified CB-5083, a p97 inhibitor, as a synergistic candidate with CB-6644 in MM cells, but experimental validation refuted this hypothesis.
-
Background: Metastatic cancer remains one of the leading causes of cancer-related mortality worldwide. Yet, the prediction of survivability in this population remains limited by heterogeneous clinical presentations and high-dimensional molecular features. Advances in machine learning (ML) provide an opportunity to integrate diverse patient- and tumor-level factors into explainable predictive ML models. Leveraging large real-world datasets and modern ML techniques can enable improved risk stratification and precision oncology. Objective: This study aimed to develop and interpret ML models for predicting overall survival in patients with metastatic cancer using the Memorial Sloan Kettering-Metastatic (MSK-MET) dataset and to identify key prognostic biomarkers through explainable artificial intelligence techniques. Methods: We performed a retrospective analysis of the MSK-MET cohort, comprising 25,775 patients across 27 tumor types. After data cleaning and balancing, 20,338 patients were included. Overall survival was defined as deceased versus living at last follow-up. Five classifiers (extreme gradient boosting [XGBoost], logistic regression, random forest, decision tree, and naive Bayes) were trained using an 80/20 stratified split and optimized via grid search with 5-fold cross-validation. Model performance was assessed using accuracy, area under the curve (AUC), precision, recall, and F1-score. Model explainability was achieved using Shapley additive explanations (SHAP). Survival analyses included Kaplan-Meier estimates, Cox proportional hazards models, and an XGBoost-Cox model for time-to-event prediction. The positive predictive value and negative predictive value were calculated at the Youden index–optimal threshold. Results: XGBoost achieved the highest performance (accuracy=0.74; AUC=0.82), outperforming other classifiers. In survival analyses, the XGBoost-Cox model with a concordance index (C-index) of 0.70 exceeded the traditional Cox model (C-index=0.66). SHAP analysis and Cox models consistently identified metastatic site count, tumor mutational burden, fraction of genome altered, and the presence of distant liver and bone metastases as among the strongest prognostic factors, a pattern that held at both the pan-cancer level and recurrently across cancer-specific models. At the cancer-specific level, performance varied; prostate cancer achieved the highest predictive accuracy (AUC=0.88), while pancreatic cancer was notably more challenging (AUC=0.68). Kaplan-Meier analyses demonstrated marked survival separation between patients with and without metastases (80-month survival: approximately 0.80 vs 0.30). At the Youden-optimal threshold, positive predictive value and negative predictive value were approximately 70% and 80%, respectively, supporting clinical use for risk stratification. Conclusions: Explainable ML models, particularly XGBoost combined with SHAP, can strongly predict survivability in metastatic cancers while highlighting clinically meaningful features. These findings support the use of ML-based tools for patient counseling, treatment planning, and integration into precision oncology workflows. Future work should include external validation on independent cohorts, integration with electronic health records via Fast Healthcare Interoperability Resources–based dashboards, and prospective clinician-in-the-loop evaluation to assess real-world use.
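The abstract above reports positive and negative predictive values at the Youden index-optimal threshold. As a rough illustration of that calculation, the sketch below picks the score cutoff maximizing J = sensitivity + specificity - 1 and then computes PPV and NPV at that cutoff. The function names and the toy labels and scores are hypothetical, and the study's own procedure may differ in detail (for example, in how ties are broken).

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN when predicting positive for score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    return tp, fp, tn, fn


def youden_threshold(y_true, scores):
    """Return (threshold, J) maximizing Youden's J; the lowest threshold wins ties."""
    if 0 not in y_true or 1 not in y_true:
        raise ValueError("both classes must be present")
    best_t, best_j = None, -2.0
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(y_true, scores, t)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def ppv_npv(y_true, scores, threshold):
    """Positive and negative predictive value at a threshold (NaN if undefined)."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Toy data for illustration only (1 = deceased, 0 = living).
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
cutoff, j_stat = youden_threshold(labels, predicted)
print(cutoff, j_stat, ppv_npv(labels, predicted, cutoff))
```

Because PPV and NPV depend on outcome prevalence, values computed on a balanced training set would not necessarily carry over to a cohort with a different mortality rate.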

