Long noncoding RNAs (lncRNAs) play key roles in tumorigenesis. Misexpression of lncRNAs can alter the expression profiles of various target genes that are involved in cancer initiation and progression, so identifying the key lncRNAs for a cancer would aid the development of cancer therapies. Identifying key lncRNAs for a cancer usually requires lncRNA expression profiles from both normal and cancer samples, but such data are not available for all cancers. In the present study, a computational framework is developed to identify cancer-specific key lncRNAs using the lncRNA expression of cancer patients only. The framework consists of two state-of-the-art feature selection techniques, Recursive Feature Elimination (RFE) and Least Absolute Shrinkage and Selection Operator (LASSO), and five machine learning models: Naive Bayes, K-Nearest Neighbor, Random Forest, Support Vector Machine, and Deep Neural Network. For the experiments, lncRNA expression values for 8 cancers (BLCA, CESC, COAD, HNSC, KIRP, LGG, LIHC, and LUAD) from TCGA are used. The combined dataset consists of 3,656 patients with expression values of 12,309 lncRNAs. Key lncRNAs (the important features) are identified using the feature selection algorithms RFE and LASSO. The capability of these key lncRNAs to classify the 8 cancers is assessed by the performance of the five classification models. This study identified 37 key lncRNAs that can classify 8 different cancer types with an accuracy ranging from 94% to 97%. Finally, survival analysis supports that the discovered key lncRNAs are capable of differentiating between high-risk and low-risk patients.
Identification and Validation of a Novel Three Hub Long Noncoding RNAs With m6A Modification Signature in Low-Grade Gliomas
It has become evident that N6-methyladenosine (m6A)-modified long noncoding RNAs (m6A-lncRNAs) are involved in regulating tumorigenesis, invasion, and metastasis in various cancer types. In this study, we computationally identified a set of 13 hub m6A-lncRNAs using three state-of-the-art tools, WGCNA, iWGCNA, and oCEM, and interrogated their prognostic value in brain low-grade gliomas (LGG). Of the 13 hub m6A-lncRNAs, we further detected three as independent prognostic risk factors: HOXB-AS1, ELOA-AS1, and FLG-AS1. The m6ALncSig model was then built on these three hub m6A-lncRNAs. Patients with LGG were next divided into two groups, high- and low-risk, based on the median m6ALncSig score. As predicted, the high-risk group was more significantly associated with mortality. The prognostic signature of m6ALncSig was validated using internal and external cohorts. In summary, our work introduces a high-confidence prognostic prediction signature and paves the way for using the m6A-lncRNAs in the signature as new targets for the treatment of LGG.
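The median-based risk grouping described above can be illustrated with a minimal Python sketch. It assumes the signature score is a weighted sum of the three lncRNA expression values, which is a common construction but is not stated in the abstract. The coefficients, patient identifiers, and expression values below are hypothetical placeholders, not results from the study, and the choice to place scores equal to the median in the low-risk group is likewise an assumption.

```python
from statistics import median

# Hypothetical placeholder weights; the study's actual coefficients are not given here.
COEFFICIENTS = {"HOXB-AS1": 0.5, "ELOA-AS1": -0.25, "FLG-AS1": 0.25}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Weighted sum of expression values (assumed form of the signature score)."""
    return sum(weight * expression[gene] for gene, weight in coefficients.items())


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}


# Hypothetical expression values for four made-up patients.
patients = {
    "P1": {"HOXB-AS1": 2.0, "ELOA-AS1": 4.0, "FLG-AS1": 0.0},
    "P2": {"HOXB-AS1": 4.0, "ELOA-AS1": 0.0, "FLG-AS1": 4.0},
    "P3": {"HOXB-AS1": 2.0, "ELOA-AS1": 0.0, "FLG-AS1": 4.0},
    "P4": {"HOXB-AS1": 2.0, "ELOA-AS1": 0.0, "FLG-AS1": 0.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(groups)
```

In a real analysis the two groups would then be compared with a survival method such as a log-rank test; that step is omitted here.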
- PAR ID: 10327818
- Date Published:
- Journal Name: Frontiers in Molecular Biosciences
- Volume: 9
- ISSN: 2296-889X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Background: Long non-coding RNAs (lncRNAs) play a vital role in changing the expression profiles of various target genes that lead to cancer development. Thus, identifying prognostic lncRNAs related to different cancers might help in developing cancer therapy. Method: To discover the critical lncRNAs that can identify the origin of different cancers, we propose the use of the state-of-the-art deep learning algorithm concrete autoencoder (CAE) in an unsupervised setting, which efficiently identifies a subset of the most informative features. However, CAE does not identify reproducible features across runs because of its stochastic nature. To address this issue, we propose a multi-run CAE (mrCAE) that identifies a stable set of features. The assumption is that a feature appearing in multiple runs carries more meaningful information about the data under consideration. The genome-wide lncRNA expression profiles of 12 different types of cancers, with a total of 4768 samples available in The Cancer Genome Atlas (TCGA), were analyzed to discover the key lncRNAs. The lncRNAs identified by multiple runs of CAE were added to a final list of key lncRNAs that are capable of identifying 12 different cancers. Results: Our results showed that mrCAE performs better in feature selection than single-run CAE, standard autoencoder (AE), and other state-of-the-art feature selection techniques. This study revealed a set of 128 top-ranking lncRNAs that could identify the origin of 12 different cancers with an accuracy of 95%. Survival analysis showed that 76 of the 128 lncRNAs have the prognostic capability to differentiate high- and low-risk groups of patients with different cancers. Conclusion: The proposed mrCAE, which selects actual features, outperformed the AE even though the AE selects latent or pseudo-features. By selecting actual features instead of pseudo-features, mrCAE can be valuable for precision medicine. The identified prognostic lncRNAs can be further studied to develop therapies for different cancers.
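The multi-run idea above, keeping features that recur across stochastic runs, can be sketched in a few lines of Python. This is only an illustration of the aggregation step under stated assumptions: the concrete autoencoder itself is not implemented, the per-run feature lists are made-up stand-ins for its output, and the `min_runs` threshold is a hypothetical parameter, since the abstract does not say how many runs a feature must appear in.

```python
from collections import Counter


def stable_features(runs, min_runs=2):
    """Return features selected in at least `min_runs` runs, most frequent first.

    `runs` is a list of per-run feature lists; each feature is counted
    at most once per run. Ties in frequency are broken alphabetically.
    """
    counts = Counter(feature for selected in runs for feature in set(selected))
    kept = [feature for feature, n in counts.items() if n >= min_runs]
    return sorted(kept, key=lambda feature: (-counts[feature], feature))


# Hypothetical selections from three runs of a stochastic feature selector.
runs = [
    ["lnc_A", "lnc_B", "lnc_C"],
    ["lnc_A", "lnc_C", "lnc_D"],
    ["lnc_A", "lnc_B", "lnc_E"],
]

stable = stable_features(runs, min_runs=2)
print(stable)
```

Raising `min_runs` trades a larger candidate list for one that is more reproducible across runs.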
-
Previously, we have shown that apoplastic wash fluid (AWF) purified from Arabidopsis leaves contains small RNAs (sRNAs). To investigate whether these sRNAs are encapsulated inside extracellular vesicles (EVs), we treated EVs isolated from Arabidopsis leaves with the protease trypsin and RNase A, which should degrade RNAs located outside EVs but not those located inside. These analyses revealed that apoplastic RNAs are mostly located outside and are associated with proteins. Further analyses of these extracellular RNAs (exRNAs) revealed that they include both sRNAs and long noncoding RNAs (lncRNAs), including circular RNAs (circRNAs). We also found that exRNAs are highly enriched in the posttranscriptional modification N6-methyladenine (m6A). Consistent with this, we identified a putative m6A-binding protein in AWF, GLYCINE-RICH RNA-BINDING PROTEIN 7 (GRP7), as well as the sRNA-binding protein ARGONAUTE2 (AGO2). These two proteins coimmunoprecipitated with lncRNAs, including circRNAs. Mutation of GRP7 or AGO2 caused changes in both the sRNA and lncRNA content of AWF, suggesting that these proteins contribute to the secretion and/or stabilization of exRNAs. We propose that exRNAs located outside of EVs mediate host-induced gene silencing, rather than RNA located inside EVs.
-
Siegel (Ed.) Intestinal microbiota confers susceptibility to diet-induced obesity, yet many probiotic species that synthesize tryptophan (trp) attenuate this effect; the underlying mechanisms are unclear. We monocolonized germ-free (GF) mice with a widely consumed probiotic, Lacticaseibacillus rhamnosus GG (LGG), under trp-free or trp-sufficient dietary conditions. We obtained untargeted metabolomics from the mouse feces and serum using liquid chromatography-mass spectrometry and obtained intestinal transcriptomic profiles via bulk RNA sequencing. When comparing LGG-monocolonized mice with GF mice, we found a synergy between LGG and dietary trp in markedly promoting the transcriptome of fatty acid metabolism and β-oxidation. Upregulation was specific and was not observed in transcriptomes of trp-fed conventional mice or mice monocolonized with Ruminococcus gnavus. Metabolomics showed that fecal and serum metabolites were also modified by the LGG-host-trp interaction. We developed an R script-based MEtabolome-TRanscriptome Correlation Analysis (METRCA) algorithm and uncovered LGG- and trp-dependent metabolites that were positively or negatively correlated with fatty acid metabolism and β-oxidation gene networks. This high-throughput metabolome-transcriptome correlation strategy can be used in similar investigations to reveal potential interactions between specific metabolites and functional or disease-related transcriptomic networks.
-
There is currently no gene expression assay that can assess whether premalignant lesions will develop into invasive breast cancer. This study sought to identify biomarkers for selecting patients with a high potential for developing invasive carcinoma in the breast with normal histology, benign lesions, or premalignant lesions. A set of 26-gene mRNA expression profiles was used to identify invasive ductal carcinomas from histologically normal tissue and benign lesions and to select those with a higher potential for future cancer development (ADHC) in the breast associated with atypical ductal hyperplasia (ADH). The expression-defined model achieved an overall accuracy of 94.05% (AUC = 0.96) in classifying invasive ductal carcinomas from histologically normal tissue and benign lesions (n = 185). This gene signature classified cancer development in ADH tissues with an overall accuracy of 100% (n = 8). The mRNA expression patterns of these 26 genes were validated using RT-PCR analyses of independent tissue samples (n = 77) and blood samples (n = 48). The protein expression of PBX2 and RAD52, assessed with immunohistochemistry, was prognostic of breast cancer survival outcomes. This signature provided significant prognostic stratification in The Cancer Genome Atlas breast cancer patients (n = 1100), as well as in basal-like and luminal A subtypes, and was associated with distinct immune infiltration and activities. The mRNA and protein expression of the 26 genes was associated with sensitivity or resistance to 18 NCCN-recommended drugs for treating breast cancer. Eleven genes had significant proliferative potential in CRISPR-Cas9/RNAi screening. Based on this gene expression signature, the VEGFR inhibitor ZM-306416 was discovered as a new drug for treating breast cancer.

