Background: Long non-coding ribonucleic acids (lncRNAs) can be localized to different cellular compartments, such as the nuclear and the cytoplasmic regions. Their biological functions are influenced by the region of the cell where they are located. Of the vast number of known lncRNAs, only a relatively small proportion have annotations regarding their subcellular localization. It would be helpful if those few annotated lncRNAs could be leveraged to develop predictive models for the localization of other lncRNAs. Methods: Conventional computational methods extract q-mer profiles from lncRNA sequences and use these profiles to train machine learning models such as support vector machines and logistic regression. These methods focus on exact q-mers. Given possible sequence mutations and other uncertainties in genomic sequences, and their role in biological function, accounting for this variability might improve our ability to model lncRNAs and their localization. Thus, we build on inexact q-mers and use machine learning/deep learning techniques to study three specific problems in lncRNA subcellular localization, namely, prediction of lncRNA localization using inexact q-mers, the question of whether lncRNA localization is cell-type-specific, and the notion of switching (lncRNA) genes. Results: We performed our analysis using data on lncRNA localization across 15 cell lines. Our results showed that using inexact q-mers (with q = 6) can improve lncRNA localization prediction performance compared to using exact q-mers. Further, we showed that lncRNA localization, in general, is not cell-line-specific. We also identified a category of lncRNAs that switch cellular compartments between different cell lines (we call them switching lncRNAs). These switching lncRNAs complicate the problem of predicting lncRNA localization using machine learning models, showing that lncRNA localization prediction is still a major challenge.
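The difference between exact and inexact q-mer profiles can be illustrated with a minimal sketch. The abstract does not define how inexact q-mers are constructed, so this example assumes one common interpretation: each observed q-mer also contributes to every q-mer within Hamming distance 1. The function names (`exact_qmer_profile`, `one_mismatch_neighbors`, `inexact_qmer_profile`, `to_vector`) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def exact_qmer_profile(seq, q):
    """Count every length-q substring of seq (the exact q-mer profile)."""
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA letters alike
    windows = (seq[i:i + q] for i in range(len(seq) - q + 1))
    # Skip windows containing ambiguous symbols such as N.
    return Counter(w for w in windows if set(w) <= set(ALPHABET))


def one_mismatch_neighbors(qmer):
    """Return qmer plus all q-mers that differ from it at exactly one position."""
    out = {qmer}
    for i, current in enumerate(qmer):
        for base in ALPHABET:
            if base != current:
                out.add(qmer[:i] + base + qmer[i + 1:])
    return out


def inexact_qmer_profile(seq, q):
    """ASSUMPTION: 'inexact' means each q-mer also counts toward its
    Hamming-distance-1 neighbors; the paper may define this differently."""
    profile = Counter()
    for qmer, count in exact_qmer_profile(seq, q).items():
        for neighbor in one_mismatch_neighbors(qmer):
            profile[neighbor] += count
    return profile


def to_vector(profile, q):
    """Turn a profile into a fixed-length, frequency-normalized feature vector."""
    keys = ["".join(p) for p in product(ALPHABET, repeat=q)]
    total = sum(profile.values()) or 1
    return [profile[k] / total for k in keys]
```

With q = 6, as in the abstract, `to_vector` yields a 4096-dimensional feature vector per sequence, which could then be passed to a classifier such as logistic regression or a support vector machine.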
                            Challenges in LncRNA Biology: Views and Opinions
                        
                    
    
            This is a mini-review capturing the views and opinions of selected participants at the 2021 IEEE BIBM 3rd Annual LncRNA Workshop, held in Dubai, UAE. The views and opinions are expressed on five broad themes related to problems in lncRNA, namely, challenges in the computational analysis of lncRNAs, lncRNAs and cancer, lncRNAs in sports, lncRNAs and COVID-19, and lncRNAs in human brain activity. 
- PAR ID: 10538686
- Editor(s): Adjeroh, Donald A; Zhou, Xiaobo; Derevyanchuk, Ekaterina G; Shkurat, Tatiana P; Martinez, Ivan; Lipovich, Leonard
- Publisher / Repository: MDPI
- Date Published:
- Journal Name: Non-Coding RNA
- Volume: 10
- Issue: 4
- ISSN: 2311-553X
- Page Range / eLocation ID: 43
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Long noncoding RNA (lncRNA) plays key roles in tumorigenesis. Misexpression of lncRNAs can change the expression profiles of various target genes that are involved in cancer initiation and progression. Identifying the key lncRNAs for a given cancer would therefore help in developing therapies for it. Identifying key lncRNAs usually requires lncRNA expression profiles for both normal and cancer samples. However, such data are not available for all cancers. In the present study, a computational framework is developed to identify cancer-specific key lncRNAs using the lncRNA expression of cancer patients only. The framework consists of two state-of-the-art feature selection techniques, Recursive Feature Elimination (RFE) and Least Absolute Shrinkage and Selection Operator (LASSO), and five machine learning models: Naive Bayes, K-Nearest Neighbor, Random Forest, Support Vector Machine, and Deep Neural Network. For the experiments, expression values of lncRNAs for 8 cancers (BLCA, CESC, COAD, HNSC, KIRP, LGG, LIHC, and LUAD) from TCGA are used. The combined dataset consists of 3,656 patients with expression values of 12,309 lncRNAs. Important features, or key lncRNAs, are identified using the feature selection algorithms RFE and LASSO. The capability of these key lncRNAs to classify the 8 different cancers is assessed through the performance of the five classification models. This study identified 37 key lncRNAs that can classify 8 different cancer types with an accuracy ranging from 94% to 97%. Finally, survival analysis supports that the discovered key lncRNAs are capable of differentiating between high-risk and low-risk patients.
- Long noncoding RNAs (lncRNAs) are a large and diverse class of genes in eukaryotic genomes that contribute to a variety of regulatory processes. Functionally characterized lncRNAs play critical roles in plants, ranging from regulating flowering to controlling lateral root formation. However, findings from the past decade have revealed that thousands of lncRNAs are present in plant transcriptomes, and characterization has lagged far behind identification. In this setting, distinguishing function from noise is challenging. However, the plant community has been at the forefront of discovery in lncRNA biology, providing many functional and mechanistic insights that have increased our understanding of this gene class. In this review, we examine the key discoveries and insights made in plant lncRNA biology over the past two and a half decades. We describe how discoveries made in the pregenomics era have informed efforts to identify and functionally characterize lncRNAs in the subsequent decades. We provide an overview of the functional archetypes into which characterized plant lncRNAs fit and speculate on new avenues of research that may uncover yet more archetypes. Finally, this review discusses the challenges facing the field and some exciting new molecular and computational approaches that may help inform lncRNA comparative and functional analyses.
- Long noncoding RNA (lncRNA) plays important roles in sexual development in eukaryotes. In filamentous fungi, however, little is known about the expression and roles of lncRNAs during fruiting body formation. By profiling developmental transcriptomes during the life cycle of the plant-pathogenic fungus Fusarium graminearum, we identified 547 lncRNAs whose expression was highly dynamic, with about 40% peaking at the meiotic stage. Many lncRNAs were found to be antisense to mRNAs, forming 300 sense-antisense pairs. Although small RNAs were produced from these overlapping loci, antisense lncRNAs appeared not to be involved in gene silencing pathways. Genome-wide analysis of small RNA clusters identified many silenced loci at the meiotic stage. However, we found transcriptionally active small RNA clusters, many of which were associated with lncRNAs. Also, we observed that many antisense lncRNAs and their respective sense transcripts were induced in parallel as the fruiting bodies matured. The nonsense-mediated decay (NMD) pathway is known to determine the fates of lncRNAs as well as mRNAs. Thus, we analyzed mutants defective in NMD and identified a subset of lncRNAs that were induced during sexual development but suppressed by NMD during vegetative growth. These results highlight the developmental stage-specific nature and functional potential of lncRNA expression in shaping the fungal fruiting bodies and provide fundamental resources for studying sexual stage-induced lncRNAs.
IMPORTANCE Fusarium graminearum is the causal agent of the head blight on our major staple crops, wheat and corn. The fruiting body formation on the host plants is indispensable for the disease cycle and epidemics. Long noncoding RNA (lncRNA) molecules are emerging as key regulatory components for sexual development in animals and plants. To date, however, there is a paucity of information on the roles of lncRNAs in fungal fruiting body formation. Here we characterized hundreds of lncRNAs that exhibited developmental stage-specific expression patterns during fruiting body formation. Also, we discovered that many lncRNAs were induced in parallel with their overlapping transcripts on the opposite DNA strand during sexual development. Finally, we found a subset of lncRNAs that were regulated by an RNA surveillance system during vegetative growth. This research provides fundamental genomic resources that will spur further investigations on lncRNAs that may play important roles in shaping fungal fruiting bodies.
- Simon, Anne E. (Ed.) Long noncoding RNAs (lncRNAs) of virus origin accumulate in cells infected by many positive-strand (+) RNA viruses to bolster viral infectivity. Their biogenesis mostly utilizes exoribonucleases of host cells that degrade viral genomic or subgenomic RNAs in the 5′-to-3′ direction until being stalled by well-defined RNA structures. Here, we report a viral lncRNA that is produced by a novel replication-dependent mechanism. This lncRNA corresponds to the last 283 nucleotides of the turnip crinkle virus (TCV) genome and hence is designated tiny TCV subgenomic RNA (ttsgR). ttsgR accumulated to high levels in TCV-infected Nicotiana benthamiana cells when the TCV-encoded RNA-dependent RNA polymerase (RdRp), also known as p88, was overexpressed. Both (+) and (−) strand forms of ttsgR were produced in a manner dependent on the RdRp functionality. Strikingly, templates as short as ttsgR itself were sufficient to program ttsgR amplification, as long as the TCV-encoded replication proteins p28 and p88 were provided in trans. Consistent with its replicational origin, ttsgR accumulation required a 5′ terminal carmovirus consensus sequence (CCS), a sequence motif shared by genomic and subgenomic RNAs of many viruses phylogenetically related to TCV. More importantly, introducing a new CCS motif elsewhere in the TCV genome was alone sufficient to cause the emergence of another lncRNA. Finally, abolishing ttsgR by mutating its 5′ CCS gave rise to a TCV mutant that failed to compete with wild-type TCV in Arabidopsis. Collectively, our results unveil a replication-dependent mechanism for the biogenesis of viral lncRNAs, thus suggesting that multiple mechanisms, individually or in combination, may be responsible for viral lncRNA production.
IMPORTANCE Many positive-strand (+) RNA viruses produce long noncoding RNAs (lncRNAs) during the process of cellular infections and mobilize these lncRNAs to counteract antiviral defenses, as well as coordinate the translation of viral proteins. Most viral lncRNAs arise from 5′-to-3′ degradation of longer viral RNAs being stalled at stable secondary structures. Here, we report a viral lncRNA that is produced by the replication machinery of turnip crinkle virus (TCV). This lncRNA, designated ttsgR, shares the terminal characteristics with TCV genomic and subgenomic RNAs and overaccumulates in the presence of moderately overexpressed TCV RNA-dependent RNA polymerase (RdRp). Furthermore, templates that are of similar sizes as ttsgR are readily replicated by TCV replication proteins (p28 and RdRp) provided from nonviral sources. In summary, this study establishes an approach for uncovering low abundance viral lncRNAs, and characterizes a replicating TCV lncRNA. Similar investigations on human-pathogenic (+) RNA viruses could yield novel therapeutic targets.