Abstract Circular RNAs (circRNAs) are covalently closed single‐stranded RNAs, generated through a back‐splicing process that links a downstream 5′ site to an upstream 3′ end. The only distinction in the sequence between circRNA and their linear cognate RNA is the back splice junction. Their low abundance and sequence similarity with their linear origin RNA have made the discovery and identification of circRNA challenging. We have identified almost 6000 novel circRNAs from Lotus japonicus leaf tissue using different enrichment, amplification, and sequencing methods as well as alternative bioinformatics pipelines. The different methodologies identified different pools of circRNA with little overlap. We validated circRNA identified by the different methods using reverse transcription polymerase chain reaction and characterized sequence variations using nanopore sequencing. We compared validated circRNA identified in L. japonicus to other plant species and showed conservation of high‐confidence circRNA‐expressing genes. This is the first identification of L. japonicus circRNA and provides a resource for further characterization of their function in gene regulation. CircRNAs identified in this study originated from genes involved in all biological functions of eukaryotic cells. The comparison of methodologies and technologies to sequence, identify, analyze, and validate circRNA from plant tissues will enable further research to characterize the function and biogenesis of circRNA in L. japonicus.
                    
                            
                            Accurate assembly of circular RNAs with TERRACE
                        
                    
    
Circular RNA (circRNA) is a class of RNA molecules that form a closed loop with their 5′ and 3′ ends covalently bonded. CircRNAs are known to be more stable than linear RNAs, have distinct properties and functions, and are promising biomarkers. Existing methods for assembling circRNAs rely heavily on annotated transcriptomes and therefore exhibit unsatisfactory accuracy when a high-quality transcriptome is unavailable. We present TERRACE, a new algorithm for full-length assembly of circRNAs from paired-end total RNA-seq data. TERRACE uses the splice graph as the underlying data structure that organizes the splicing and coverage information. We transform the problem of assembling circRNAs into finding paths that “bridge” the three fragments in the splice graph induced by back-spliced reads. We adopt a definition for optimal bridging paths and a dynamic programming algorithm to calculate such optimal paths. TERRACE features an efficient algorithm to detect back-spliced reads missed by RNA-seq aligners, contributing to its much-improved sensitivity. It also incorporates a new machine-learning approach trained to assign a confidence score to each assembled circRNA, which is shown to be superior to using abundance for scoring. On both simulations and biological data sets, TERRACE consistently outperforms existing methods by a large margin in sensitivity while achieving better or comparable precision. In particular, when the annotations are not provided, TERRACE assembles 123%–413% more correct circRNAs than state-of-the-art methods. TERRACE presents a significant advance in assembling full-length circRNAs from RNA-seq data, and we expect it to be widely used in future research on circRNAs.
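The abstract does not spell out how an "optimal bridging path" is defined. As a rough illustration only, the sketch below assumes one plausible definition: among all paths between two fragment endpoints in the splice graph, prefer the one whose weakest edge has the highest coverage (a max-bottleneck path). The function name `best_bridging_path`, the integer vertex indexing, and the edge dictionary layout are hypothetical and are not taken from TERRACE itself. Vertices are assumed to be numbered in genomic order, so every edge goes from a lower to a higher index and a single left-to-right pass is a valid dynamic program.

```python
from typing import Dict, List, Optional, Tuple


def best_bridging_path(
    edges: Dict[Tuple[int, int], float], source: int, target: int
) -> Optional[Tuple[List[int], float]]:
    """Return (path, bottleneck) for the max-bottleneck path, or None.

    ASSUMPTION: vertices are integers in genomic order, so u < v for
    every edge (u, v); 'edges' maps (u, v) to read coverage.
    """
    # Group incoming edges by their head vertex.
    preds: Dict[int, List[Tuple[int, float]]] = {}
    for (u, v), cov in edges.items():
        if not u < v:
            raise ValueError("edges must go from lower to higher index")
        preds.setdefault(v, []).append((u, cov))

    # best[v] = highest achievable bottleneck coverage from source to v.
    best: Dict[int, float] = {source: float("inf")}
    back: Dict[int, int] = {}

    # Index order is a topological order, so predecessors are final.
    for v in range(source + 1, target + 1):
        for u, cov in preds.get(v, []):
            if u not in best:
                continue  # u is not reachable from source
            score = min(best[u], cov)
            if score > best.get(v, float("-inf")):
                best[v] = score
                back[v] = u

    if target not in best:
        return None

    # Walk the back-pointers to recover the path.
    path = [target]
    while path[-1] != source:
        path.append(back[path[-1]])
    path.reverse()
    return path, best[target]


# Toy splice graph: two routes from vertex 0 to vertex 3.
toy_edges = {(0, 1): 10, (1, 3): 2, (0, 2): 5, (2, 3): 6}
result = best_bridging_path(toy_edges, 0, 3)
```

In the toy graph, the route through vertex 1 has a weakest edge of coverage 2, while the route through vertex 2 has a weakest edge of coverage 5, so the second route is returned. A real assembler would also have to handle multiple equally good paths and paired-end constraints, which this sketch ignores.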
        
    
    
- PAR ID: 10552874
- Publisher / Repository: Cold Spring Harbor Laboratory Press
- Date Published:
- Journal Name: Genome Research
- Volume: 34
- Issue: 9
- ISSN: 1088-9051
- Page Range / eLocation ID: 1365 to 1370
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            DifferentialRegulation: a Bayesian hierarchical approach to identify differentially regulated genes. Summary: Although transcriptomics data are typically used to analyze mature spliced mRNA, recent attention has focused on jointly investigating spliced and unspliced (or precursor) mRNA, which can be used to study gene regulation and changes in gene expression production. Nonetheless, most methods for spliced/unspliced inference (such as RNA velocity tools) focus on individual samples and rarely allow comparisons between groups of samples (e.g., healthy vs. diseased). Furthermore, this kind of inference is challenging because spliced and unspliced mRNA abundance is characterized by a high degree of quantification uncertainty, due to the prevalence of multi-mapping reads, i.e., reads compatible with multiple transcripts (or genes) and/or with both their spliced and unspliced versions. Here, we present DifferentialRegulation, a Bayesian hierarchical method to discover changes between experimental conditions with respect to the relative abundance of unspliced mRNA (over the total mRNA). We model the quantification uncertainty via a latent variable approach, where reads are allocated to their gene/transcript of origin and to the respective splice version. We designed several benchmarks where our approach shows good performance, in terms of sensitivity and error control, vs. state-of-the-art competitors. Importantly, our tool is flexible and works with both bulk and single-cell RNA-sequencing data. DifferentialRegulation is distributed as a Bioconductor R package.
- 
            High-throughput short-read RNA-seq protocols often produce paired-end reads, with the middle portion of the fragments being unsequenced. We explore whether the full-length fragments can be computationally reconstructed from the two sequenced ends in the absence of a reference genome, a problem we refer to here as de novo bridging. Solving this problem provides longer, more informative RNA-seq reads and benefits downstream RNA-seq analysis such as transcript assembly, expression quantification, and differential splicing analysis. However, de novo bridging is a challenging and complicated task owing to alternative splicing, transcript noise, and sequencing errors. It remains unclear whether the data provide sufficient information for accurate bridging, let alone whether efficient algorithms exist that determine the true bridges. Methods have been proposed to bridge paired-end reads in the presence of a reference genome (called reference-based bridging), but these algorithms are far from scaling to de novo bridging, as the underlying compacted de Bruijn graph (cdBG) used in the latter task often contains millions of vertices and edges. We designed a new truncated Dijkstra’s algorithm for this problem and proposed a novel algorithm that reuses the shortest path tree to avoid running the truncated Dijkstra’s algorithm from scratch for all vertices, for further speedup. These innovative techniques result in scalable algorithms that can bridge all paired-end reads in a cdBG with millions of vertices. Our experiments showed that paired-end RNA-seq reads can be accurately bridged to a large extent. The resulting tool is freely available at https://github.com/Shao-Group/rnabridge-denovo.
- 
            Abstract Circular RNAs (circRNAs) are covalently closed RNAs that are present in all eukaryotes tested. Recent RNA sequencing (RNA-seq) analyses indicate that although generally less abundant than messenger RNAs (mRNAs), over 1.8 million circRNA isoforms exist in humans, much more than the number of currently known mRNA isoforms. Most circRNAs are generated through backsplicing that depends on pre-mRNA structures, which are influenced by intronic elements, for example, primate-specific Alu elements, leading to species-specific circRNAs. CircRNAs are mostly cytosolic and stable, and some were shown to influence cells by sequestering miRNAs and RNA-binding proteins. We review the increasing evidence that circRNAs are translated into proteins using several cap-independent translational mechanisms, which include internal ribosomal entry sites, N6-methyladenosine RNA modification, adenosine to inosine RNA editing, and interaction with the eIF4A3 component of the exon junction complex. CircRNAs are translated under conditions that favor cap-independent translation, notably in cancer, and generate proteins that are shorter than mRNA-encoded proteins, which can acquire new functions relevant in diseases.
- 
            CircRNAs are a category of regulatory RNAs that have garnered significant attention in the field of regulatory RNA research due to their structural stability and tissue-specific expression. Their circular configuration, formed via back-splicing, results in a covalently closed structure that exhibits greater resistance to exonucleases compared to linear RNAs. The distinctive regulation of circRNAs is closely associated with several physiological processes, as well as the advancement of pathophysiological processes in several human diseases. Despite a good understanding of the biogenesis of circular RNA, details of their biological roles are still being explored. With the steady rise in the number of investigations being carried out regarding the involvement of circRNAs in various regulatory pathways, understanding the biological and clinical relevance of circRNA-mediated regulation has become challenging. Given the vast landscape of circRNA research in the development of the heart and vasculature, we evaluated cardiovascular system research as a model to critically review the state-of-the-art understanding of the biologically relevant functions of circRNAs. We conclude the review with a discussion of the limitations of current functional studies and provide potential solutions by which these limitations can be addressed to identify and validate the meaningful and impactful functions of circRNAs in different physiological processes and diseases.
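The de novo bridging abstract in the list above mentions a truncated Dijkstra's algorithm but gives no details of how the truncation works. The sketch below is a hedged illustration under one assumption: the search is cut off once a tentative distance exceeds a bound, such as a maximum plausible fragment length. The function name `truncated_dijkstra`, the adjacency-list layout, and the `max_dist` parameter are hypothetical and do not describe the rnabridge-denovo implementation.

```python
import heapq
from typing import Dict, Hashable, List, Tuple

Graph = Dict[Hashable, List[Tuple[Hashable, int]]]


def truncated_dijkstra(
    graph: Graph, source: Hashable, max_dist: int
) -> Dict[Hashable, int]:
    """Shortest distances from source, ignoring anything beyond max_dist.

    ASSUMPTION: 'graph' maps a vertex to (neighbor, edge_length) pairs
    with non-negative lengths; vertices must be mutually comparable
    (e.g. all strings) so heap ties can be broken.
    """
    dist: Dict[Hashable, int] = {source: 0}
    heap: List[Tuple[int, Hashable]] = [(0, source)]

    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, length in graph.get(u, []):
            nd = d + length
            if nd > max_dist:
                continue  # truncation: never explore past the bound
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy graph standing in for a tiny compacted de Bruijn graph.
toy_graph: Graph = {
    "a": [("b", 3), ("c", 10)],
    "b": [("c", 4), ("d", 20)],
    "c": [("d", 2)],
}
within_9 = truncated_dijkstra(toy_graph, "a", 9)
within_8 = truncated_dijkstra(toy_graph, "a", 8)
```

With a bound of 9, vertex "d" is reached through "b" and "c" at distance 9; with a bound of 8 it is pruned. Because work is limited to the neighborhood within the bound, the cost per source depends on the local graph size rather than on the millions of vertices in the whole graph. The abstract's further idea of reusing the shortest path tree across sources is not shown here.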