Splicing is highly regulated and is modulated by numerous factors. Quantitative predictions for how a mutation will affect precursor mRNA (pre-mRNA) structure and downstream function are particularly challenging. Here, we use a novel chemical probing strategy to visualize endogenous precursor and mature MAPT mRNA structures in cells. We used these data to estimate Boltzmann suboptimal structural ensembles, which were then analyzed to predict consequences of mutations on pre-mRNA structure. Further analysis of recent cryo-EM structures of the spliceosome at different stages of the splicing cycle revealed that the footprint of the B^act complex with pre-mRNA best predicted alternative splicing outcomes for exon 10 inclusion of the alternatively spliced MAPT gene, achieving 74% accuracy. We further developed a β-regression weighting framework that incorporates splice site strength, RNA structure, and exonic/intronic splicing regulatory elements capable of predicting, with 90% accuracy, the effects of 47 known and 6 newly discovered mutations on inclusion of exon 10 of MAPT. This combined experimental and computational framework represents a path forward for accurate prediction of splicing-related disease-causing variants.
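As a rough illustration of what a Boltzmann-weighted ensemble of suboptimal structures means, the sketch below converts free energies into ensemble probabilities using p_i = exp(-E_i/RT)/Z and then asks how often one nucleotide is unpaired across the ensemble. The structures, energies, and function names are hypothetical placeholders; the abstract does not describe the authors' actual pipeline, so this should be read as a minimal sketch under those assumptions.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def boltzmann_weights(energies_kcal, temperature_k=310.15):
    """Return normalized Boltzmann probabilities for a list of free energies."""
    rt = R_KCAL * temperature_k
    e_min = min(energies_kcal)  # shift by the minimum for numerical stability
    unnormalized = [math.exp(-(e - e_min) / rt) for e in energies_kcal]
    z = sum(unnormalized)  # partition function over the sampled structures
    return [u / z for u in unnormalized]


def unpaired_probability(structures, energies_kcal, position):
    """Ensemble probability that `position` is unpaired ('.' in dot-bracket)."""
    weights = boltzmann_weights(energies_kcal)
    return sum(w for s, w in zip(structures, weights) if s[position] == ".")


# Hypothetical dot-bracket structures and free energies (kcal/mol),
# invented purely for illustration.
structures = ["((....))", "(......)", "........"]
energies = [-3.0, -2.0, 0.0]

weights = boltzmann_weights(energies)
p_unpaired = unpaired_probability(structures, energies, position=1)
```

Here the lowest-energy structure receives the largest weight, and `p_unpaired` sums the weights of the structures in which position 1 is a dot.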
This content will become publicly available on March 20, 2026
Generative modeling for RNA splicing predictions and design
Abstract: Alternative splicing (AS) of pre-mRNA plays a crucial role in tissue-specific gene regulation, with disease implications due to splicing defects. Predicting and manipulating AS can therefore uncover new regulatory mechanisms and aid in therapeutics design. We introduce TrASPr+BOS, a generative AI model with Bayesian Optimization for predicting and designing RNA for tissue-specific splicing outcomes. TrASPr is a multi-transformer model that can handle different types of AS events and generalize to unseen cellular conditions. It then serves as an oracle, generating labeled data to train a Bayesian Optimization for Splicing (BOS) algorithm to design RNA for condition-specific splicing outcomes. We show TrASPr+BOS outperforms existing methods, enhancing tissue-specific AUPRC by up to 2.4-fold and capturing tissue-specific regulatory elements. We validate hundreds of predicted novel tissue-specific splicing variations and confirm new regulatory elements using dCas13. We envision TrASPr+BOS as a light yet accurate method researchers can probe or adopt for specific tasks.
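For readers unfamiliar with the AUPRC metric mentioned in the abstract, the following is a minimal sketch of one common estimator, step-wise average precision, together with a fold-change comparison between two scorers. The abstract does not state how AUPRC was computed, so the estimator, the function names, and the toy labels and scores are all assumptions made for illustration; they are not results from the paper.

```python
def average_precision(labels, scores):
    """Step-wise average precision, a common estimator of AUPRC.

    labels: 0/1 ground truth; scores: higher means more likely positive.
    Tied scores keep their input order (Python's sort is stable).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            true_pos += 1
            total += true_pos / rank  # precision at each recovered positive
    return total / n_pos


def fold_change(metric_new, metric_baseline):
    """Ratio of two metric values, e.g. AUPRC of a new model over a baseline."""
    return metric_new / metric_baseline


# Toy example with invented labels and scores.
labels = [1, 0, 1, 0]
ap_model = average_precision(labels, [0.9, 0.8, 0.7, 0.1])
ap_perfect = average_precision(labels, [0.9, 0.2, 0.8, 0.1])
```

In this toy case the first scorer ranks one negative above a positive, so its average precision is (1/1 + 2/3) / 2, while the second scorer ranks both positives first and reaches 1.0.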
- Award ID(s): 2400135
- PAR ID: 10615175
- Publisher / Repository: elifesciences.org
- Date Published:
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Gene regulatory elements are central drivers of phenotypic variation and thus of critical importance towards understanding the genetics of complex traits. The Functional Annotation of Animal Genomes consortium was formed to collaboratively annotate the functional elements in animal genomes, starting with domesticated animals. Here we present an expansive collection of datasets from eight diverse tissues in three important agricultural species: chicken (Gallus gallus), pig (Sus scrofa), and cattle (Bos taurus). Comparative analysis of these datasets and those from the human and mouse Encyclopedia of DNA Elements projects reveals that a core set of regulatory elements is functionally conserved independent of divergence between species, and that tissue-specific transcription factor occupancy at regulatory elements and their predicted target genes are also conserved. These datasets represent a unique opportunity for the emerging field of comparative epigenomics, as well as the agricultural research community, including species that are globally important food resources.
- CircRNAs are a category of regulatory RNAs that have garnered significant attention in the field of regulatory RNA research due to their structural stability and tissue-specific expression. Their circular configuration, formed via back-splicing, results in a covalently closed structure that exhibits greater resistance to exonucleases compared to linear RNAs. The distinctive regulation of circRNAs is closely associated with several physiological processes, as well as the advancement of pathophysiological processes in several human diseases. Despite a good understanding of the biogenesis of circular RNA, details of their biological roles are still being explored. With the steady rise in the number of investigations being carried out regarding the involvement of circRNAs in various regulatory pathways, understanding the biological and clinical relevance of circRNA-mediated regulation has become challenging. Given the vast landscape of circRNA research in the development of the heart and vasculature, we evaluated cardiovascular system research as a model to critically review the state-of-the-art understanding of the biologically relevant functions of circRNAs. We conclude the review with a discussion of the limitations of current functional studies and provide potential solutions by which these limitations can be addressed to identify and validate the meaningful and impactful functions of circRNAs in different physiological processes and diseases.
- RBFOX2 regulated EYA3 isoforms partner with SIX4 or ZBTB1 to control transcription during myogenesis
  Alternative splicing is a prevalent gene-regulatory mechanism, with over 95% of multi-exon human genes estimated to be alternatively spliced. Here, we describe a tissue-specific, developmentally regulated, highly conserved, and disease-associated alternative splicing event in exon 7 of the eyes absent homolog 3 (Eya3) gene. We discovered that EYA3 expression is vital to the proliferation and differentiation of myoblasts. Genome-wide transcriptomic analysis and mass spectrometry-based proteomic studies identified SIX homeobox 4 (SIX4) and zinc finger and BTB-domain containing 1 (ZBTB1) as major transcription factors that interact with EYA3 to dictate gene expression. EYA3 isoforms differentially regulate transcription, indicating that splicing aids in temporal control of gene expression during muscle cell differentiation. Finally, we identified RNA-binding fox-1 homolog 2 (RBFOX2) as the main regulator of EYA3 splicing. Together, our findings illustrate the interplay between alternative splicing and transcription during myogenesis.
- Massively parallel reporter assays (MPRAs) are powerful tools for quantifying the impacts of sequence variation on gene expression. Reading out molecular phenotypes with sequencing enables interrogating the impact of sequence variation beyond genome scale. Machine learning models integrate and codify information learned from MPRAs and enable generalization by predicting sequences outside the training data set. Models can provide a quantitative understanding of cis-regulatory codes controlling gene expression, enable variant stratification, and guide the design of synthetic regulatory elements for applications from synthetic biology to mRNA and gene therapy. This review focuses on cis-regulatory MPRAs, particularly those that interrogate cotranscriptional and post-transcriptional processes: alternative splicing, cleavage and polyadenylation, translation, and mRNA decay.
