This content will become publicly available on December 1, 2023
- Journal Name:
- Genome Biology
- Sponsoring Org:
- National Science Foundation
More Like this
Decoding co-/post-transcriptional complexities of plant transcriptomes and epitranscriptome using next-generation sequencing technologies
Next-generation sequencing (NGS) technologies - Illumina RNA-seq, Pacific Biosciences isoform sequencing (PacBio Iso-seq), and Oxford Nanopore direct RNA sequencing (DRS) - have revealed the complexity of plant transcriptomes and their regulation at the co-/post-transcriptional level. Global analysis of mature mRNAs, transcripts from nuclear run-on assays, and nascent chromatin-bound mRNAs using short as well as full-length and single-molecule DRS reads has uncovered potential roles of different forms of RNA polymerase II during the transcription process, and the extent of co-transcriptional pre-mRNA splicing and polyadenylation. These tools have also allowed mapping of transcriptome-wide start sites in cap-containing RNAs, poly(A) site choice, poly(A) tail length, and RNA base modifications. The emerging theme from recent studies is that reprogramming of gene expression in response to developmental cues and stresses at the co-/post-transcriptional level likely plays a crucial role in eliciting appropriate responses for optimal growth and plant survival under adverse conditions. Although the mechanisms by which developmental cues and different stresses regulate co-/post-transcriptional splicing are largely unknown, a few recent studies indicate that the external cues target spliceosomal and splicing regulatory proteins to modulate alternative splicing. In this review, we provide an overview of recent discoveries on the dynamics and complexities of plant transcriptomes, …
Alternative splicing preferentially increases transcript diversity associated with stress responses in the extremophyte Schrenkiella parvula
Alternative splicing extends the coding potential of genomes by creating multiple isoforms from one gene. Isoforms can render transcript specificity and diversity to initiate multiple responses required during transcriptome adjustments in stressed environments. Although the prevalence of alternative splicing is widely recognized, how diverse isoforms facilitate stress adaptation in plants that thrive in extreme environments is unexplored. Here we examine how an extremophyte model, Schrenkiella parvula, coordinates alternative splicing in response to high salinity compared to a salt-stress-sensitive model, Arabidopsis thaliana. We use Iso-Seq to generate full-length reference transcripts and RNA-seq to quantify differential isoform usage in response to salinity changes. We find that single-copy orthologs where S. parvula has a higher number of isoforms than A. thaliana, as well as S. parvula genes observed and predicted using machine learning to have multiple isoforms, are enriched in stress-associated functions. Genes that showed differential isoform usage were largely mutually exclusive from genes that were differentially expressed in response to salt. S. parvula transcriptomes maintained specificity in isoform usage, assessed via a measure of expression disorderedness, during transcriptome reprogramming under salt. Our study adds a novel resource and insight to study plant stress tolerance evolved in extreme environments.
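To make the idea of isoform usage and its "disorderedness" concrete, the following is a minimal sketch in Python. The abstract does not define the measure it uses, so Shannon entropy over per-gene isoform usage fractions is an assumption chosen for illustration, and the function names `isoform_usage` and `usage_entropy` are hypothetical.

```python
import math
from typing import List


def isoform_usage(counts: List[float]) -> List[float]:
    """Convert per-isoform read counts for one gene into usage fractions."""
    total = sum(counts)
    if total == 0:
        # No reads for this gene: report zero usage for every isoform.
        return [0.0 for _ in counts]
    return [c / total for c in counts]


def usage_entropy(counts: List[float]) -> float:
    """Shannon entropy (bits) of isoform usage; an assumed stand-in
    for the 'expression disorderedness' measure named in the abstract."""
    fractions = isoform_usage(counts)
    # Isoforms with zero usage contribute nothing to the entropy.
    return -sum(p * math.log2(p) for p in fractions if p > 0)
```

Under this assumed measure, a gene dominated by a single isoform scores near zero, while a gene that spreads reads evenly across its isoforms scores highest, so lower values correspond to more specific isoform usage.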
APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data
The eukaryotic genome is capable of producing multiple isoforms from a gene by alternative polyadenylation (APA) during pre-mRNA processing. APA in the 3′-untranslated region (3′-UTR) of mRNA produces transcripts with shorter or longer 3′-UTRs. Often, the 3′-UTR serves as a binding platform for microRNAs and RNA-binding proteins, which affect the fate of the mRNA transcript. Thus, 3′-UTR APA is known to modulate translation and provides a means to regulate gene expression at the post-transcriptional level. Current bioinformatics pipelines have limited capability in profiling 3′-UTR APA events due to incomplete annotations and low-resolution analysis: widely available bioinformatics pipelines do not reference actionable polyadenylation (cleavage) sites but simulate 3′-UTR APA only using RNA-seq read coverage, causing false positive identifications. To overcome these limitations, we developed APA-Scan, a robust program that identifies 3′-UTR APA events and visualizes the RNA-seq short-read coverage with gene annotations.
APA-Scan utilizes either predicted or experimentally validated actionable polyadenylation signals as a reference for polyadenylation sites and calculates the quantity of long and short 3′-UTR transcripts in the RNA-seq data. APA-Scan works in three major steps: (i) calculate the read coverage of the 3′-UTR regions of genes; (ii) identify the potential APA sites and evaluate the significance …
APA-Scan was applied to both simulated and real RNA-seq datasets and compared with two widely used baselines, DaPars and APAtrap. In simulation, APA-Scan significantly improved the accuracy of 3′-UTR APA identification compared to the other baselines. The performance of APA-Scan was also validated by 3′-end-seq data and qPCR on mouse embryonic fibroblast cells. The experiments confirm that APA-Scan can detect unannotated 3′-UTR APA events and improve genome annotation.
APA-Scan is a comprehensive computational pipeline to detect transcriptome-wide 3′-UTR APA events. The pipeline integrates both RNA-seq and 3′-end-seq data and can efficiently identify significant events with high-resolution short-read coverage plots.
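Steps (i) and (ii) of the abstract, read coverage over a 3′-UTR and evaluation of a candidate polyadenylation site, can be sketched roughly as follows. This is not APA-Scan's implementation: the abstract is truncated before it names the significance test, so the 2×2 chi-squared test here is an assumption, and the function names `split_coverage` and `chi2_2x2` are hypothetical. Coverage is taken as a per-base list oriented 5′ to 3′ along the UTR.

```python
import math
from typing import Sequence, Tuple


def split_coverage(coverage: Sequence[float], site: int) -> Tuple[float, float]:
    """Mean read coverage upstream and downstream of a candidate cleavage site.

    Upstream bases are covered by both short and long 3'-UTR transcripts;
    downstream bases are covered by long transcripts only.
    """
    if not 0 < site < len(coverage):
        raise ValueError("site must fall strictly inside the 3'-UTR")
    upstream = coverage[:site]
    downstream = coverage[site:]
    return sum(upstream) / len(upstream), sum(downstream) / len(downstream)


def chi2_2x2(a: float, b: float, c: float, d: float) -> Tuple[float, float]:
    """Pearson chi-squared statistic and p-value (1 degree of freedom)
    for the table [[a, b], [c, d]], e.g. rows = conditions and
    columns = (upstream, downstream) coverage. Assumed test, see lead-in."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin makes the test undefined; report no evidence.
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

For example, `split_coverage` applied to each of two conditions yields the four values passed to `chi2_2x2`; a small p-value would then suggest that the balance of short and long 3′-UTR transcripts differs between the conditions at that site.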
Long-read RNA sequencing reveals widespread sex-specific alternative splicing in threespine stickleback fish
Alternate isoforms are important contributors to phenotypic diversity across eukaryotes. Although short-read RNA-sequencing has increased our understanding of isoform diversity, it is challenging to accurately detect full-length transcripts, preventing the identification of many alternate isoforms. Long-read sequencing technologies have made it possible to sequence full-length alternative transcripts, accurately characterizing alternative splicing events, alternate transcription start and end sites, and differences in UTR regions. Here, we use Pacific Biosciences (PacBio) long-read RNA-sequencing (Iso-Seq) to examine the transcriptomes of five organs in threespine stickleback fish (Gasterosteus aculeatus), a widely used genetic model species. The threespine stickleback fish has a refined genome assembly in which gene annotations are based on short-read RNA sequencing and predictions from coding sequence of other species. This suggests some of the existing annotations may be inaccurate or alternative transcripts may not be fully characterized. Using Iso-Seq we detected thousands of novel isoforms, indicating many isoforms are absent in the current Ensembl gene annotations. In addition, we refined many of the existing annotations within the genome. We noted many improperly positioned transcription start sites that were refined with long-read sequencing. The Iso-Seq-predicted transcription start sites were more accurate and verified through ATAC-seq. We also detected many alternative …
Microbes and viruses are known to alter host transcriptomes by means of infection. In light of recent challenges posed by the COVID-19 pandemic, a deeper understanding of the disease at the transcriptome level is needed. However, research about transcriptome reprogramming by post-transcriptional regulation is very limited. In this study, computational methods developed by our lab were applied to RNA-seq data to detect transcript variants (i.e., alternative splicing (AS) and alternative polyadenylation (APA) events). The RNA-seq data were obtained from a publicly available source, and they consist of mock-treated and SARS-CoV-2-infected (COVID-19) lung alveolar (A549) cells. Data analysis results show that more AS events are found in SARS-CoV-2-infected cells than in mock-treated cells, whereas fewer APA events are detected in SARS-CoV-2-infected cells. A combination of conventional differential gene expression analysis and transcript variants analysis revealed that most of the genes with transcript variants are not differentially expressed. This indicates that no strong correlation exists between differential gene expression and the AS/APA events in the mock-treated or SARS-CoV-2-infected samples. These genes with transcript variants can be applied as another layer of molecular signatures for COVID-19 studies. In addition, the transcript variants are enriched in important biological pathways that …