Title: A two-stream convolutional neural network for microRNA transcription start site feature integration and identification
Abstract MicroRNAs (miRNAs) play important roles in post-transcriptional gene regulation and phenotype development. Understanding the regulation of miRNA genes is critical to understanding gene regulation. One challenge in studying miRNA gene regulation is the lack of condition-specific annotation of miRNA transcription start sites (TSSs). Unlike those of protein-coding genes, miRNA TSSs can be tens of thousands of nucleotides away from the precursor miRNAs and are hard to detect with conventional RNA-Seq experiments. A number of studies have attempted to predict miRNA TSSs computationally. However, high-resolution condition-specific miRNA TSS prediction remains a challenging problem. Recently, deep learning models have been successfully applied to various bioinformatics problems but have not been effectively developed for condition-specific miRNA TSS prediction. Here we created a two-stream deep learning model called D-miRT for computational prediction of condition-specific miRNA TSSs ( http://hulab.ucf.edu/research/projects/DmiRT/ ). D-miRT is a natural fit for the integration of low-resolution epigenetic features (DNase-Seq and histone modification data) and high-resolution sequence features. Compared with alternative computational models on different sets of training data, D-miRT outperformed all baseline models and demonstrated high accuracy for condition-specific miRNA TSS prediction tasks. Compared with the most recent approaches to cell-specific miRNA TSS identification on cell lines unseen during model training, D-miRT also showed superior performance.
Award ID(s): 1661414 2015838
PAR ID: 10233695
Journal Name: Scientific Reports
Volume: 11
Issue: 1
ISSN: 2045-2322
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract Motivation: MicroRNAs (miRNAs) are small noncoding RNAs that play important roles in gene regulation and phenotype development. The identification of miRNA transcription start sites (TSSs) is critical to understanding the functional roles of miRNA genes and their transcriptional regulation. Unlike those of protein-coding genes, miRNA TSSs are not directly detectable from conventional RNA-Seq experiments because of the miRNA-specific process of biogenesis. In the past decade, large-scale genome-wide TSS-Seq and transcription activation marker profiling data have become available, and many computational methods have been developed based on them. These methods have greatly advanced genome-wide miRNA TSS annotation. Results: In this study, we summarized recent computational methods and their results on miRNA TSS annotation. We collected and performed a comparative analysis of miRNA TSS annotations from 14 representative studies. We further compiled a robust set of miRNA TSSs (RSmirT) that are supported by multiple studies. Integrative genomic and epigenomic data analysis on RSmirT revealed the genomic and epigenomic features of miRNA TSSs as well as their relations to protein-coding and long non-coding genes. Contact: xiaoman@mail.ucf.edu, haihu@cs.ucf.edu
  2. MicroRNAs (miRNAs) are ~22-nucleotide-long RNAs that play important roles in regulating gene expression. Understanding the transcriptional regulation of miRNAs is critical to understanding gene regulation. However, it is often difficult to precisely identify miRNA transcription start sites (TSSs) due to miRNA-specific biogenesis. Existing computational methods cannot effectively predict miRNA TSSs. Here, we employed deep learning architectures incorporating Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) techniques to detect miRNA TSSs in regions of accessible chromatin. By testing on benchmark experimental data, we demonstrated that deep learning models outperform support vector machines and can accurately distinguish miRNA TSSs from both flanking regions and intergenic regions.
  3. Helmer-Citterich, Manuela (Ed.)
    MicroRNAs (miRNAs) play crucial roles in gene regulation. Most studies focus on mature miRNAs, which leaves many unknowns about primary miRNAs (pri-miRNAs). To fill this gap, we modeled the expression of pri-miRNAs in 1829 primary cell types, cell lines, and tissues. We demonstrated that the expression of pri-miRNAs can be modeled well by the expression of specific sets of mRNAs, which we termed their associated mRNAs. These associated mRNAs differ from the corresponding target mRNAs and are enriched with specific functions. Most associated mRNAs of a miRNA are shared across conditions, while on average about one-fifth of the associated mRNAs are condition-specific. Our study sheds new light on miRNA biogenesis and general gene transcriptional regulation.
  4. Transition of grapevine buds from paradormancy to endodormancy is coordinated by changes in gene expression, phytohormones, transcription factors, and other molecular regulators, but the mechanisms involved in transcriptional and post-transcriptional regulation of dormancy stages are not well delineated. To identify potential regulatory targets, an integrative analysis of differential gene expression profiles and their inverse relationships with miRNA abundance was performed in paradormant (long day (LD), 15 h) or endodormant (short day (SD), 13 h) Vitis riparia buds. There were 400 upregulated and 936 downregulated differentially expressed genes in SD relative to LD buds. Gene set and gene ontology enrichment analysis indicated that hormone signaling and cell cycling genes were downregulated in SD relative to LD buds. miRNA abundance and inverse expression analyses of miRNA target genes indicated increased abundance of miRNAs that negatively regulate genes involved with cell cycle and meristem development in endodormant buds and of miRNAs targeting starch metabolism-related genes in paradormant buds. Analysis of interactions between abundant miRNAs and transcription factors identified a network with coinciding regulation of cell cycle and epigenetic regulation-related genes in SD buds. This network provides evidence for cross regulation occurring between miRNAs and transcription factors both upstream and downstream of MYB3R1.
  5. Regulation of gene expression starts with transcription initiation. Regulated transcription initiation is critical for generating correct transcripts with proper abundance. The impact of epigenetic control, such as histone modifications and chromatin remodelling, on gene regulation has been extensively investigated, but its specific role in regulating transcription initiation is far from well understood. Here we aimed to better understand the roles of genes involved in histone H3 methylation and chromatin remodelling in the regulation of transcription initiation at genome scale, using budding yeast as a study system. We used nAnT-iCAGE to obtain maps of transcription start sites (TSSs) at single-nucleotide resolution for a strain with depletion of MINC (Mot1-Ino80C-Nc2) by Mot1p and Ino80p anchor-away (Mot1&Ino80AA) and for a strain with loss of histone methylation (set1Δset2Δdot1Δ), and compared them with those of their wild-type controls. Our study showed that the depletion of MINC stimulated transcription initiation from many new sites flanking the dominant TSS of genes, while the loss of histone methylation generated more TSSs in the coding region. Moreover, the depletion of MINC led to less confined boundaries of TSS clusters (TCs) and resulted in broader core promoters, and such patterns were not present in the ssdΔ mutant. Our data also show that MINC has distinctive impacts on TATA-containing and TATA-less promoters. In conclusion, our study shows that MINC is required for accurate identification of bona fide TSSs, particularly in TATA-containing promoters, and that histone methylation contributes to the repression of transcription initiation in coding regions.