Title: Application of Deep Learning Models to microRNA Transcription Start Site Identification
microRNAs (miRNAs) are ~22-nucleotide RNAs that play important roles in regulating gene expression. Understanding the transcriptional regulation of miRNAs is critical to understanding gene regulation. However, it is often difficult to precisely identify miRNA transcription start sites (TSSs) due to miRNA-specific biogenesis. Existing computational methods cannot effectively predict miRNA TSSs. Here, we employed deep learning architectures incorporating Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) techniques to detect miRNA TSSs in regions of accessible chromatin. By testing on benchmark experimental data, we demonstrated that deep learning models outperform a support vector machine and can accurately distinguish miRNA TSSs from both flanking regions and intergenic regions.
Award ID(s):
1661414
PAR ID:
10104272
Journal Name:
2019 IEEE 7th International Conference on Bioinformatics and Computational Biology
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract MicroRNAs (miRNAs) play important roles in post-transcriptional gene regulation and phenotype development. Understanding the regulation of miRNA genes is critical to understanding gene regulation. One of the challenges in studying miRNA gene regulation is the lack of condition-specific annotation of miRNA transcription start sites (TSSs). Unlike those of protein-coding genes, miRNA TSSs can be tens of thousands of nucleotides away from the precursor miRNAs, and they are hard to detect with conventional RNA-Seq experiments. A number of studies have attempted to computationally predict miRNA TSSs. However, high-resolution condition-specific miRNA TSS prediction remains a challenging problem. Recently, deep learning models have been successfully applied to various bioinformatics problems but have not been effectively developed for condition-specific miRNA TSS prediction. Here we created a two-stream deep learning model called D-miRT for computational prediction of condition-specific miRNA TSSs ( http://hulab.ucf.edu/research/projects/DmiRT/ ). D-miRT is a natural fit for the integration of low-resolution epigenetic features (DNase-Seq and histone modification data) and high-resolution sequence features. Compared with alternative computational models on different sets of training data, D-miRT outperformed all baseline models and demonstrated high accuracy for condition-specific miRNA TSS prediction tasks. Compared with the most recent approaches to cell-specific miRNA TSS identification, using cell lines that were unseen during model training, D-miRT also showed superior performance.
  2. Abstract Motivation: MicroRNAs (miRNAs) are small noncoding RNAs that play important roles in gene regulation and phenotype development. The identification of miRNA transcription start sites (TSSs) is critical to understanding the functional roles of miRNA genes and their transcriptional regulation. Unlike those of protein-coding genes, miRNA TSSs are not directly detectable from conventional RNA-Seq experiments due to the miRNA-specific process of biogenesis. In the past decade, large-scale genome-wide TSS-Seq and transcription activation marker profiling data have become available, and many computational methods have been developed based on them. These methods have greatly advanced genome-wide miRNA TSS annotation. Results: In this study, we summarized recent computational methods and their results on miRNA TSS annotation. We collected and performed a comparative analysis of miRNA TSS annotations from 14 representative studies. We further compiled a robust set of miRNA TSSs (RSmirT) that are supported by multiple studies. Integrative genomic and epigenomic data analysis on RSmirT revealed the genomic and epigenomic features of miRNA TSSs as well as their relations to protein-coding and long non-coding genes. Contact: xiaoman@mail.ucf.edu, haihu@cs.ucf.edu
  3. Regulation of gene expression is a fundamental biological process that relies on transcription factors (TFs) recognizing specific cis motifs in the regulatory regions of the genes that they control. In most eukaryotic organisms, cis-regulatory elements are significantly enriched around the transcription start site (TSS). However, unlike other genic features, TSSs need to be experimentally determined, after which they become important components of genome annotations. One of the methods for experimentally determining TSSs at the genome-wide level is CAGE (cap analysis of gene expression). This chapter describes how to prepare a CAGE library for sequencing, starting with RNA extraction, library construction, and quality controls before proceeding to sequencing on the Illumina platform. We then describe how to use a computational pipeline to determine, from the alignment of CAGE tags, the genome-wide location of TSSs, followed by the statistical approaches required to cluster TSSs that operate as transcriptional units and to determine core promoter properties such as shape. The analyses described here focus on maize, since its large and yet deficiently annotated genome creates some unique challenges, but with some modifications they can be easily adapted for other organisms as well.
  4. Transition of grapevine buds from paradormancy to endodormancy is coordinated by changes in gene expression, phytohormones, transcription factors, and other molecular regulators, but the mechanisms involved in transcriptional and post-transcriptional regulation of dormancy stages are not well delineated. To identify potential regulatory targets, an integrative analysis of differential gene expression profiles and their inverse relationships with miRNA abundance was performed in paradormant (long day (LD), 15 h) or endodormant (short day (SD), 13 h) Vitis riparia buds. There were 400 up- and 936 downregulated differentially expressed genes in SD relative to LD buds. Gene set and gene ontology enrichment analysis indicated that hormone signaling and cell cycling genes were downregulated in SD relative to LD buds. miRNA abundance and inverse expression analyses of miRNA target genes indicated increased abundance of miRNAs that negatively regulate genes involved with cell cycle and meristem development in endodormant buds and of miRNAs targeting starch metabolism related genes in paradormant buds. Analysis of interactions between abundant miRNAs and transcription factors identified a network with coinciding regulation of cell cycle and epigenetic regulation related genes in SD buds. This network provides evidence for cross regulation occurring between miRNAs and transcription factors both upstream and downstream of MYB3R1.
  5. Abstract Accurate identification of microRNA (miRNA) targets at base-pair resolution has been an open problem for over a decade. The recent discovery of miRNA isoforms (isomiRs) adds more complexity to this problem. Despite the existence of many methods, none considers isomiRs, and their performance is still suboptimal. We hypothesize that by taking the isomiR–mRNA interactions into account and applying a deep learning model to study miRNA–mRNA interaction features, we may improve the accuracy of miRNA target predictions. We developed a deep learning tool called DMISO to capture the intricate features of miRNA/isomiR–mRNA interactions. Based on tenfold cross-validation, DMISO showed high precision (95%) and recall (90%). Evaluated on three independent datasets, DMISO had superior performance to five tools, including three popular conventional tools and two recently developed deep learning-based tools. By applying two popular feature interpretation strategies, we demonstrated the importance of the miRNA regions other than their seeds and the potential contribution of the RNA-binding motifs within miRNAs/isomiRs and mRNAs to the miRNA/isomiR–mRNA interactions. 