Title: Genome wide predictions of miRNA regulation by transcription factors
Abstract

Motivation

Reconstructing regulatory networks from expression and interaction data is a major goal of systems biology. While much work has focused on experimentally and computationally determining the set of transcription factors (TFs) and microRNAs (miRNAs) that regulate genes in these networks, relatively little has addressed the regulation of miRNAs by TFs. Such regulation can play an important role in several biological processes, including development and disease. The main challenge in predicting such interactions is the very small positive training set currently available. Another challenge is that a large fraction of miRNAs are encoded within genes, making it hard to determine the specific way in which they are regulated.

Results

To enable genome-wide predictions of TF–miRNA interactions, we extended semi-supervised machine-learning approaches to integrate a large set of different types of data, including sequence, expression, ChIP-seq and epigenetic data. As we show, the methods we develop achieve good performance both on a labeled test set and when analyzing general co-expression networks. We next analyze mRNA and miRNA cancer expression data, demonstrating the advantage of using the predicted set of interactions for identifying more coherent and relevant modules, genes, and miRNAs. The complete set of predictions is available on the supporting website and can be used by any method that combines miRNAs, genes, and TFs.

Availability and Implementation

Code and full set of predictions are available from the supporting website: http://cs.cmu.edu/~mruffalo/tf-mirna/.

Contact

zivbj@cs.cmu.edu

Supplementary information

Supplementary data are available at Bioinformatics online.

 
NSF-PAR ID:
10394851
Publisher / Repository:
Oxford University Press
Journal Name:
Bioinformatics
Volume:
32
Issue:
17
ISSN:
1367-4803
Page Range / eLocation ID:
p. i746-i754
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Aims

    Dissecting complex interactions among transcription factors (TFs), microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) is central to understanding heart development and function. Although computational approaches and platforms have been described to infer relationships among regulatory factors and genes, current approaches do not adequately account for how highly diverse, interacting regulators that include noncoding RNAs (ncRNAs) control cardiac gene expression dynamics over time.

    Methods

    To overcome this limitation, we devised an integrated framework, cardiac gene regulatory modeling (CGRM), that combines the LogicTRN and regulatory component analysis bioinformatics modeling platforms to infer complex regulatory mechanisms. We then used CGRM to identify and compare the TF-ncRNA gene regulatory networks that govern early- and late-stage cardiomyocytes (CMs) generated by in vitro differentiation of human pluripotent stem cells (hPSC) and ventricular and atrial CMs isolated during in vivo human cardiac development.

    Results

    Comparisons of in vitro versus in vivo derived CMs revealed conserved regulatory networks among TFs and ncRNAs in early cells that significantly diverged in late-stage cells. We report that cardiac genes (“heart targets”) expressed in early-stage hPSC-CMs are primarily regulated by MESP1, miR-1, miR-23, and the lncRNAs NEAT1 and MALAT1, while GATA6, HAND2, miR-200c, NEAT1 and MALAT1 are critical for late hPSC-CMs. The inferred TF-miRNA-lncRNA networks regulating heart development and contraction were similar among early-stage CMs, among individual hPSC-CM datasets and between in vitro and in vivo samples. However, genes related to apoptosis, cell cycle and proliferation, and transmembrane transport showed a high degree of divergence between in vitro and in vivo derived late-stage CMs. Overall, late-stage, but not early-stage, CMs diverged greatly in the expression of “heart target” transcripts and their regulatory mechanisms.

    Conclusions

    We find that hPSC-CMs are regulated in a cell-autonomous manner during early development, and that this regulation diverges significantly over time when compared to in vivo derived CMs. These findings demonstrate the feasibility of using CGRM to reveal dynamic and complex transcriptional and posttranscriptional regulatory interactions that underlie cell-directed versus environment-dependent CM development. These results with in vitro versus in vivo derived CMs thus establish this approach for detailed analyses of heart disease and for the analysis of cell regulatory systems in other biomedical fields.

     
  2. Abstract

    The regulation of gene expression is central to many biological processes. Gene regulatory networks (GRNs) link transcription factors (TFs) to their target genes and represent maps of potential transcriptional regulation. Here, we analyzed a large number of publicly available maize (Zea mays) transcriptome data sets, including >6000 RNA sequencing samples, to generate 45 coexpression-based GRNs that represent potential regulatory relationships between TFs and other genes in different populations of samples (cross-tissue, cross-genotype, and tissue-and-genotype samples). While these networks are all enriched for biologically relevant interactions, different networks capture distinct TF-target associations and biological processes. By examining the power of our coexpression-based GRNs to accurately predict covarying TF-target relationships in natural variation data sets, we found that presence/absence changes, rather than quantitative changes, in TF gene expression are more likely to be associated with changes in target gene expression. Integrating information from our TF-target predictions and previous expression quantitative trait loci (eQTL) mapping results provided support for 68 TFs underlying 74 previously identified trans-eQTL hotspots spanning a variety of metabolic pathways. This study highlights the utility of developing multiple GRNs within a species to detect putative regulators of important plant pathways and provides potential targets for breeding or biotechnological applications.

     
  3. Abstract

    Summary

    Transcription factors (TFs) are proteins that directly interpret the genome to regulate gene expression and determine cellular phenotypes. TF identification is a common first step in unraveling gene regulatory networks. We present CREPE, an R Shiny app to catalogue and annotate TFs. CREPE was benchmarked against curated human TF datasets. Next, we use CREPE to explore the TF repertoires of Heliconius erato and Heliconius melpomene butterflies.

    Availability and implementation

    CREPE is available as a Shiny app package on GitHub (github.com/dirostri/CREPE).

    Supplementary information

    Supplementary data are available at Bioinformatics Advances online.

     
  4. Marshall-Colon, Amy (Ed.)
    Abstract

    Gene regulatory networks (GRNs) are defined by a cascade of transcriptional events by which signals, such as hormones or environmental cues, change development. To understand these networks, it is necessary to link specific transcription factors (TFs) to the downstream gene targets whose expression they regulate. Although multiple methods provide information on the targets of a single TF, moving from groups of co-expressed genes to the TF that controls them is more difficult. To facilitate this bottom-up approach, we have developed a web application named TF DEACoN. This application uses a publicly available Arabidopsis thaliana DNA Affinity Purification (DAP-Seq) data set to search for TFs that show enriched binding to groups of co-regulated genes. We used TF DEACoN to examine groups of transcripts regulated by treatment with the ethylene precursor 1-aminocyclopropane-1-carboxylic acid (ACC), using a transcriptional data set generated with high temporal resolution. We demonstrate the utility of this application when co-regulated genes are divided by timing of response or cell-type-specific information, which provides more information on TF/target relationships than when larger, less well-defined groups of co-regulated genes are used. This approach predicted TFs that may participate in ethylene-modulated root development, including the TF NAM (NO APICAL MERISTEM). We used a genetic approach to show that a mutation in NAM reduces the negative regulation of lateral root development by ACC. The combination of filtering and TF DEACoN used here can be applied to any group of co-regulated genes to predict GRNs that control coordinated transcriptional responses.
  5. Lindemann, Stephen R. (Ed.)
    ABSTRACT

    Microorganisms must respond to environmental changes to survive, often by controlling transcription initiation. Intermittent aeration during wastewater treatment presents a cyclically changing environment to which microorganisms must react. We used an intermittently aerated bioreactor performing partial nitritation and anammox (PNA) to investigate how the microbiome responds to recurring change. Meta-transcriptomic analysis revealed a dramatic disconnect between the relative DNA abundance and gene expression within the metagenome-assembled genomes (MAGs) of community members, suggesting the importance of transcriptional regulation in this microbiome. To explore how community members responded to cyclic aeration via transcriptional regulation, we searched for homologs of the catabolite repressor protein/fumarate and nitrate reductase regulatory protein (CRP/FNR) family of transcription factors (TFs) within the MAGs. Using phylogenetic analyses, evaluation of sequence conservation in important amino acid residues, and prediction of genes regulated by TFs in the MAGs, we identified homologs of the oxygen-sensing FNR in Nitrosomonas and Rhodocyclaceae, nitrogen-sensing dissimilative nitrate respiration regulator that responds to nitrogen species (DNR) in Rhodocyclaceae, and nitrogen-sensing nitrite and nitric oxide reductase regulator that responds to nitrogen species (NnrR) in Nitrospira MAGs. Our data also predict that CRP/FNR homologs in Ignavibacteria, Flavobacteriales, and Saprospiraceae MAGs sense carbon availability. In addition, a CRP/FNR homolog in a Brocadia MAG was most closely related to CRP TFs known to sense carbon sources in well-studied organisms. However, we predict that in autotrophic Brocadia, this TF most likely regulates a diverse set of functions, including a response to stress during the cyclic aerobic/anoxic conditions. Overall, this analysis allowed us to define a meta-regulon of the PNA microbiome that explains functions and interactions of the most active community members.

    IMPORTANCE

    Microbiomes are important contributors to many ecosystems, including ones where nutrient cycling is stimulated by aeration control. Optimizing cyclic aeration helps reduce energy needs and maximize microbiome performance during wastewater treatment; however, little is known about how most microbial community members respond to these alternating conditions. We defined the meta-regulon of a PNA microbiome by combining existing knowledge of how the CRP/FNR family of bacterial TFs responds to stimuli with metatranscriptomic analyses to characterize gene expression changes during aeration cycles. Our results indicated that, for some members of the community, prior knowledge is sufficient for high-confidence assignments of TF function, whereas other community members have CRP/FNR TFs for which inferences of function are limited by lack of prior knowledge. This study provides a framework to begin elucidating meta-regulons in microbiomes, where pure cultures are not available for traditional transcriptional regulation studies. Defining the meta-regulon can help in optimizing microbiome performance.