Title: Prediction and characterization of transcription factors involved in drought stress response
Transcription factors (TFs) play a central role in regulating molecular-level responses of plants to external stresses such as water-limiting conditions, but identification of such TFs in the genome remains a challenge. Here, we describe a network-based supervised machine learning framework that accurately predicts and ranks all TFs in the genome according to their potential association with drought tolerance. We show that top-ranked regulators fall mainly into two ‘age’ groups: genes that appeared first in land plants and genes that emerged later in the Oryza clade. TFs predicted to be high in the ranking belong to specific gene families, have relatively simple intron/exon and protein structures, and functionally converge to regulate primary and secondary metabolism pathways. Repeated trials of nested cross-validation tests showed that models trained only on regulatory network patterns, inferred from large transcriptome datasets, outperform models trained on heterogeneous genomic features in the prediction of known drought response regulators. A new R/Shiny-based web application, called the DroughtApp, provides a primer for generation of new testable hypotheses related to regulation of drought stress response. Furthermore, to test the system, we experimentally validated predictions on the functional role of the rice transcription factor OsbHLH148, using RNA sequencing of knockout mutants in response to drought stress and protein-DNA interaction assays. Our study exemplifies the integration of domain knowledge for prioritization of regulatory genes in biological pathways of well-studied agricultural traits.
Award ID(s):
1826836 1716844
NSF-PAR ID:
10172030
Journal Name:
bioRxiv
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
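The abstract above reports repeated trials of nested cross-validation but does not describe the implementation. As a rough illustration of the general technique only, the sketch below tunes a hyperparameter in an inner loop and scores the tuned model in an outer loop. The synthetic data, the k-nearest-neighbour classifier, and every function name here are assumptions made for the example; none of them come from the study.

```python
import random
from statistics import mean

def make_toy_data(n=60, seed=0):
    """Synthetic two-class data standing in for per-TF feature vectors (assumed)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        # Class 1 is shifted so the toy problem is learnable.
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)])
        y.append(label)
    return X, y

def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(X_train, y_train)
    )
    votes = [label for _, label in dists[:k]]
    return 1 if sum(votes) * 2 > len(votes) else 0

def accuracy(X, y, train_idx, test_idx, k):
    """Fraction of test points classified correctly by k-NN fit on train_idx."""
    X_tr = [X[i] for i in train_idx]
    y_tr = [y[i] for i in train_idx]
    hits = sum(knn_predict(X_tr, y_tr, X[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)

def k_folds(indices, n_folds):
    """Yield (train, test) index lists for each of n_folds interleaved folds."""
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield train, test

def nested_cv(X, y, k_grid=(1, 3, 5), outer=5, inner=3, seed=0):
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    outer_scores = []
    for train_idx, test_idx in k_folds(idx, outer):
        # Inner loop: pick k using only the outer-training portion.
        def inner_score(k):
            return mean(accuracy(X, y, tr, va, k)
                        for tr, va in k_folds(train_idx, inner))
        best_k = max(k_grid, key=inner_score)
        # Outer loop: score the tuned model on the held-out outer fold.
        outer_scores.append(accuracy(X, y, train_idx, test_idx, best_k))
    return outer_scores

X, y = make_toy_data()
scores = nested_cv(X, y)
print(f"outer-fold accuracies: {scores}, mean = {mean(scores):.2f}")
```

Because the hyperparameter is chosen without ever seeing the outer test fold, the outer scores give a less optimistic estimate of performance than a single cross-validation loop that both tunes and evaluates.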
More Like this
  1. Gene regulatory networks underpin stress response pathways in plants. However, parsing these networks to prioritize key genes underlying a particular trait is challenging. Here, we have built the Gene Regulation and Association Network (GRAiN) of rice (Oryza sativa). GRAiN is an interactive query-based web-platform that allows users to study functional relationships between transcription factors (TFs) and genetic modules underlying abiotic-stress responses. We built GRAiN by applying a combination of different network inference algorithms to publicly available gene expression data. We propose a supervised machine learning framework that complements GRAiN in prioritizing genes that regulate stress signal transduction and modulate gene expression under drought conditions. Our framework converts intricate network connectivity patterns of 2160 TFs into a single drought score. We observed that TFs with the highest drought scores define the functional, structural, and evolutionary characteristics of drought resistance in rice. Our approach accurately predicted the function of the OsbHLH148 TF, which we validated using in vitro protein-DNA binding assays and mRNA sequencing of loss-of-function mutants grown under control and drought stress conditions. Our network and the complementary machine learning strategy lend themselves to predicting key regulatory genes underlying other agricultural traits and will assist in the genetic engineering of desirable rice varieties.
  2. Abstract Isoprene has recently been proposed to be a signaling molecule that can enhance tolerance of both biotic and abiotic stress. Not all plants make isoprene, but all plants tested to date respond to isoprene. We hypothesized that isoprene interacts with existing signaling pathways rather than requiring novel mechanisms for its effect on plants. We analyzed the cis‐regulatory elements (CREs) in promoters of isoprene‐responsive genes and the corresponding transcription factors binding these promoter elements to obtain clues about the transcription factors and other proteins involved in isoprene signaling. Promoter regions of isoprene‐responsive genes were characterized using the Arabidopsis cis‐regulatory element database. CREs that bind ARR1, Dof, DPBF, bHLH112, GATA factors, GT‐1, MYB, and WRKY transcription factors, as well as light‐responsive elements, were overrepresented in promoters of isoprene‐responsive genes; CBF‐, HSF‐, and WUS‐binding motifs were underrepresented. Transcription factors corresponding to CREs overrepresented in promoters of isoprene‐responsive genes were mainly those important for stress responses: drought, salt/osmotic, oxidative, herbivory/wounding, and pathogen stress. More than half of the isoprene‐responsive genes contained at least one binding site for TFs of the class IV (homeodomain leucine zipper) HD‐ZIP family, such as GL2, ATML1, PDF2, HDG11, and ATHB17. While the HD‐zipper‐loop‐zipper (ZLZ) domain binds to the L1 box of the promoter region, a special domain called the steroidogenic acute regulatory protein‐related lipid transfer, or START, domain can bind ligands such as fatty acids (e.g., linolenic and linoleic acid). We tested whether isoprene might bind in such a START domain. Molecular simulations and modeling were carried out to test interactions between isoprene and a class IV HD‐ZIP family START‐domain‐containing protein. Without membrane penetration by the HDG11 START domain, isoprene within the lipid bilayer was inaccessible to this domain, preventing protein interactions with membrane‐bound isoprene. The cross‐talk between isoprene‐mediated signaling and other growth regulator and stress signaling pathways, in terms of common CREs and transcription factors, could enhance the stability of the isoprene emission trait when it evolves in a plant, but so far it has not been possible to say how isoprene is sensed to initiate signaling responses.
  3. Drought is one of the most serious abiotic stressors in the environment, restricting agricultural production by reducing plant growth, development, and productivity. Investigating such a complex and multifaceted stressor and its effects on plants requires a systems biology-based approach, entailing the generation of co-expression networks, identification of high-priority transcription factors (TFs), dynamic mathematical modeling, and computational simulations. Here, we studied a high-resolution drought transcriptome of Arabidopsis. We identified distinct temporal transcriptional signatures and demonstrated the involvement of specific biological pathways. Generation of a large-scale co-expression network followed by network centrality analyses identified 117 TFs that possess critical properties of hubs, bottlenecks, and high clustering coefficient nodes. Dynamic transcriptional regulatory modeling of integrated TF targets and transcriptome datasets uncovered major transcriptional events during the course of drought stress. Mathematical transcriptional simulations allowed us to ascertain the activation status of major TFs, as well as the transcriptional intensity and amplitude of their target genes. Finally, we validated our predictions by providing experimental evidence of gene expression under drought stress for a set of four TFs and their major target genes using qRT-PCR. Taken together, we provided a systems-level perspective on the dynamic transcriptional regulation during drought stress in Arabidopsis and uncovered numerous novel TFs that could potentially be used in future genetic crop engineering programs.
  4. Abstract The circadian clock helps organisms to anticipate and coordinate gene regulatory responses to changes in environmental stimuli. Under growth-limiting temperatures, the time of the day modulates the accumulation of polyadenylated mRNAs. In response to heat stress, plants will conserve energy and selectively translate mRNAs. How the clock and/or the time of the day regulates polyadenylated mRNAs bound by ribosomes in response to heat stress is unknown. In-depth analysis of Arabidopsis thaliana translating mRNAs found that the time of the day gates the response of approximately one-third of the circadian-regulated heat-responsive translatome. Specifically, the time of the day and heat stress interact to prioritize the pool of mRNAs in queue to be translated. For a subset of mRNAs, we observed a stronger gated response during the day, and preferentially before the peak of expression. We propose previously overlooked transcription factors (TFs) as regulatory nodes and show that the clock plays a role in the temperature response for select TFs. When the stress was removed, the redefined priorities for translation recovered within 1 h, though slower recovery was observed for abiotic stress regulators. Through hierarchical network connections between clock genes and prioritized TFs, our work provides a framework to target key nodes underlying heat stress tolerance throughout the day.
  5. ABSTRACT Gene regulatory networks (GRNs) are critical for dynamic transcriptional responses to environmental stress. However, the mechanisms by which GRN regulation adjusts physiology to enable stress survival remain unclear. Here we investigate the functions of transcription factors (TFs) within the global GRN of the stress-tolerant archaeal microorganism Halobacterium salinarum. We measured growth phenotypes of a panel of TF deletion mutants in high temporal resolution under heat shock, oxidative stress, and low-salinity conditions. To quantitate the noncanonical functional forms of the growth trajectories observed for these mutants, we developed a novel modeling framework based on Gaussian process regression and functional analysis of variance (FANOVA). We employ unique statistical tests to determine the significance of differential growth relative to the growth of the control strain. This analysis recapitulated known TF functions, revealed novel functions, and identified surprising secondary functions for characterized TFs. Strikingly, we observed that the majority of the TFs studied were required for growth under multiple stress conditions, pinpointing regulatory connections between the conditions tested. Correlations between quantitative phenotype trajectories of mutants are predictive of TF-TF connections within the GRN. These phenotypes are strongly concordant with predictions from statistical GRN models inferred from gene expression data alone. With genome-wide and targeted data sets, we provide detailed functional validation of novel TFs required for extreme oxidative stress and heat shock survival. Together, results presented in this study suggest that many TFs function under multiple conditions, thereby revealing high interconnectivity within the GRN and identifying the specific TFs required for communication between networks responding to disparate stressors.
IMPORTANCE To ensure survival in the face of stress, microorganisms employ inducible damage repair pathways regulated by extensive and complex gene networks. Many archaea, microorganisms of the third domain of life, persist under extremes of temperature, salinity, and pH and under other conditions. In order to understand the cause-effect relationships between the dynamic function of the stress network and ultimate physiological consequences, this study characterized the physiological role of nearly one-third of all regulatory proteins known as transcription factors (TFs) in an archaeal organism. Using a unique quantitative phenotyping approach, we discovered functions for many novel TFs and revealed important secondary functions for known TFs. Surprisingly, many TFs are required for resisting multiple stressors, suggesting cross-regulation of stress responses. Through extensive validation experiments, we map the physiological roles of these novel TFs in stress response back to their position in the regulatory network wiring. This study advances understanding of the mechanisms underlying how microorganisms resist extreme stress. Given the generality of the methods employed, we expect that this study will enable future studies on how regulatory networks adjust cellular physiology in a diversity of organisms. 
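Item 3 in the list above mentions network centrality analyses that flag hubs and high clustering coefficient nodes in a co-expression network. As a small illustration of two of those measures, the sketch below computes node degree and the local clustering coefficient on a toy undirected graph. The edge list and node names (`TF1`, `g1`, and so on) are invented for the example, and bottleneck (betweenness) scoring is left out for brevity; none of this reflects the cited study's actual pipeline.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical co-expression edges between TFs and genes (toy data only).
edges = [
    ("TF1", "g1"), ("TF1", "g2"), ("TF1", "g3"), ("TF1", "TF2"),
    ("g1", "g2"), ("TF2", "g4"), ("TF2", "g5"), ("g4", "g5"),
]

def build_adjacency(edge_list):
    """Undirected adjacency sets keyed by node name."""
    adj = defaultdict(set)
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def clustering(adj, node):
    """Local clustering coefficient: linked neighbour pairs / possible pairs."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    possible = len(nbrs) * (len(nbrs) - 1) / 2
    return linked / possible

adj = build_adjacency(edges)
degree = {node: len(nbrs) for node, nbrs in adj.items()}
hub = max(degree, key=degree.get)  # highest-degree node as a simple hub proxy

for node in sorted(adj):
    print(f"{node}: degree={degree[node]}, clustering={clustering(adj, node):.2f}")
print("hub candidate:", hub)
```

In this toy graph the highest-degree node has a low clustering coefficient, which shows why the two measures are usually reported separately rather than folded into one ranking.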