The regulation of gene expression is central to many biological processes. Gene regulatory networks (GRNs) link transcription factors (TFs) to their target genes and represent maps of potential transcriptional regulation. Here, we analyzed a large number of publicly available maize (Zea mays) transcriptome data sets including >6000 RNA sequencing samples to generate 45 coexpression-based GRNs that represent potential regulatory relationships between TFs and other genes in different populations of samples (cross-tissue, cross-genotype, and tissue-and-genotype samples). While these networks are all enriched for biologically relevant interactions, different networks capture distinct TF-target associations and biological processes. By examining the power of our coexpression-based GRNs to accurately predict covarying TF-target relationships in natural variation data sets, we found that presence/absence changes rather than quantitative changes in TF gene expression are more likely to be associated with changes in target gene expression. Integrating information from our TF-target predictions and previous expression quantitative trait loci (eQTL) mapping results provided support for 68 TFs underlying 74 previously identified trans-eQTL hotspots spanning a variety of metabolic pathways. This study highlights the utility of developing multiple GRNs within a species to detect putative regulators of important plant pathways and provides potential targets for breeding or biotechnological applications.
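As a rough illustration of what a coexpression-based TF-target prediction can look like, the sketch below ranks candidate targets for each TF by absolute Pearson correlation across samples. The abstract does not say which inference method was used, so Pearson correlation is a simple stand-in, and the function name `coexpression_edges`, the `top_k` parameter, and the toy gene names and values are all hypothetical.

```python
import numpy as np

def coexpression_edges(expr, gene_ids, tf_ids, top_k=2):
    """Rank candidate TF-target edges by absolute Pearson correlation.

    expr: array of shape (n_genes, n_samples); rows follow gene_ids.
    Returns a list of (tf, target, correlation), strongest first per TF.
    NOTE: a stand-in scoring rule, not the method used in the study.
    """
    expr = np.asarray(expr, dtype=float)
    corr = np.corrcoef(expr)  # gene-by-gene Pearson correlation matrix
    index = {g: i for i, g in enumerate(gene_ids)}
    edges = []
    for tf in tf_ids:
        i = index[tf]
        # Score every other gene against this TF; skip the TF itself.
        scored = [(abs(corr[i, j]), g, corr[i, j])
                  for g, j in index.items() if j != i]
        scored.sort(reverse=True)
        edges.extend((tf, g, float(r)) for _, g, r in scored[:top_k])
    return edges

# Toy expression matrix (hypothetical values): 4 genes x 6 samples.
tf1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
expr = np.vstack([
    tf1,                                   # TF1
    2.0 * tf1 + 0.5,                       # geneA: rises with TF1
    -tf1,                                  # geneB: falls as TF1 rises
    [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],        # geneC: only loosely related
])
gene_ids = ["TF1", "geneA", "geneB", "geneC"]
edges = coexpression_edges(expr, gene_ids, tf_ids=["TF1"], top_k=2)
for tf, target, r in edges:
    print(f"{tf} -> {target}: r = {r:+.2f}")
```

Running the same function on different sample populations (for example, cross-tissue versus cross-genotype matrices) would yield different edge lists, which is the kind of network-to-network contrast the abstract describes.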
- Publication Date:
- NSF-PAR ID:
- Journal Name: The Plant Cell
- Page Range or eLocation-ID: p. 1377-1396
- Publisher: Oxford University Press
- Sponsoring Org: National Science Foundation
More Like this
Systematic Discovery of Archaeal Transcription Factor Functions in Regulatory Networks through Quantitative Phenotyping Analysis
Gene regulatory networks (GRNs) are critical for dynamic transcriptional responses to environmental stress. However, the mechanisms by which GRN regulation adjusts physiology to enable stress survival remain unclear. Here we investigate the functions of transcription factors (TFs) within the global GRN of the stress-tolerant archaeal microorganism Halobacterium salinarum. We measured growth phenotypes of a panel of TF deletion mutants in high temporal resolution under heat shock, oxidative stress, and low-salinity conditions. To quantitate the noncanonical functional forms of the growth trajectories observed for these mutants, we developed a novel modeling framework based on Gaussian process regression and functional analysis of variance (FANOVA). We employ unique statistical tests to determine the significance of differential growth relative to the growth of the control strain. This analysis recapitulated known TF functions, revealed novel functions, and identified surprising secondary functions for characterized TFs. Strikingly, we observed that the majority of the TFs studied were required for growth under multiple stress conditions, pinpointing regulatory connections between the conditions tested. Correlations between quantitative phenotype trajectories of mutants are predictive of TF-TF connections within the GRN. These phenotypes are strongly concordant with predictions from statistical GRN models inferred from gene expression data alone. With genome-wide ...
A new tool for discovering transcriptional regulators of co-expressed genes predicts gene regulatory networks that mediate ethylene-controlled root development
Marshall-Colon, Amy (Ed.)
Gene regulatory networks (GRNs) are defined by a cascade of transcriptional events by which signals, such as hormones or environmental cues, change development. To understand these networks, it is necessary to link specific transcription factors (TFs) to the downstream gene targets whose expression they regulate. Although multiple methods provide information on the targets of a single TF, moving from groups of co-expressed genes to the TF that controls them is more difficult. To facilitate this bottom-up approach, we have developed a web application named TF DEACoN. This application uses a publicly available Arabidopsis thaliana DNA Affinity Purification (DAP-Seq) data set to search for TFs that show enriched binding to groups of co-regulated genes. We used TF DEACoN to examine groups of transcripts regulated by treatment with the ethylene precursor 1-aminocyclopropane-1-carboxylic acid (ACC), using a transcriptional data set performed with high temporal resolution. We demonstrate the utility of this application when co-regulated genes are divided by timing of response or cell-type-specific information, which provides more information on TF/target relationships than when less defined and larger groups of co-regulated genes are used. This approach predicted TFs that may participate in ethylene-modulated root development including the TF NAM (NO APICAL MERISTEM). We ...
ConnecTF: A platform to integrate transcription factor–gene interactions and validate regulatory networks
Deciphering gene regulatory networks (GRNs) is both a promise and challenge of systems biology. The promise lies in identifying key transcription factors (TFs) that enable an organism to react to changes in its environment. The challenge lies in validating GRNs that involve hundreds of TFs with hundreds of thousands of interactions with their genome-wide targets experimentally determined by high-throughput sequencing. To address this challenge, we developed ConnecTF, a species-independent, web-based platform that integrates genome-wide studies of TF–target binding, TF–target regulation, and other TF-centric omic datasets and uses these to build and refine validated or inferred GRNs. We demonstrate the functionality of ConnecTF by showing how integration within and across TF–target datasets uncovers biological insights. Case study 1 uses integration of TF–target gene regulation and binding datasets to uncover TF mode-of-action and identify potential TF partners for 14 TFs in abscisic acid signaling. Case study 2 demonstrates how genome-wide TF–target data and automated functions in ConnecTF are used in precision/recall analysis and pruning of an inferred GRN for nitrogen signaling. Case study 3 uses ConnecTF to chart a network path from NLP7, a master TF in nitrogen signaling, to direct secondary TF2s and to its indirect targets in a Network ...
Expression quantitative trait loci (eQTLs), or single-nucleotide polymorphisms that affect average gene expression levels, provide important insights into context-specific gene regulation. Classic eQTL analyses use one-to-one association tests, which test gene–variant pairs individually and ignore correlations induced by gene regulatory networks and linkage disequilibrium. Probabilistic topic models, such as latent Dirichlet allocation, estimate latent topics for a collection of count observations. Prior multimodal frameworks that bridge genotype and expression data assume matched sample numbers between modalities. However, many data sets have a nested structure where one individual has several associated gene expression samples and a single germline genotype vector. Here, we build a telescoping bimodal latent Dirichlet allocation (TBLDA) framework to learn shared topics across gene expression and genotype data that allows multiple RNA sequencing samples to correspond to a single individual’s genotype. By using raw count data, our model avoids possible adulteration via normalization procedures. Ancestral structure is captured in a genotype-specific latent space, effectively removing it from shared components. Using GTEx v8 expression data across 10 tissues and genotype data, we show that the estimated topics capture meaningful and robust biological signal in both modalities and identify associations within and across tissue types. We identify 4,645 cis-eQTLs and ...
Uterine cancer is the fourth most common cancer among women, projected to affect 66,000 US women in 2021. Uterine cancer often arises in the inner lining of the uterus, known as the endometrium, but can present as several different types of cancer, including endometrioid cancer, serous adenocarcinoma, and uterine carcinosarcoma. Previous studies have analyzed the genetic changes between normal and cancerous uterine tissue to identify specific genes of interest, including TP53 and PTEN. Here we used Gaussian Mixture Models to build condition-specific gene coexpression networks for endometrial cancer, uterine carcinosarcoma, and normal uterine tissue. We then incorporated uterine regulatory edges and investigated potential coregulation relationships. These networks were further validated using differential expression analysis, functional enrichment, and a statistical analysis comparing the expression of transcription factors and their target genes across cancerous and normal uterine samples. These networks allow for a more comprehensive look into the biological networks and pathways affected in uterine cancer compared with previous singular gene analyses. We hope this study can be incorporated into existing knowledge surrounding the genetics of uterine cancer and soon become clinical biomarkers as a tool for better prognosis and treatment.