Abstract. Motivation: Genome-wide profiles of chromatin accessibility and gene expression in diverse cellular contexts are critical to decipher the dynamics of transcriptional regulation. Recently, convolutional neural networks have been used to learn predictive cis-regulatory DNA sequence models of context-specific chromatin accessibility landscapes. However, these context-specific regulatory sequence models cannot generalize predictions across cell types. Results: We introduce multi-modal, residual neural network architectures that integrate cis-regulatory sequence and context-specific expression of trans-regulators to predict genome-wide chromatin accessibility profiles across cellular contexts. We show that the average accessibility of a genomic region across training contexts can be a surprisingly powerful predictor. We leverage this feature and employ novel strategies for training models to enhance genome-wide prediction of shared and context-specific chromatin accessible sites across cell types. We interpret the models to reveal insights into cis- and trans-regulation of chromatin dynamics across 123 diverse cellular contexts. Availability and implementation: The code is available at https://github.com/kundajelab/ChromDragoNN. Supplementary information: Supplementary data are available at Bioinformatics online.
Minimal frustration underlies the usefulness of incomplete regulatory network models in biology
Regulatory networks as large and complex as those implicated in cell-fate choice are expected to exhibit intricate, very high-dimensional dynamics. Cell-fate choice, however, is a macroscopically simple process. Additionally, regulatory network models are almost always incomplete and/or inexact, and do not incorporate all the regulators and interactions that may be involved in cell-fate regulation. In spite of these issues, regulatory network models have proven to be incredibly effective tools for understanding cell-fate choice across contexts and for making useful predictions. Here, we show that minimal frustration—a feature of biological networks across contexts but not of random networks—can compel simple, low-dimensional steady-state behavior even in large and complex networks. Moreover, the steady-state behavior of minimally frustrated networks can be recapitulated by simpler networks such as those lacking many of the nodes and edges and those that treat multiple regulators as one. The present study provides a theoretical explanation for the success of network models in biology and for the challenges in network inference.
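The abstract does not define frustration, so the following is only a minimal sketch under an assumed convention: each regulatory edge carries a sign (+1 for activation, -1 for inhibition), each node state is +1 (ON) or -1 (OFF), and an edge counts as frustrated when the product of its sign and the two node states is negative. The `toggle` network and the `frustration` function are hypothetical illustrations, not taken from the paper.

```python
def frustration(edges, state):
    """Fraction of edges that are frustrated in a given state.

    edges: list of (source, target, sign), sign = +1 (activation) or -1 (inhibition)
    state: dict mapping node -> +1 (ON) or -1 (OFF)
    Assumed convention: an edge is frustrated if sign * s_source * s_target < 0.
    """
    frustrated = sum(
        1 for src, dst, sign in edges if sign * state[src] * state[dst] < 0
    )
    return frustrated / len(edges)


# Hypothetical toggle switch: A and B inhibit each other and activate themselves.
toggle = [("A", "B", -1), ("B", "A", -1), ("A", "A", +1), ("B", "B", +1)]

for a in (+1, -1):
    for b in (+1, -1):
        print(f"A={a:+d} B={b:+d} frustration={frustration(toggle, {'A': a, 'B': b})}")
```

In this toy case the two states with one node ON and the other OFF satisfy every edge, while the states with both nodes ON or both OFF leave the two inhibitory edges unsatisfied.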
- Award ID(s): 2019745
- PAR ID: 10417178
- Date Published:
- Journal Name: Proceedings of the National Academy of Sciences
- Volume: 120
- Issue: 1
- ISSN: 0027-8424
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Abstract Cell fate decisions emerge as a consequence of a complex set of gene regulatory networks. Models of these networks are known to have more parameters than data can determine. Recent work, inspired by Waddington's metaphor of a landscape, has instead tried to understand the geometry of gene regulatory networks. Here, we describe recent results on the appropriate mathematical framework for constructing these landscapes. This allows the construction of minimally parameterized models consistent with cell behavior. We review existing examples where geometrical models have been used to fit experimental data on cell fate and describe how spatial interactions between cells can be understood geometrically.
The dynamics of complex biological networks may be modeled in a Boolean framework, where the state of each system component is either abundant (ON) or scarce/absent (OFF), and each component's dynamic trajectory is determined by a logical update rule involving the state(s) of its regulator(s). It is possible to encode the update rules in the topology of the so-called expanded graph, analysis of which reveals the long-term behavior, or attractors, of the network. Here, we develop an algorithm to perturb the expanded graph (or, equivalently, the logical update rules) to eliminate stable motifs: subgraphs that cause a subset of components to stabilize to one state. Depending on the topology of the expanded graph, these perturbations lead to the modification or loss of the corresponding attractor. While most perturbations of biological regulatory networks in the literature involve the knockout (fixing to OFF) or constitutive activation (fixing to ON) of one or more nodes, we here consider edgetic perturbations, where a node's update rule is modified such that one or more of its regulators is viewed as ON or OFF regardless of its actual state. We apply the methodology to two biological networks. In a network representing T-LGL leukemia, we identify edgetic perturbations that eliminate the cancerous attractor, leaving only the healthy attractor representing cell death. In a network representing drought-induced closure of plant stomata, we identify edgetic perturbations that modify the single attractor such that stomata, instead of being fixed in the closed state, oscillate between the open and closed states.
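The Boolean framework and the idea of an edgetic perturbation can be illustrated with a small sketch. Everything here is an assumption made for illustration: the three-node network, the helper names `step`, `fixed_points`, and `edgetic`, and above all the method, which finds fixed-point attractors by brute-force enumeration rather than through the expanded graph and stable motifs that the abstract describes. Only fixed points are reported, since these are the same under synchronous and asynchronous updating.

```python
from itertools import product

NODES = ("A", "B", "C")

# Hypothetical update rules (not from the paper): A and B repress each other,
# and C simply follows A.
rules = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
    "C": lambda s: s["A"],
}


def step(rules, state):
    """Apply every node's update rule once to the given state."""
    return {node: bool(rule(state)) for node, rule in rules.items()}


def fixed_points(rules):
    """Enumerate all states and keep those that the update leaves unchanged."""
    points = []
    for values in product((False, True), repeat=len(NODES)):
        state = dict(zip(NODES, values))
        if step(rules, state) == state:
            points.append(state)
    return points


def edgetic(rules, target, regulator, seen_as):
    """Return rules in which `target` reads `regulator` as the constant `seen_as`.

    Only the edge regulator -> target changes; the regulator itself keeps
    updating and keeps its influence on every other node.
    """
    original = rules[target]
    perturbed = dict(rules)
    perturbed[target] = lambda s: original({**s, regulator: seen_as})
    return perturbed


print(fixed_points(rules))
print(fixed_points(edgetic(rules, "B", "A", False)))
```

In this toy network the unperturbed rules have two fixed points, and making B view A as OFF removes the one in which A is ON, while the edge from A to C is left intact. A node knockout of A would instead fix A to OFF for all of its targets.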
Abstract. Background: Cell and circadian cycles control a large fraction of cell and organismal physiology by regulating large periodic transcriptional programs that encompass anywhere from 15 to 80% of the genome despite performing distinct functions. In each case, these large periodic transcriptional programs are controlled by gene regulatory networks (GRNs), and it has been shown through genetics and chromosome mapping approaches in model systems that at the core of these GRNs are small sets of genes that drive the transcript dynamics of the GRNs. However, it is unlikely that we have identified all of these core genes, even in model organisms. Moreover, large periodic transcriptional programs controlling a variety of processes certainly exist in important non-model organisms where genetic approaches to identifying networks are expensive, time-consuming, or intractable. Ideally, the core network components could be identified using data-driven approaches on the transcriptome dynamics data already available. Results: This study shows that a unified set of quantified dynamic features of high-throughput time series gene expression data are more prominent in the core transcriptional regulators of cell and circadian cycles than in their outputs, in multiple organisms, even in the presence of external periodic stimuli. Additionally, we observe that the power to discriminate between core and non-core genes is largely insensitive to the particular choice of quantification of these features. Conclusions: There are practical applications of the approach presented in this study for network inference, since the result is a ranking of genes that is enriched for core regulatory elements driving a periodic phenotype. In this way, the method provides a prioritization of follow-up genetic experiments. Furthermore, these findings reveal something unexpected: that there are shared dynamic features of the transcript abundance of core components of unrelated GRNs that control disparate periodic phenotypes.
The emergence of and transitions between distinct phenotypes in isogenic cells can be attributed to the intricate interplay of epigenetic marks, external signals, and gene-regulatory elements. These elements include chromatin remodelers, histone modifiers, transcription factors, and regulatory RNAs. Mathematical models known as gene-regulatory networks (GRNs) are an increasingly important tool to unravel the workings of such complex networks. In such models, epigenetic factors are usually proposed to act on the chromatin regions directly involved in the expression of relevant genes. However, it has been well-established that these factors operate globally and compete with each other for targets genome-wide. Therefore, a perturbation of the activity of a regulator can redistribute epigenetic marks across the genome and modulate the levels of competing regulators. In this paper, we propose a conceptual and mathematical modeling framework that incorporates both local and global competition effects between antagonistic epigenetic regulators, in addition to local transcription factors, and show the counterintuitive consequences of such interactions. We apply our approach to recent experimental findings on the epithelial–mesenchymal transition (EMT). We show that it can explain the puzzling experimental data, as well as provide verifiable predictions.

