This content will become publicly available on May 8, 2026

Title: Optimal transport reveals dynamic gene regulatory networks via gene velocity estimation
Inferring gene regulatory networks from gene expression data is an important and challenging problem in the biology community. We propose OTVelo, a methodology that takes time-stamped single-cell gene expression data as input and predicts gene regulation across two time points. It is known that the rate of change of gene expression, which we will refer to as gene velocity, provides crucial information that enhances such inference; however, this information is not always available due to limitations in sequencing depth. Our algorithm overcomes this limitation by estimating gene velocities using optimal transport. We then infer gene regulation using time-lagged correlation and Granger causality via regularized linear regression. Instead of providing an aggregated network across all time points, our method uncovers the underlying dynamical mechanism across time points. We validate our algorithm on 13 simulated datasets with both synthetic and curated networks and demonstrate its efficacy on 9 experimental datasets.
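The two steps named in the abstract (velocity estimation by optimal transport, then a correlation between expression and velocity) can be illustrated with a minimal sketch. This is not the OTVelo implementation: the squared Euclidean cost, the entropic (Sinkhorn) solver, the uniform cell weights, the barycentric projection, and every function and parameter name below (`sinkhorn_plan`, `estimate_velocity`, `lagged_correlation`, `eps`, `dt`) are assumptions made for illustration. The correlation step is a simple stand-in, and the Granger-causality regression is not shown.

```python
import numpy as np

def sinkhorn_plan(X0, X1, eps=0.1, n_iter=500):
    """Entropic OT coupling between cells at two time points.

    X0, X1: (cells x genes) arrays. Uniform cell weights are an assumption.
    """
    n0, n1 = X0.shape[0], X1.shape[0]
    # Squared Euclidean cost between every early/late cell pair, scaled to [0, 1].
    C = ((X0[:, None, :] - X1[None, :, :]) ** 2).sum(axis=2)
    C = C / C.max()
    K = np.exp(-C / eps)
    a = np.full(n0, 1.0 / n0)
    b = np.full(n1, 1.0 / n1)
    u = np.ones(n0)
    v = np.ones(n1)
    for _ in range(n_iter):
        # Alternate scaling so that row and column sums match a and b.
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def estimate_velocity(X0, X1, dt=1.0, eps=0.1):
    """Finite-difference velocity for each early cell via barycentric projection."""
    P = sinkhorn_plan(X0, X1, eps=eps)
    # Row-normalise: each early cell gets a distribution over later cells.
    P_row = P / P.sum(axis=1, keepdims=True)
    X1_expected = P_row @ X1
    return (X1_expected - X0) / dt

def lagged_correlation(X0, V):
    """corr[i, j]: Pearson correlation of gene i expression with gene j velocity."""
    Xc = X0 - X0.mean(axis=0)
    Vc = V - V.mean(axis=0)
    cov = Xc.T @ Vc / X0.shape[0]
    return cov / (np.outer(Xc.std(axis=0), Vc.std(axis=0)) + 1e-12)

# Toy data only: 40 early cells and 50 later cells, 3 genes.
rng = np.random.default_rng(0)
X_early = rng.normal(size=(40, 3))
X_late = rng.normal(loc=1.0, size=(50, 3))
V_est = estimate_velocity(X_early, X_late)
score = lagged_correlation(X_early, V_est)  # (genes x genes) candidate edge scores
```

Entry `score[i, j]` can be read as a rough candidate score for gene i influencing the rate of change of gene j between the two time points. A dedicated optimal transport library would normally replace the hand-written Sinkhorn loop.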
Award ID(s):
2038039 2106566
PAR ID:
10589414
Editor(s):
Zhang, Xiuwei
Publisher / Repository:
PLOS
Journal Name:
PLOS Computational Biology
Volume:
21
Issue:
5
ISSN:
1553-7358
Page Range / eLocation ID:
e1012476
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Cowen, Lenore (Ed.)
    Abstract. Summary: Designing interventions to control gene regulation necessitates modeling a gene regulatory network by a causal graph. Currently, large-scale gene expression datasets from different conditions, cell types, disease states, and developmental time points are being collected. However, application of classical causal inference algorithms to infer gene regulatory networks based on such data is still challenging, requiring high sample sizes and computational resources. Here, we describe an algorithm that efficiently learns the differences in gene regulatory mechanisms between different conditions. Our difference causal inference (DCI) algorithm infers changes (i.e. edges that appeared, disappeared, or changed weight) between two causal graphs given gene expression data from the two conditions. This algorithm is efficient in its use of samples and computation since it infers the differences between causal graphs directly without estimating each possibly large causal graph separately. We provide a user-friendly Python implementation of DCI and also enable the user to learn the most robust difference causal graph across different tuning parameters via stability selection. Finally, we show how to apply DCI to single-cell RNA-seq data from different conditions and cell states, and we also validate our algorithm by predicting the effects of interventions. Availability and implementation: Python package freely available at http://uhlerlab.github.io/causaldag/dci. Supplementary information: Supplementary data are available at Bioinformatics online.
  2. In biology, cell-fate decisions are controlled by complex gene regulation. Although gene expression data may be collected at multiple time points, it remains difficult to construct the continuous dynamics from the data. In this work, we developed a data-driven approach, NeuralGene, a model based on neural ordinary differential equations (ODEs), to reconstruct continuous dynamical systems governing gene regulation from temporal gene expression data. In addition, NeuralGene has the flexibility of incorporating partial prior biological information in the model to further improve its accuracy. For a given cell at a static time point, the NeuralGene model can impute its continuous gene expression dynamics and predict its cell fate. We applied NeuralGene to a simulated toggle-switch model to verify its utility in modeling and reconstructing temporal dynamics. In addition, NeuralGene was applied to experimental single-cell qPCR data to show its ability for gene expression imputation and cell-fate prediction.
  3. Abstract The regulation of gene expression is central to many biological processes. Gene regulatory networks (GRNs) link transcription factors (TFs) to their target genes and represent maps of potential transcriptional regulation. Here, we analyzed a large number of publicly available maize (Zea mays) transcriptome data sets including >6000 RNA sequencing samples to generate 45 coexpression-based GRNs that represent potential regulatory relationships between TFs and other genes in different populations of samples (cross-tissue, cross-genotype, and tissue-and-genotype samples). While these networks are all enriched for biologically relevant interactions, different networks capture distinct TF-target associations and biological processes. By examining the power of our coexpression-based GRNs to accurately predict covarying TF-target relationships in natural variation data sets, we found that presence/absence changes rather than quantitative changes in TF gene expression are more likely associated with changes in target gene expression. Integrating information from our TF-target predictions and previous expression quantitative trait loci (eQTL) mapping results provided support for 68 TFs underlying 74 previously identified trans-eQTL hotspots spanning a variety of metabolic pathways. This study highlights the utility of developing multiple GRNs within a species to detect putative regulators of important plant pathways and provides potential targets for breeding or biotechnological applications.
  4. Abstract Allele-specific expression quantification from RNA-seq reads provides opportunities to study the control of gene regulatory networks by cis-acting and trans-acting genetic variants. Many existing methods performed a single-gene and single-SNP association analysis to identify expression quantitative trait loci (eQTLs), and placed the eQTLs against known gene networks for functional interpretation. Instead, we view eQTL data as a capture of the effects of perturbation of gene regulatory system by a large number of genetic variants and reconstruct a gene network perturbed by eQTLs. We introduce a statistical framework called CiTruss for simultaneously learning a gene network and cis-acting and trans-acting eQTLs that perturb this network, given population allele-specific expression and SNP data. CiTruss uses a multi-level conditional Gaussian graphical model to model trans-acting eQTLs perturbing the expression of both alleles in gene network at the top level and cis-acting eQTLs perturbing the expression of each allele at the bottom level. We derive a transformation of this model that allows efficient learning for large-scale human data. Our analysis of the GTEx and LG×SM advanced intercross line mouse data for multiple tissue types with CiTruss provides new insights into genetics of gene regulation. CiTruss revealed that gene networks consist of local subnetworks over proximally located genes and global subnetworks over genes scattered across genome, and that several aspects of gene regulation by eQTLs such as the impact of genetic diversity, pleiotropy, tissue-specific gene regulation, and local and long-range linkage disequilibrium among eQTLs can be explained through these local and global subnetworks.
  5. Cherry, J M (Ed.)
    Abstract The mechanisms that coordinate cellular gene expression are highly complex and intricately interconnected. Thus, it is necessary to move beyond a fully reductionist approach to understanding genetic information flow and begin focusing on the networked connections between genes that organize cellular function. Continued advancements in computational hardware, coupled with the development of gene correlation network algorithms, provide the capacity to study networked interactions between genes rather than their isolated functions. For example, gene coexpression networks are used to construct gene relationship networks using linear metrics such as Spearman or Pearson correlation. Recently, there have been tools designed to deepen these analyses by differentiating between intrinsic vs extrinsic noise within gene expression values, identifying different modules based on tissue phenotype, and capturing potential nonlinear relationships. In this report, we introduce an algorithm with a novel application of image-based segmentation modalities utilizing blob detection techniques applied for detecting bigenic edges in a gene expression matrix. We applied this algorithm, called EdgeCrafting, to a bulk RNA-sequencing gene expression matrix comprising healthy and cancerous kidney data. We then compared EdgeCrafting against 4 other RNA expression analysis techniques: Weighted Gene Correlation Network Analysis, Knowledge Independent Network Construction, NetExtractor, and Differential gene expression analysis.
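The correlation-based coexpression networks mentioned in the last entry above can be illustrated with a short sketch. It is a generic example, not the EdgeCrafting algorithm or any of the tools it is compared against. The function names (`spearman_matrix`, `coexpression_edges`), the 0.8 threshold, and the toy data are assumptions, and tied values are ranked by position rather than averaged, which a statistics library would handle properly.

```python
import numpy as np

def spearman_matrix(expr):
    """Gene-by-gene Spearman correlation for a (samples x genes) matrix.

    Ranks come from a double argsort, so tied values are NOT averaged
    (a simplification for this sketch).
    """
    ranks = expr.argsort(axis=0).argsort(axis=0).astype(float)
    # Pearson correlation of the ranks is the Spearman correlation.
    return np.corrcoef(ranks, rowvar=False)

def coexpression_edges(expr, threshold=0.8):
    """List (gene_i, gene_j, rho) for gene pairs with |rho| >= threshold."""
    rho = spearman_matrix(expr)
    n_genes = rho.shape[0]
    return [
        (i, j, float(rho[i, j]))
        for i in range(n_genes)
        for j in range(i + 1, n_genes)
        if abs(rho[i, j]) >= threshold
    ]

# Toy matrix: gene 1 is a monotone function of gene 0, gene 2 is unrelated.
x = np.arange(10.0)
toy_expr = np.column_stack([
    x,
    x ** 3,
    np.array([3.0, 1.0, 4.0, 1.5, 9.0, 2.0, 6.0, 5.0, 3.5, 8.0]),
])
edges = coexpression_edges(toy_expr)
```

Because Spearman correlation depends only on ranks, the monotone pair (genes 0 and 1) is linked even though the relationship is cubic, while gene 2 falls below the threshold.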