Abstract Single-cell RNA-sequencing (scRNA-Seq) is widely used to reveal the heterogeneity and dynamics of tissues, organisms, and complex diseases, but its analyses still suffer from multiple grand challenges, including the sequencing sparsity and complex differential patterns in gene expression. We introduce the scGNN (single-cell graph neural network) to provide a hypothesis-free deep learning framework for scRNA-Seq analyses. This framework formulates and aggregates cell–cell relationships with graph neural networks and models heterogeneous gene expression patterns using a left-truncated mixture Gaussian model. scGNN integrates three iterative multi-modal autoencoders and outperforms existing tools for gene imputation and cell clustering on four benchmark scRNA-Seq datasets. In an Alzheimer’s disease study with 13,214 single nuclei from postmortem brain tissues, scGNN successfully illustrated disease-related neural development and the differential mechanism. scGNN provides an effective representation of gene expression and cell–cell relationships. It is also a powerful framework that can be applied to general scRNA-Seq analyses.
Deep model predictive control of gene expression in thousands of single cells
Gene expression is inherently dynamic, due to complex regulation and stochastic biochemical events. However, the effects of these dynamics on cell phenotypes can be difficult to determine. Researchers have historically been limited to passive observations of natural dynamics, which can preclude studies of elusive and noisy cellular events where large amounts of data are required to reveal statistically significant effects. Here, using recent advances in the fields of machine learning and control theory, we train a deep neural network to accurately predict the response of an optogenetic system in Escherichia coli cells. We then use the network in a deep model predictive control framework to impose arbitrary and cell-specific gene expression dynamics on thousands of single cells in real time, applying the framework to generate complex time-varying patterns. We also showcase the framework’s ability to link expression patterns to dynamic functional outcomes by controlling expression of the tetA antibiotic resistance gene. This study highlights how deep learning-enabled feedback control can be used to tailor distributions of gene expression dynamics with high accuracy and throughput without expert knowledge of the biological system.
- Award ID(s): 2032357
- PAR ID: 10556446
- Publisher / Repository: Nature Communications
- Date Published:
- Journal Name: Nature Communications
- Volume: 15
- Issue: 1
- ISSN: 2041-1723
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Csikász-Nagy, Attila (Ed.) Large programs of dynamic gene expression, like cell cycles and circadian rhythms, are controlled by a relatively small “core” network of transcription factors and post-translational modifiers, working in concerted mutual regulation. Recent work suggests that system-independent, quantitative features of the dynamics of gene expression can be used to identify core regulators. We introduce an approach of iterative network hypothesis reduction from time-series data in which increasingly complex features of the dynamic expression of individual genes, pairs of genes, and entire collections of genes are used to infer functional network models that can produce the observed transcriptional program. The culmination of our work is a computational pipeline, Iterative Network Hypothesis Reduction from Temporal Dynamics (Inherent dynamics pipeline), that provides a priority listing of targets for genetic perturbation to experimentally infer network structure. We demonstrate the capability of this integrated computational pipeline on synthetic and yeast cell-cycle data.
-
Over the last decade, multiple studies have shown that signaling proteins activated in different temporal patterns, such as oscillatory, transient, and sustained, can result in distinct gene expression patterns or cell fates. However, the molecular events that ensure appropriate stimulus- and dose-dependent dynamics are not often understood and are difficult to investigate. Here, we used single-cell analysis to dissect the mechanisms underlying the stimulus- and dose-encoding patterns in the innate immune signaling network. We found that Toll-like receptor (TLR) and interleukin-1 receptor (IL-1R) signaling dynamics relied on a dose-dependent, autoinhibitory loop that rendered cells refractory to further stimulation. Using inducible gene expression and optogenetics to perturb the network at different levels, we identified IL-1R–associated kinase 1 (IRAK1) as the dose-sensing node responsible for limiting signal flow during the innate immune response. Although the kinase activity of IRAK1 was not required for signal propagation, it played a critical role in inhibiting the nucleocytoplasmic oscillations of the transcription factor NF-κB. Thus, protein activities that may be “dispensable” from a topological perspective can nevertheless be essential in shaping the dynamic response to the external environment.
-
Abstract Recently, lineage tracing technology using CRISPR/Cas9 genome editing has enabled simultaneous readouts of gene expression and lineage barcodes, which allows for the reconstruction of the cell division tree and makes it possible to reconstruct ancestral cell types and trace the origin of each cell type. Meanwhile, trajectory inference methods are widely used to infer cell trajectories and pseudotime in a dynamic process using gene expression data of present-day cells. Here, we present TedSim (single-cell temporal dynamics simulator), which simulates the cell division events from the root cell to present-day cells, simultaneously generating two data modalities for each single cell: the lineage barcode and gene expression data. TedSim is a framework that connects the two problems: lineage tracing and trajectory inference. Using TedSim, we conducted analysis to show that (i) TedSim generates realistic gene expression and barcode data, as well as realistic relationships between these two data modalities; (ii) trajectory inference methods can recover the underlying cell state transition mechanism with balanced cell type compositions; and (iii) integrating gene expression and barcode data can provide more insights into the temporal dynamics in cell differentiation compared to using only one type of data, but better integration methods need to be developed.
-
Single-cell genomic technologies offer vast new resources with which to study cells, but their potential to inform parameter inference of cell dynamics has yet to be fully realized. Here we develop methods for Bayesian parameter inference with data that jointly measure gene expression and Ca²⁺ dynamics in single cells. We propose to share information between cells via transfer learning: for a sequence of cells, the posterior distribution of one cell is used to inform the prior distribution of the next. In application to intracellular Ca²⁺ signalling dynamics, we fit the parameters of a dynamical model for thousands of cells with variable single-cell responses. We show that transfer learning accelerates inference with sequences of cells regardless of how the cells are ordered. However, only by ordering cells based on their transcriptional similarity can we distinguish Ca²⁺ dynamic profiles and associated marker genes from the posterior distributions. Inference results reveal complex and competing sources of cell heterogeneity: parameter covariation can diverge between the intracellular and intercellular contexts. Overall, we discuss the extent to which single-cell parameter inference informed by transcriptional similarity can quantify relationships between gene expression states and signalling dynamics in single cells.

