skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Single-Cell Multi-Modal GAN (scMMGAN) reveals spatial patterns in single-cell data from triple negative breast cancer
Exciting advances in technologies to measure biological systems are currently at the forefront of research. The ability to gather data along an increasing number of omic dimensions has created a need for tools to analyze all of this information together, rather than siloing each technology into separate analysis pipelines. To advance this goal, we introduce a framework called the Single-Cell Multi-Modal GAN (scMMGAN) that integrates data from multiple modalities into a unified representation in the ambient data space for downstream analysis using a combination of adversarial learning and data geometry techniques. The framework’s key improvement is an additional diffusion geometry loss with a new kernel that constrains the otherwise over-parameterized GAN network. We demonstrate scMMGAN’s ability to produce more meaningful alignments than alternative methods on a wide variety of data modalities, and that its output can be used to draw conclusions from real-world biological experimental data. We highlight data from an experiment studying the development of triple negative breast cancer, where we show how scMMGAN can be used to identify novel gene associations and we demonstrate that cell clusters identified only on the scRNAseq data occur in localized spatial patterns that reveal insights on the spatial transcriptomic images.  more » « less
Award ID(s):
2047856
PAR ID:
10352699
Author(s) / Creator(s):
Date Published:
Journal Name:
Patterns
ISSN:
2666-3899
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Although single-cell and spatial sequencing methods enable simultaneous measurement of more than one biological modality, no technology can capture all modalities within the same cell. For current data integration methods, the feasibility of cross-modal integration relies on the existence of highly correlated, a priori ‘linked’ features. We describe matching X-modality via fuzzy smoothed embedding (MaxFuse), a cross-modal data integration method that, through iterative coembedding, data smoothing and cell matching, uses all information in each modality to obtain high-quality integration even when features are weakly linked. MaxFuse is modality-agnostic and demonstrates high robustness and accuracy in the weak linkage scenario, achieving 20~70% relative improvement over existing methods under key evaluation metrics on benchmarking datasets. A prototypical example of weak linkage is the integration of spatial proteomic data with single-cell sequencing data. On two example analyses of this type, MaxFuse enabled the spatial consolidation of proteomic, transcriptomic and epigenomic information at single-cell resolution on the same tissue section. 
    more » « less
  2. Multi-modal single cell RNA assays capture RNA content as well as other data modalities, such as spatial cell position or the electrophysiological properties of cells. Compared to dedicated scRNA-seq assays however, they may unintentionally capture RNA from multiple adjacent cells, exhibit lower RNA sequencing depth compared to scRNA-seq, or lack genome-wide RNA measurements. We present scProjection, a method for mapping individual multi-modal RNA measurements to deeply sequenced scRNA-seq atlases to extract cell type-specific, single cell gene expression profiles. We demonstrate several use cases of scProjection, including the identification of spatial motifs from spatial transcriptome assays, distinguishing RNA contributions from neighboring cells in both spatial and multi-modal single cell assays, and imputing expression measurements of un-measured genes from gene markers. scProjection therefore combines the advantages of both multi-modal and scRNA-seq assays to yield precise multi-modal measurements of single cells. 
    more » « less
  3. Abstract Recently, lineage tracing technology using CRISPR/Cas9 genome editing has enabled simultaneous readouts of gene expressions and lineage barcodes, which allows for the reconstruction of the cell division tree and makes it possible to reconstruct ancestral cell types and trace the origin of each cell type. Meanwhile, trajectory inference methods are widely used to infer cell trajectories and pseudotime in a dynamic process using gene expression data of present-day cells. Here, we present TedSim (single-cell temporal dynamics simulator), which simulates the cell division events from the root cell to present-day cells, simultaneously generating two data modalities for each single cell: the lineage barcode and gene expression data. TedSim is a framework that connects the two problems: lineage tracing and trajectory inference. Using TedSim, we conducted analysis to show that (i) TedSim generates realistic gene expression and barcode data, as well as realistic relationships between these two data modalities; (ii) trajectory inference methods can recover the underlying cell state transition mechanism with balanced cell type compositions; and (iii) integrating gene expression and barcode data can provide more insights into the temporal dynamics in cell differentiation compared to using only one type of data, but better integration methods need to be developed. 
    more » « less
  4. In recent years, there has been a growing interest in profiling multiomic modalities within individual cells simultaneously. One such example is integrating combined single-cell RNA sequencing (scRNA-seq) data and single-cell transposase-accessible chromatin sequencing (scATAC-seq) data. Integrated analysis of diverse modalities has helped researchers make more accurate predictions and gain a more comprehensive understanding than with single-modality analysis. However, generating such multimodal data is technically challenging and expensive, leading to limited availability of single-cell co-assay data. Here, we propose a model for cross-modal prediction between the transcriptome and chromatin profiles in single cells. Our model is based on a deep neural network architecture that learns the latent representations from the source modality and then predicts the target modality. It demonstrates reliable performance in accurately translating between these modalities across multiple paired human scATAC-seq and scRNA-seq datasets. Additionally, we developed CrossMP, a web-based portal allowing researchers to upload their single-cell modality data through an interactive web interface and predict the other type of modality data, using high-performance computing resources plugged at the backend. 
    more » « less
  5. Abstract Neural communication networks form the fundamental basis for brain function. These communication networks are enabled by emitted ligands such as neurotransmitters, which activate receptor complexes to facilitate communication. Thus, neural communication is fundamentally dependent on the transcriptome. Here we develop NeuronChat, a method and package for the inference, visualization and analysis of neural-specific communication networks among pre-defined cell groups using single-cell expression data. We incorporate a manually curated molecular interaction database of neural signaling for both human and mouse, and benchmark NeuronChat on several published datasets to validate its ability in predicting neural connectivity. Then, we apply NeuronChat to three different neural tissue datasets to illustrate its functionalities in identifying interneural communication networks, revealing conserved or context-specific interactions across different biological contexts, and predicting communication pattern changes in diseased brains with autism spectrum disorder. Finally, we demonstrate NeuronChat can utilize spatial transcriptomics data to infer and visualize neural-specific cell-cell communication. 
    more » « less