

Title: Single-Cell Multi-Modal GAN (scMMGAN) reveals spatial patterns in single-cell data from triple negative breast cancer
Exciting advances in technologies to measure biological systems are currently at the forefront of research. The ability to gather data along an increasing number of omic dimensions has created a need for tools to analyze all of this information together, rather than siloing each technology into separate analysis pipelines. To advance this goal, we introduce a framework called the Single-Cell Multi-Modal GAN (scMMGAN) that integrates data from multiple modalities into a unified representation in the ambient data space for downstream analysis using a combination of adversarial learning and data geometry techniques. The framework’s key improvement is an additional diffusion geometry loss with a new kernel that constrains the otherwise over-parameterized GAN network. We demonstrate scMMGAN’s ability to produce more meaningful alignments than alternative methods on a wide variety of data modalities, and that its output can be used to draw conclusions from real-world biological experimental data. We highlight data from an experiment studying the development of triple negative breast cancer, where we show how scMMGAN can be used to identify novel gene associations and we demonstrate that cell clusters identified only on the scRNAseq data occur in localized spatial patterns that reveal insights on the spatial transcriptomic images.
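The abstract does not specify the form of the diffusion geometry loss or its kernel. As a minimal sketch of the underlying idea, the snippet below builds a row-normalized Gaussian affinity (a simple diffusion operator) on the source data and on its mapped image, and penalizes their difference, so a mapping is rewarded for preserving local data geometry. The function names, the Gaussian kernel, and the bandwidth `sigma` are illustrative assumptions, not scMMGAN's actual kernel or loss.

```python
import numpy as np

def diffusion_operator(X, sigma=1.0):
    """Row-normalized Gaussian affinity matrix (a simple diffusion kernel)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * sigma ** 2))
    return K / K.sum(axis=1, keepdims=True)

def geometry_loss(X_src, X_mapped, sigma=1.0):
    """Penalize changes in local data geometry introduced by the mapping."""
    P_src = diffusion_operator(X_src, sigma)
    P_map = diffusion_operator(X_mapped, sigma)
    return np.linalg.norm(P_src - P_map) ** 2 / X_src.shape[0]

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
# A rotation preserves all pairwise distances, so the loss is ~0.
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))
print(geometry_loss(X, X @ Q))                      # ≈ 0 (isometry)
print(geometry_loss(X, rng.normal(size=(50, 5))))   # > 0 (geometry destroyed)
```

In a GAN setting, a term like this would be added to the adversarial objective to constrain the otherwise under-determined cross-modality mapping.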
Award ID(s): 2047856
NSF-PAR ID: 10352699
Journal Name: Patterns
ISSN: 2666-3899
Sponsoring Org: National Science Foundation
More Like this
  1.
    Glass nanopipettes have shown promise for applications in single-cell manipulation, analysis, and imaging. In recent years, plasmonic nanopipettes have been developed to enable surface-enhanced Raman spectroscopy (SERS) measurements for single-cell analysis. In this work, we developed a SERS-active nanopipette that can be used to perform long-term and reliable intracellular analysis of single living cells with minimal damage, which is achieved by optimizing the nanopipette geometry and the surface density of the gold nanoparticle (AuNP) layer at the nanopipette tip. To demonstrate its ability in single-cell analysis, we used the nanopipette for intracellular pH sensing. Intracellular pH (pHi) is vital to cells as it influences cell function, behavior, and pathological conditions. The pH sensitivity was realized by simply modifying the AuNP layer with the pH reporter molecule 4-mercaptobenzoic acid. With a response time of less than 5 seconds, the pH sensing range is from 6.0 to 8.0 and the maximum sensitivity is 0.2 pH units. We monitored the pHi change of individual HeLa and fibroblast cells, triggered by the extracellular pH (pHe) change. The HeLa cancer cells can better resist pHe change and adapt to the weak acidic environment. Plasmonic nanopipettes can be further developed to monitor other intracellular biomarkers.
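In SERS pH sensing of this kind, a spectral feature (for instance a peak-intensity ratio of the reporter molecule) is converted to pH via a calibration curve measured in buffers. The sketch below interpolates over a hypothetical calibration table spanning the 6.0-8.0 range quoted in the abstract; the ratio values themselves are made up for illustration and are not from the paper.

```python
import numpy as np

# Hypothetical calibration table: SERS peak-intensity ratio vs. buffer pH.
# The ratio values are illustrative only, not measured data.
cal_ph    = np.array([6.0, 6.5, 7.0, 7.5, 8.0])
cal_ratio = np.array([0.20, 0.32, 0.45, 0.60, 0.72])  # monotone in pH

def ratio_to_ph(ratio):
    """Convert a measured peak ratio to pH by linear interpolation.
    np.interp clamps to the endpoints, i.e. the 6.0-8.0 validated range."""
    return float(np.interp(ratio, cal_ratio, cal_ph))

print(ratio_to_ph(0.45))  # 7.0 (exactly on a calibration point)
```

A real calibration for 4-mercaptobenzoic acid would typically use the ratio of a pH-sensitive band (e.g., the carboxylate stretch) to a pH-insensitive ring mode.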
  2. Abstract

    Motivation
    Single-cell RNA sequencing (scRNAseq) technologies allow for measurements of gene expression at a single-cell resolution. This provides researchers with a tremendous advantage for detecting heterogeneity, delineating cellular maps or identifying rare subpopulations. However, a critical complication remains: the low number of single-cell observations due to limitations by rarity of subpopulation, tissue degradation or cost. This absence of sufficient data may cause inaccuracy or irreproducibility of downstream analysis. In this work, we present Automated Cell-Type-informed Introspective Variational Autoencoder (ACTIVA): a novel framework for generating realistic synthetic data using a single-stream adversarial variational autoencoder conditioned with cell-type information. Within a single framework, ACTIVA can enlarge existing datasets and generate specific subpopulations on demand, as opposed to two separate models [such as single-cell GAN (scGAN) and conditional scGAN (cscGAN)]. Data generation and augmentation with ACTIVA can enhance scRNAseq pipelines and analysis, such as benchmarking new algorithms, studying the accuracy of classifiers and detecting marker genes. ACTIVA will facilitate analysis of smaller datasets, potentially reducing the number of patients and animals necessary in initial studies.

    Results

    We train and evaluate models on multiple public scRNAseq datasets. Compared with GAN-based models (scGAN and cscGAN), ACTIVA generates cells that are more realistic, harder for classifiers to identify as synthetic, and better at preserving pairwise correlations between genes. Data augmentation with ACTIVA significantly improves classification of rare subtypes (more than 45% improvement over no augmentation and 4% better than cscGAN), all while reducing run-time by an order of magnitude relative to both models.

    Availability and implementation

    The codes and datasets are hosted on Zenodo (https://doi.org/10.5281/zenodo.5879639). Tutorials are available at https://github.com/SindiLab/ACTIVA.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

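The core mechanism described above, i.e. generating cells of a requested type on demand, amounts to conditioning a generative model's decoder on a cell-type label. The sketch below shows the conditioning pattern with a fixed random linear decoder: a one-hot label is concatenated to the latent code before decoding. The weights, dimensions, and function name are illustrative assumptions; in ACTIVA the decoder is a learned adversarial variational autoencoder, not a random linear map.

```python
import numpy as np

rng = np.random.default_rng(1)
n_types, latent_dim, n_genes = 3, 8, 20

# Hypothetical decoder weights; in a real model these would be learned.
W = rng.normal(size=(latent_dim + n_types, n_genes)) * 0.1

def generate(cell_type, n_cells):
    """Sample latent codes, concatenate a one-hot cell-type label,
    and decode to non-negative expression-like values (ReLU)."""
    z = rng.normal(size=(n_cells, latent_dim))
    onehot = np.zeros((n_cells, n_types))
    onehot[:, cell_type] = 1.0
    return np.maximum(np.concatenate([z, onehot], axis=1) @ W, 0.0)

cells = generate(cell_type=2, n_cells=5)
print(cells.shape)  # (5, 20)
```

Generated cells of a rare type can then be appended to the training set to rebalance a downstream classifier, which is the augmentation use-case the abstract benchmarks.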
  3. Abstract

    Neural communication networks form the fundamental basis for brain function. These networks are enabled by emitted ligands, such as neurotransmitters, that activate receptor complexes to facilitate communication; neural communication is therefore fundamentally dependent on the transcriptome. Here we develop NeuronChat, a method and package for inferring, visualizing and analyzing neural-specific communication networks among predefined cell groups using single-cell expression data. We incorporate a manually curated molecular interaction database of neural signaling for both human and mouse, and benchmark NeuronChat on several published datasets to validate its ability to predict neural connectivity. We then apply NeuronChat to three neural tissue datasets to illustrate its functionality in identifying interneuronal communication networks, revealing conserved or context-specific interactions across biological contexts, and predicting changes in communication patterns in brains affected by autism spectrum disorder. Finally, we demonstrate that NeuronChat can use spatial transcriptomics data to infer and visualize neural-specific cell-cell communication.

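The basic quantity behind ligand-receptor communication inference can be sketched simply: for one curated ligand-receptor pair, score each (sender group, receiver group) pair by mean ligand expression in the sender times mean receptor expression in the receiver. This is a deliberately simplified stand-in for NeuronChat's actual scoring, which aggregates curated interaction components and performs statistical testing; the gene names and data below are toy examples.

```python
import numpy as np

def communication_score(expr, groups, ligand, receptor, genes):
    """Mean ligand expression per sender group x mean receptor expression
    per receiver group -> a groups-by-groups score matrix."""
    genes = list(genes)
    li, ri = genes.index(ligand), genes.index(receptor)
    labels = sorted(set(groups))
    mask = np.array(groups)
    lig = np.array([expr[mask == g, li].mean() for g in labels])
    rec = np.array([expr[mask == g, ri].mean() for g in labels])
    return np.outer(lig, rec), labels

rng = np.random.default_rng(2)
genes = ["Glul", "Gria1", "OtherGene"]  # toy gene panel
expr = rng.poisson(2.0, size=(60, 3)).astype(float)  # 60 cells x 3 genes
groups = ["exc"] * 30 + ["inh"] * 30
scores, labels = communication_score(expr, groups, "Glul", "Gria1", genes)
print(labels, scores.shape)  # ['exc', 'inh'] (2, 2)
```

Entry `scores[i, j]` estimates signaling from group `labels[i]` to group `labels[j]`; summing such matrices over all curated interactions would yield an aggregate communication network.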
  4. Abstract

    Imaging flow cytometry (IFC) combines flow cytometry and fluorescence microscopy to enable high-throughput, multiparametric single-cell analysis with rich spatial details. However, current IFC techniques remain limited in their ability to reveal subcellular information with a high 3D resolution, throughput, sensitivity, and instrumental simplicity. In this study, we introduce a light-field flow cytometer (LFC), an IFC system capable of high-content, single-shot, and multi-color acquisition of up to 5,750 cells per second with a near-diffraction-limited resolution of 400-600 nm in all three dimensions. The LFC system integrates optical, microfluidic, and computational strategies to facilitate the volumetric visualization of various 3D subcellular characteristics through convenient access to commonly used epi-fluorescence platforms. We demonstrate the effectiveness of LFC in assaying, analyzing, and enumerating intricate subcellular morphology, function, and heterogeneity using various phantoms and biological specimens. The advancement offered by the LFC system presents a promising methodological pathway for broad cell biological and translational discoveries, with the potential for widespread adoption in biomedical research.

  5. Coelho, Luis Pedro (Ed.)
    Improvements in microscopy software and hardware have dramatically increased the pace of image acquisition, making analysis a major bottleneck in generating quantitative, single-cell data. Although tools for segmenting and tracking bacteria within time-lapse images exist, most require human input, are specialized to the experimental setup, or lack accuracy. Here, we introduce DeLTA 2.0, a pure-Python workflow that can rapidly and accurately analyze images of single cells on two-dimensional surfaces to quantify gene expression and cell growth. The algorithm uses deep convolutional neural networks to extract single-cell information from time-lapse images, requiring no human input after training. DeLTA 2.0 retains all the functionality of the original version, which was optimized for bacteria growing in the mother machine microfluidic device, but extends results to two-dimensional growth environments. Two-dimensional environments represent an important class of data because they are more straightforward to implement experimentally, they offer the potential for studies using co-cultures of cells, and they can be used to quantify spatial effects and multi-generational phenomena. However, segmentation and tracking are significantly more challenging tasks in two dimensions due to exponential increases in the number of cells. To showcase this new functionality, we analyze mixed populations of antibiotic-resistant and susceptible cells, and also track pole age and growth rate across generations. In addition to the two-dimensional capabilities, we also introduce several major improvements to the code that increase accessibility, including the ability to accept many standard microscopy file formats as inputs and the introduction of a Google Colab notebook so users can try the software without installing the code on their local machine.
DeLTA 2.0 is rapid, with run times of less than 10 minutes for complete movies with hundreds of cells, and is highly accurate, with error rates around 1%, making it a powerful tool for analyzing time-lapse microscopy data. 
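The tracking step that becomes hard in two dimensions can be illustrated with the simplest possible baseline: link each cell in frame t to the cell in frame t+1 whose segmentation mask overlaps it most. This is a toy stand-in for DeLTA 2.0's learned tracking network, shown only to make the problem concrete; the function name and data are illustrative.

```python
import numpy as np

def link_by_overlap(labels_t, labels_t1):
    """Match each labeled cell in frame t to the frame t+1 label with the
    largest mask overlap. Label 0 is background. Returns {old_id: new_id}."""
    links = {}
    for cid in np.unique(labels_t):
        if cid == 0:
            continue
        overlap = labels_t1[labels_t == cid]   # t+1 labels under cell cid's mask
        overlap = overlap[overlap != 0]
        if overlap.size:
            vals, counts = np.unique(overlap, return_counts=True)
            links[int(cid)] = int(vals[np.argmax(counts)])
    return links

# Toy 1x8 label "images": both cells drift right by one pixel between frames.
f0 = np.array([[0, 1, 1, 1, 0, 2, 2, 0]])
f1 = np.array([[0, 0, 5, 5, 5, 0, 7, 7]])
print(link_by_overlap(f0, f1))  # {1: 5, 2: 7}
```

A greedy overlap matcher like this fails on divisions and dense colonies, which is precisely why DeLTA's deep-learning tracker, trained on annotated lineages, is needed at scale.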