skip to main content

Title: Characterizing spatial gene expression heterogeneity in spatially resolved single-cell transcriptomic data with nonuniform cellular densities
Recent technological advances have enabled spatially resolved measurements of expression profiles for hundreds to thousands of genes in fixed tissues at single-cell resolution. However, scalable computational analysis methods able to take into consideration the inherent 3D spatial organization of cell types and nonuniform cellular densities within tissues are still lacking. To address this, we developed MERINGUE, a computational framework based on spatial autocorrelation and cross-correlation analysis to identify genes with spatially heterogeneous expression patterns, infer putative cell–cell communication, and perform spatially informed cell clustering in 2D and 3D in a density-agnostic manner using spatially resolved transcriptomic data. We applied MERINGUE to a variety of spatially resolved transcriptomic data sets including multiplexed error-robust fluorescence in situ hybridization (MERFISH), spatial transcriptomics, Slide-seq, and aligned in situ hybridization (ISH) data. We anticipate that such statistical analysis of spatially resolved transcriptomic data will facilitate our understanding of the interplay between cell state and spatial organization in tissue development and disease.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Genome Research
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Motivation

    Spatially resolved single-cell transcriptomics have provided unprecedented insights into gene expression in situ, particularly in the context of cell interactions or organization of tissues. However, current technologies for profiling spatial gene expression at single-cell resolution are generally limited to the measurement of a small number of genes. To address this limitation, several algorithms have been developed to impute or predict the expression of additional genes that were not present in the measured gene panel. Current algorithms do not leverage the rich spatial and gene relational information in spatial transcriptomics. To improve spatial gene expression predictions, we introduce Spatial Propagation and Reinforcement of Imputed Transcript Expression (SPRITE) as a meta-algorithm that processes predictions obtained from existing methods by propagating information across gene correlation networks and spatial neighborhood graphs.


    SPRITE improves spatial gene expression predictions across multiple spatial transcriptomics datasets. Furthermore, SPRITE predicted spatial gene expression leads to improved clustering, visualization, and classification of cells. SPRITE can be used in spatial transcriptomics data analysis to improve inferences based on predicted gene expression.

    Availability and implementation

    The SPRITE software package is available at Code for generating experiments and analyses in the manuscript is available at

    more » « less
  2. Abstract

    Tissue development and disease lead to changes in cellular organization, nuclear morphology, and gene expression, which can be jointly measured by spatial transcriptomic technologies. However, methods for jointly analyzing the different spatial data modalities in 3D are still lacking. We present a computational framework to integrate Spatial Transcriptomic data using over-parameterized graph-based Autoencoders with Chromatin Imaging data (STACI) to identify molecular and functional alterations in tissues. STACI incorporates multiple modalities in a single representation for downstream tasks, enables the prediction of spatial transcriptomic data from nuclear images in unseen tissue sections, and provides built-in batch correction of gene expression and tissue morphology through over-parameterization. We apply STACI to analyze the spatio-temporal progression of Alzheimer’s disease and identify the associated nuclear morphometric and coupled gene expression features. Collectively, we demonstrate the importance of characterizing disease progression by integrating multiple data modalities and its potential for the discovery of disease biomarkers.

    more » « less
  3. Spatially resolved scRNA-seq (sp-scRNA-seq) technologies provide the potential to comprehensively profile gene expression patterns in tissue context. However, the development of computational methods lags behind the advances in these technologies, which limits the fulfillment of their potential. In this study, we develop a deep learning approach for clustering sp-scRNA-seq data, named Deep Spatially constrained Single-cell Clustering (DSSC). In this model, we integrate the spatial information of cells into the clustering process in two steps: (1) the spatial information is encoded by using a graphical neural network model, and (2) cell-to-cell constraints are built based on the spatial expression pattern of the marker genes and added in the model to guide the clustering process. Then, a deep embedding clustering is performed on the bottleneck layer of autoencoder by Kullback–Leibler (KL) divergence along with the learning of feature representation. DSSC is the first model that can use information from both spatial coordinates and marker genes to guide cell/spot clustering. Extensive experiments on both simulated and real data sets show that DSSC boosts clustering performance significantly compared with the state-of-the-art methods. It has robust performance across different data sets with various cell type/tissue organization and/or cell type/tissue spatial dependency. We conclude that DSSC is a promising tool for clustering sp-scRNA-seq data. 
    more » « less
  4. Abstract

    Spatially resolved genomic technologies have allowed us to study the physical organization of cells and tissues, and promise an understanding of local interactions between cells. However, it remains difficult to precisely align spatial observations across slices, samples, scales, individuals and technologies. Here, we propose a probabilistic model that aligns spatially-resolved samples onto a known or unknown common coordinate system (CCS) with respect to phenotypic readouts (for example, gene expression). Our method, Gaussian Process Spatial Alignment (GPSA), consists of a two-layer Gaussian process: the first layer maps observed samples’ spatial locations onto a CCS, and the second layer maps from the CCS to the observed readouts. Our approach enables complex downstream spatially aware analyses that are impossible or inaccurate with unaligned data, including an analysis of variance, creation of a dense three-dimensional (3D) atlas from sparse two-dimensional (2D) slices or association tests across data modalities.

    more » « less
  5. Seeds, which provide a major source of calories for humans, are a unique stage of a flowering plant’s lifecycle. During seed germination the embryo reactivates rapidly and goes through major developmental transitions to become a seedling. This requires extensive and complex spatiotemporal coordination of cell and tissue activity. Existing gene expression profiling methods, such as laser capture microdissection followed by RNA-seq and single-cell RNA7 seq, suffer from either low throughput or the loss of spatial information about the cells analysed. Spatial transcriptomics methods couple high throughput analysis of gene expression simultaneously with the ability to record the spatial location of each individual region analysed. We developed a spatial transcriptomics workflow for germinating barley grain to better understand the spatiotemporal control of gene expression within individual seed cell types. More than 14,000 genes were differentially regulated across 0, 1, 3, 6 and 24 hours after imbibition. This approach enabled us to observe that many functional categories displayed specific spatial expression patterns that could be resolved at a sub-tissue level. Individual aquaporin gene family members, important for water and ion transport, had specific spatial expression patterns over time, as well as genes related to cell wall modification, membrane transport and transcription factors. Using spatial autocorrelation algorithms, we were able to identify auxin transport genes that had increasingly focused expression within subdomains of the embryo over germination time, suggestive of a role in establishment of the embryo axis. Together, our data provides an unprecedented spatially resolved cellular map for barley grain germination and specific genes to target for functional genomics to define cellular restricted processes in tissues during germination. The data can be viewed at 
    more » « less