
Title: GNTD: reconstructing spatial transcriptomes with graph-guided neural tensor decomposition informed by spatial and functional relations

Spatially-resolved RNA profiling has now been widely used to understand cells’ structural organizations and functional roles in tissues, yet it is challenging to reconstruct the whole spatial transcriptomes due to various inherent technical limitations in tissue section preparation and RNA capture and fixation in the application of the spatial RNA profiling technologies. Here, we introduce a graph-guided neural tensor decomposition (GNTD) model for reconstructing whole spatial transcriptomes in tissues. GNTD employs a hierarchical tensor structure and formulation to explicitly model the high-order spatial gene expression data with a hierarchical nonlinear decomposition in a three-layer neural network, enhanced by spatial relations among the capture spots and gene functional relations for accurate reconstruction from highly sparse spatial profiling data. Extensive experiments on 22 Visium spatial transcriptomics datasets and 3 high-resolution Stereo-seq datasets as well as simulation data demonstrate that GNTD consistently improves the imputation accuracy in cross-validations driven by nonlinear tensor decomposition and incorporation of spatial and functional information, and confirm that the imputed spatial transcriptomes provide a more complete gene expression landscape for downstream analyses of cell/spot clustering for tissue segmentation, and spatial gene expression clustering and visualizations.
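
To make the idea of graph-guided tensor imputation concrete, the following is a minimal sketch, not the GNTD model itself. It assumes a plain (linear) CP decomposition of a genes × x × y tensor, fitted by gradient descent on observed entries only, with graph-Laplacian penalties that encourage neighbouring spatial coordinates to share similar factors. The function names, the chain-graph spatial relation, the toy data, and all hyperparameters are illustrative assumptions; the nonlinear three-layer network and the gene functional graph described in the abstract are not reproduced here.

```python
import numpy as np

def chain_laplacian(n):
    """Laplacian of a path graph linking neighbouring coordinates (assumed spatial relation)."""
    adj = np.zeros((n, n))
    idx = np.arange(n - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return np.diag(adj.sum(axis=1)) - adj

def graph_regularized_cp(tensor, mask, rank=3, lam=0.05, lr=0.005, n_iter=2000, seed=0):
    """Fit a rank-`rank` CP model to the observed entries of `tensor` (mask == 1).

    Returns the dense reconstruction and the loss recorded at every iteration.
    """
    rng = np.random.default_rng(seed)
    n_g, n_x, n_y = tensor.shape
    A = rng.uniform(0.1, 1.0, size=(n_g, rank))   # gene factors
    B = rng.uniform(0.1, 1.0, size=(n_x, rank))   # x-coordinate factors
    C = rng.uniform(0.1, 1.0, size=(n_y, rank))   # y-coordinate factors
    Lx, Ly = chain_laplacian(n_x), chain_laplacian(n_y)
    losses = []
    for _ in range(n_iter):
        recon = np.einsum("ir,jr,kr->ijk", A, B, C)
        err = mask * (recon - tensor)              # residual on observed entries only
        loss = (err ** 2).sum() + lam * (np.trace(B.T @ Lx @ B) + np.trace(C.T @ Ly @ C))
        losses.append(loss)
        # Gradients of the masked squared error plus the Laplacian penalties.
        gA = 2 * np.einsum("ijk,jr,kr->ir", err, B, C)
        gB = 2 * np.einsum("ijk,ir,kr->jr", err, A, C) + 2 * lam * Lx @ B
        gC = 2 * np.einsum("ijk,ir,jr->kr", err, A, B) + 2 * lam * Ly @ C
        A, B, C = A - lr * gA, B - lr * gB, C - lr * gC
    return np.einsum("ir,jr,kr->ijk", A, B, C), losses

# Toy example: a synthetic rank-2 tensor with roughly 70% of entries hidden.
rng = np.random.default_rng(1)
true = np.einsum("ir,jr,kr->ijk",
                 rng.uniform(size=(6, 2)), rng.uniform(size=(5, 2)), rng.uniform(size=(5, 2)))
mask = (rng.uniform(size=true.shape) < 0.3).astype(float)
recon, losses = graph_regularized_cp(true * mask, mask)
```

The entries of `recon` where `mask == 0` play the role of imputed values; in a cross-validation setting they would be compared against the held-out entries of `true`.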

Publisher / Repository:
Nature Publishing Group
Journal Name:
Nature Communications
Sponsoring Org:
National Science Foundation
More Like this
  1. Seeds, which provide a major source of calories for humans, are a unique stage of a flowering plant’s lifecycle. During seed germination the embryo reactivates rapidly and goes through major developmental transitions to become a seedling. This requires extensive and complex spatiotemporal coordination of cell and tissue activity. Existing gene expression profiling methods, such as laser capture microdissection followed by RNA-seq and single-cell RNA-seq, suffer from either low throughput or the loss of spatial information about the cells analysed. Spatial transcriptomics methods couple high-throughput analysis of gene expression with the ability to record the spatial location of each individual region analysed. We developed a spatial transcriptomics workflow for germinating barley grain to better understand the spatiotemporal control of gene expression within individual seed cell types. More than 14,000 genes were differentially regulated across 0, 1, 3, 6 and 24 hours after imbibition. This approach enabled us to observe that many functional categories displayed specific spatial expression patterns that could be resolved at a sub-tissue level. Individual aquaporin gene family members, important for water and ion transport, had specific spatial expression patterns over time, as did genes related to cell wall modification, membrane transport and transcription factors. Using spatial autocorrelation algorithms, we were able to identify auxin transport genes that had increasingly focused expression within subdomains of the embryo over germination time, suggestive of a role in establishment of the embryo axis. Together, our data provide an unprecedented spatially resolved cellular map for barley grain germination and specific genes to target for functional genomics to define cellular restricted processes in tissues during germination. The data can be viewed at 
  2. Spatial transcriptomics (ST) technologies are rapidly becoming the extension of single-cell RNA sequencing (scRNAseq), holding the potential of profiling gene expression at a single-cell resolution while maintaining cellular compositions within a tissue. Having both expression profiles and tissue organization enables researchers to better understand cellular interactions and heterogeneity, providing insight into complex biological processes that would not be possible with traditional sequencing technologies. Data generated by ST technologies are inherently noisy, high-dimensional, sparse, and multi-modal (including histological images, count matrices, etc.), thus requiring specialized computational tools for accurate and robust analysis. However, many ST studies currently utilize traditional scRNAseq tools, which are inadequate for analyzing complex ST datasets. On the other hand, many of the existing ST-specific methods are built upon traditional statistical or machine learning frameworks, which have been shown to be sub-optimal in many applications due to the scale, multi-modality, and limitations of spatially resolved data (such as spatial resolution, sensitivity, and gene coverage). Given these intricacies, researchers have developed deep learning (DL)-based models to alleviate ST-specific challenges. These methods include new state-of-the-art models in alignment, spatial reconstruction, and spatial clustering, among others. However, DL models for ST analysis are nascent and remain largely underexplored. In this review, we provide an overview of existing state-of-the-art tools for analyzing spatially resolved transcriptomics while delving deeper into the DL-based approaches. We discuss the new frontiers and the open questions in this field and highlight domains in which we anticipate transformational DL applications.
  3. Abstract Background

    In the past few years, there has been an explosion in single-cell transcriptomics datasets, yet in vivo confirmation of these datasets is hampered in plants due to lack of robust validation methods. Likewise, modeling of plant development is hampered by paucity of spatial gene expression data. RNA fluorescence in situ hybridization (FISH) enables investigation of gene expression in the context of tissue type. Despite development of FISH methods for plants, easy and reliable whole mount FISH protocols have not yet been reported.


    We adapt a 3-day whole mount RNA-FISH method for plant species based on a combination of prior protocols that employs hybridization chain reaction (HCR), which amplifies the probe signal in an antibody-free manner. Our whole mount HCR RNA-FISH method shows expected spatial signals with low background for gene transcripts with known spatial expression patterns in Arabidopsis inflorescences and monocot roots. It allows simultaneous detection of three transcripts in 3D. We also show that HCR RNA-FISH can be combined with endogenous fluorescent protein detection and with our improved immunohistochemistry (IHC) protocol.


    The whole mount HCR RNA-FISH and IHC methods allow easy investigation of 3D spatial gene expression patterns in entire plant tissues.

  4. Abstract Background

    Sepsis is a highly heterogeneous syndrome, which has hindered the development of effective therapies. This has prompted investigators to develop a precision medicine approach aimed at identifying biologically homogenous subgroups of patients with septic shock and critical illnesses. Transcriptomic analysis can identify subclasses derived from differences in underlying pathophysiological processes that may provide the basis for new targeted therapies. The goal of this study was to elucidate pathophysiological pathways and identify pediatric septic shock subclasses based on whole blood RNA expression profiles.


    The subjects were critically ill children with cardiopulmonary failure who were a part of a prospective randomized insulin titration trial to treat hyperglycemia. Genome-wide expression profiling was conducted using RNA sequencing from whole blood samples obtained from 46 children with septic shock and 52 mechanically ventilated noninfected controls without shock. Patients with septic shock were allocated to subclasses based on hierarchical clustering of gene expression profiles, and we then compared clinical characteristics, plasma inflammatory markers, cell compositions using GEDIT, and immune repertoires using Imrep between the two subclasses.


    Patients with septic shock depicted alterations in innate and adaptive immune pathways. Among patients with septic shock, we identified two subtypes based on gene expression patterns. Compared with Subclass 2, Subclass 1 was characterized by upregulation of innate immunity pathways and downregulation of adaptive immunity pathways. Subclass 1 had significantly worse clinical outcomes despite the two classes having similar illness severity on initial clinical presentation. Subclass 1 had elevated levels of plasma inflammatory cytokines and endothelial injury biomarkers and demonstrated decreased percentages of CD4 T cells and B cells and less diverse T cell receptor repertoires.


    Two subclasses of pediatric septic shock patients were discovered through genome-wide expression profiling based on whole blood RNA sequencing with major biological and clinical differences.

    Trial Registration: This is a secondary analysis of data generated as part of the observational CAF-PINT ancillary of the HALF-PINT study (NCT01565941). Registered March 29, 2012.

  5. Complex biological tissues consist of numerous cells that act in a highly coordinated manner to carry out various biological functions. Therefore, segmenting a tissue into spatial and functional domains is critically important for understanding and controlling the biological functions. The emerging spatial transcriptomic technologies allow simultaneous measurements of thousands of genes with precise spatial information, providing an unprecedented opportunity for dissecting biological tissues. However, how to utilize such noisy, sparse, and high-dimensional data for tissue segmentation remains a major challenge. Here, we develop a deep learning-based method, named SCAN-IT, by transforming the spatial domain identification problem into an image segmentation problem, with cells mimicking pixels and expression values of genes within a cell representing the color channels. Specifically, SCAN-IT relies on geometric modeling, graph neural networks, and an informatics approach, DeepGraphInfomax. We demonstrate that SCAN-IT can handle datasets from a wide range of spatial transcriptomics techniques, including the ones with high spatial resolution but low gene coverage as well as those with low spatial resolution but high gene coverage. We show that SCAN-IT outperforms state-of-the-art methods using a benchmark dataset with ground truth domain annotations.