skip to main content


Title: SpaceX: gene co-expression network estimation for spatial transcriptomics
Abstract Motivation

The analysis of spatially resolved transcriptome enables the understanding of the spatial interactions between the cellular environment and transcriptional regulation. In particular, the characterization of the gene–gene co-expression at distinct spatial locations or cell types in the tissue enables delineation of spatial co-regulatory patterns as opposed to standard differential single gene analyses. To enhance the ability and potential of spatial transcriptomics technologies to drive biological discovery, we develop a statistical framework to detect gene co-expression patterns in a spatially structured tissue consisting of different clusters in the form of cell classes or tissue domains.

Results

We develop SpaceX (spatially dependent gene co-expression network), a Bayesian methodology to identify both shared and cluster-specific co-expression network across genes. SpaceX uses an over-dispersed spatial Poisson model coupled with a high-dimensional factor model which is based on a dimension reduction technique for computational efficiency. We show via simulations, accuracy gains in co-expression network estimation and structure by accounting for (increasing) spatial correlation and appropriate noise distributions. In-depth analysis of two spatial transcriptomics datasets in mouse hypothalamus and human breast cancer using SpaceX, detected multiple hub genes which are related to cognitive abilities for the hypothalamus data and multiple cancer genes (e.g. collagen family) from the tumor region for the breast cancer data.

Availability and implementation

The SpaceX R-package is available at github.com/bayesrx/SpaceX.

Supplementary information

Supplementary data are available at Bioinformatics online.

 
more » « less
PAR ID:
10380259
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Bioinformatics
Volume:
38
Issue:
22
ISSN:
1367-4803
Page Range / eLocation ID:
p. 5033-5041
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Martelli, Pier Luigi (Ed.)
    Abstract Motivation Clustering spatial-resolved gene expression is an essential analysis to reveal gene activities in the underlying morphological context by their functional roles. However, conventional clustering analysis does not consider gene expression co-localizations in tissue for detecting spatial expression patterns or functional relationships among the genes for biological interpretation in the spatial context. In this article, we present a convolutional neural network (CNN) regularized by the graph of protein–protein interaction (PPI) network to cluster spatially resolved gene expression. This method improves the coherence of spatial patterns and provides biological interpretation of the gene clusters in the spatial context by exploiting the spatial localization by convolution and gene functional relationships by graph-Laplacian regularization. Results In this study, we tested clustering the spatially variable genes or all expressed genes in the transcriptome in 22 Visium spatial transcriptomics datasets of different tissue sections publicly available from 10× Genomics and spatialLIBD. The results demonstrate that the PPI-regularized CNN constantly detects gene clusters with coherent spatial patterns and significantly enriched by gene functions with the state-of-the-art performance. Additional case studies on mouse kidney tissue and human breast cancer tissue suggest that the PPI-regularized CNN also detects spatially co-expressed genes to define the corresponding morphological context in the tissue with valuable insights. Availability and implementation Source code is available at https://github.com/kuanglab/CNN-PReg. Supplementary information Supplementary data are available at Bioinformatics online. 
    more » « less
  2. Abstract Motivation

    Spatial transcriptomics technologies, which generate a spatial map of gene activity, can deepen the understanding of tissue architecture and its molecular underpinnings in health and disease. However, the high cost makes these technologies difficult to use in practice. Histological images co-registered with targeted tissues are more affordable and routinely generated in many research and clinical studies. Hence, predicting spatial gene expression from the morphological clues embedded in tissue histological images provides a scalable alternative approach to decoding tissue complexity.

    Results

    Here, we present a graph neural network based framework to predict the spatial expression of highly expressed genes from tissue histological images. Extensive experiments on two separate breast cancer data cohorts demonstrate that our method improves the prediction performance compared to the state-of-the-art, and that our model can be used to better delineate spatial domains of biological interest.

    Availability and implementation

    https://github.com/song0309/asGNN/

     
    more » « less
  3. Abstract

    Spatial transcriptomics data play a crucial role in cancer research, providing a nuanced understanding of the spatial organization of gene expression within tumor tissues. Unraveling the spatial dynamics of gene expression can unveil key insights into tumor heterogeneity and aid in identifying potential therapeutic targets. However, in many large-scale cancer studies, spatial transcriptomics data are limited, with bulk RNA-seq and corresponding Whole Slide Image (WSI) data being more common (e.g. TCGA project). To address this gap, there is a critical need to develop methodologies that can estimate gene expression at near-cell (spot) level resolution from existing WSI and bulk RNA-seq data. This approach is essential for reanalyzing expansive cohort studies and uncovering novel biomarkers that have been overlooked in the initial assessments. In this study, we present STGAT (Spatial Transcriptomics Graph Attention Network), a novel approach leveraging Graph Attention Networks (GAT) to discern spatial dependencies among spots. Trained on spatial transcriptomics data, STGAT is designed to estimate gene expression profiles at spot-level resolution and predict whether each spot represents tumor or non-tumor tissue, especially in patient samples where only WSI and bulk RNA-seq data are available. Comprehensive tests on two breast cancer spatial transcriptomics datasets demonstrated that STGAT outperformed existing methods in accurately predicting gene expression. Further analyses using the TCGA breast cancer dataset revealed that gene expression estimated from tumor-only spots (predicted by STGAT) provides more accurate molecular signatures for breast cancer sub-type and tumor stage prediction, and also leading to improved patient survival and disease-free analysis. Availability: Code is available at https://github.com/compbiolabucf/STGAT.

     
    more » « less
  4. Background

    Gene co‐expression and differential co‐expression analysis has been increasingly used to study co‐functional and co‐regulatory biological mechanisms from large scale transcriptomics data sets.

    Methods

    In this study, we develop a nonparametric approach to identify hub genes and modules in a large co‐expression network with low computational and memory cost, namely MRHCA.

    Results

    We have applied the method to simulated transcriptomics data sets and demonstrated MRHCA can accurately identify hub genes and estimate size of co‐expression modules. With applying MRHCA and differential co‐expression analysis toE. coliand TCGA cancer data, we have identified significant condition specific activated genes inE. coliand distinct gene expression regulatory mechanisms between the cancer types with high copy number variation and small somatic mutations.

    Conclusion

    Our analysis has demonstrated MRHCA can (i) deal with large association networks, (ii) rigorously assess statistical significance for hubs and module sizes, (iii) identify co‐expression modules with low associations, (iv) detect small and significant modules, and (v) allow genes to be present in more than one modules, compared with existing methods.

     
    more » « less
  5. Abstract

    Recent technologies such asspatial transcriptomics, enable the measurement of gene expressions at the single-cell level along with the spatial locations of these cells in the tissue. Spatial clustering of the cells provides valuable insights into the understanding of the functional organization of the tissue. However, most such clustering methods involve some dimension reduction that leads to a loss of the inherent dependency structure among genes at any spatial location in the tissue. This destroys valuable insights of gene co-expression patterns apart from possibly impacting spatial clustering performance. In spatial transcriptomics, the matrix-variate gene expression data, along with spatial coordinates of the single cells, provides information on both gene expression dependencies and cell spatial dependencies through its row and column covariances. In this work, we propose a joint Bayesian approach to simultaneously estimate these gene and spatial cell correlations. These estimates provide data summaries for downstream analyses. We illustrate our method with simulations and analysis of several real spatial transcriptomic datasets. Our work elucidates gene co-expression networks as well as clear spatial clustering patterns of the cells. Furthermore, our analysis reveals that downstream spatial-differential analysis may aid in the discovery of unknown cell types from known marker genes.

     
    more » « less