


Title: Predicting spatially resolved gene expression via tissue morphology using adaptive spatial GNNs
Abstract
Motivation: Spatial transcriptomics technologies, which generate a spatial map of gene activity, can deepen the understanding of tissue architecture and its molecular underpinnings in health and disease. However, the high cost makes these technologies difficult to use in practice. Histological images co-registered with targeted tissues are more affordable and routinely generated in many research and clinical studies. Hence, predicting spatial gene expression from the morphological clues embedded in tissue histological images provides a scalable alternative approach to decoding tissue complexity.
Results: Here, we present a graph neural network based framework to predict the spatial expression of highly expressed genes from tissue histological images. Extensive experiments on two separate breast cancer data cohorts demonstrate that our method improves the prediction performance compared to the state-of-the-art, and that our model can be used to better delineate spatial domains of biological interest.
Availability and implementation: https://github.com/song0309/asGNN/
Award ID(s):
2042159
PAR ID:
10540061
Author(s) / Creator(s):
; ; ; ; ; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Bioinformatics
Volume:
40
Issue:
Supplement_2
ISSN:
1367-4803
Format(s):
Medium: X
Size(s):
p. ii111-ii119
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Histopathological images are used to characterize complex phenotypes such as tumor stage. Our goal is to associate features of stained tissue images with high-dimensional genomic markers. We use convolutional autoencoders and sparse canonical correlation analysis (CCA) on paired histological images and bulk gene expression to identify subsets of genes whose expression levels in a tissue sample correlate with subsets of morphological features from the corresponding sample image. We apply our approach, ImageCCA, to two TCGA data sets, and find gene sets associated with the structure of the extracellular matrix and cell wall infrastructure, implicating uncharacterized genes in extracellular processes. We find sets of genes associated with specific cell types, including neuronal cells and cells of the immune system. We apply ImageCCA to the GTEx v6 data, and find image features that capture population variation in thyroid and in colon tissues associated with genetic variants (image morphology QTLs, or imQTLs), suggesting that genetic variation regulates population variation in tissue morphological traits. 
  2. Abstract Tissue development and disease lead to changes in cellular organization, nuclear morphology, and gene expression, which can be jointly measured by spatial transcriptomic technologies. However, methods for jointly analyzing the different spatial data modalities in 3D are still lacking. We present a computational framework to integrate Spatial Transcriptomic data using over-parameterized graph-based Autoencoders with Chromatin Imaging data (STACI) to identify molecular and functional alterations in tissues. STACI incorporates multiple modalities in a single representation for downstream tasks, enables the prediction of spatial transcriptomic data from nuclear images in unseen tissue sections, and provides built-in batch correction of gene expression and tissue morphology through over-parameterization. We apply STACI to analyze the spatio-temporal progression of Alzheimer’s disease and identify the associated nuclear morphometric and coupled gene expression features. Collectively, we demonstrate the importance of characterizing disease progression by integrating multiple data modalities and its potential for the discovery of disease biomarkers. 
  3. Abstract Background: In the past few years, there has been an explosion in single-cell transcriptomics datasets, yet in vivo confirmation of these datasets is hampered in plants due to lack of robust validation methods. Likewise, modeling of plant development is hampered by paucity of spatial gene expression data. RNA fluorescence in situ hybridization (FISH) enables investigation of gene expression in the context of tissue type. Despite development of FISH methods for plants, easy and reliable whole mount FISH protocols have not yet been reported. Results: We adapt a 3-day whole mount RNA-FISH method for plant species based on a combination of prior protocols that employs hybridization chain reaction (HCR), which amplifies the probe signal in an antibody-free manner. Our whole mount HCR RNA-FISH method shows expected spatial signals with low background for gene transcripts with known spatial expression patterns in Arabidopsis inflorescences and monocot roots. It allows simultaneous detection of three transcripts in 3D. We also show that HCR RNA-FISH can be combined with endogenous fluorescent protein detection and with our improved immunohistochemistry (IHC) protocol. Conclusions: The whole mount HCR RNA-FISH and IHC methods allow easy investigation of 3D spatial gene expression patterns in entire plant tissues.
  4. Abstract Recent technologies such as spatial transcriptomics enable the measurement of gene expressions at the single-cell level along with the spatial locations of these cells in the tissue. Spatial clustering of the cells provides valuable insights into the understanding of the functional organization of the tissue. However, most such clustering methods involve some dimension reduction that leads to a loss of the inherent dependency structure among genes at any spatial location in the tissue. This destroys valuable insights of gene co-expression patterns apart from possibly impacting spatial clustering performance. In spatial transcriptomics, the matrix-variate gene expression data, along with spatial coordinates of the single cells, provides information on both gene expression dependencies and cell spatial dependencies through its row and column covariances. In this work, we propose a joint Bayesian approach to simultaneously estimate these gene and spatial cell correlations. These estimates provide data summaries for downstream analyses. We illustrate our method with simulations and analysis of several real spatial transcriptomic datasets. Our work elucidates gene co-expression networks as well as clear spatial clustering patterns of the cells. Furthermore, our analysis reveals that downstream spatial-differential analysis may aid in the discovery of unknown cell types from known marker genes.
  5. Abstract Background: Genomic safe harbors are regions of the genome that can maintain transgene expression without disrupting the function of host cells. Genomic safe harbors play an increasingly important role in improving the efficiency and safety of genome engineering. However, limited safe harbors have been identified. Results: Here, we develop a framework to facilitate searches for genomic safe harbors by integrating information from polymorphic mobile element insertions that naturally occur in human populations, epigenomic signatures, and 3D chromatin organization. By applying our framework to polymorphic mobile element insertions identified in the 1000 Genomes project and the Genotype-Tissue Expression (GTEx) project, we identify 19 candidate safe harbors in blood cells and 5 in brain cells. For three candidate sites in blood, we demonstrate the stable expression of transgene without disrupting nearby genes in host erythroid cells. We also develop a computer program, Genomics and Epigenetic Guided Safe Harbor mapper (GEG-SH mapper), for knowledge-based tissue-specific genomic safe harbor selection. Conclusions: Our study provides a new knowledge-based framework to identify tissue-specific genomic safe harbors. In combination with the fast-growing genome engineering technologies, our approach has the potential to improve the overall safety and efficiency of gene and cell-based therapy in the near future.