Title: A new method for constructing tumor specific gene co-expression networks based on samples with tumor purity heterogeneity
Abstract

Motivation

Tumor tissue samples often contain an unknown fraction of stromal cells. This problem, widely known as tumor purity heterogeneity (TPH), was recently recognized as a severe issue in omics studies. Specifically, if TPH is ignored when inferring co-expression networks, edges are likely to be estimated among genes with a mean shift between non-tumor and tumor cells rather than among gene pairs interacting with each other in tumor cells. To address this issue, we propose Tumor Specific Net (TSNet), a new method which constructs tumor-cell specific gene/protein co-expression networks based on gene/protein expression profiles of tumor tissues. TSNet treats the observed expression profile as a mixture of expressions from different cell types and explicitly models the tumor purity percentage in each tumor sample.
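The effect described above can be illustrated with a small simulation. The sketch below is not the TSNet model: it assumes purity is known for each sample and uses plain regression adjustment as a stand-in, only to show how a mean shift between cell types can create an apparent co-expression edge. All names (`simulate_mixture`, `residualize`) and parameter values are hypothetical.

```python
import numpy as np


def simulate_mixture(n_samples=2000, seed=0):
    """Simulate two genes that are independent within each cell type.

    Assumption for illustration: both genes have a higher mean in tumor
    cells than in stromal cells, and purity varies across samples.
    """
    rng = np.random.default_rng(seed)
    purity = rng.uniform(0.2, 0.9, size=n_samples)
    tumor = 5.0 + rng.normal(size=(n_samples, 2))   # tumor-cell expression
    stroma = 0.0 + rng.normal(size=(n_samples, 2))  # stromal-cell expression
    # Observed bulk expression is a purity-weighted mixture of the two.
    observed = purity[:, None] * tumor + (1.0 - purity[:, None]) * stroma
    return purity, observed


def residualize(observed, purity):
    """Regress each gene on purity (with intercept) and return residuals."""
    design = np.column_stack([np.ones_like(purity), purity])
    beta, *_ = np.linalg.lstsq(design, observed, rcond=None)
    return observed - design @ beta


purity, observed = simulate_mixture()
# Correlation computed directly on the bulk profiles (TPH ignored).
naive_corr = np.corrcoef(observed.T)[0, 1]
# Correlation after removing the purity-driven mean shift.
adjusted_corr = np.corrcoef(residualize(observed, purity).T)[0, 1]
print(f"naive: {naive_corr:.2f}, purity-adjusted: {adjusted_corr:.2f}")
```

Under these assumptions the naive correlation is clearly positive even though the two genes never interact, while the purity-adjusted correlation is close to zero.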

Results

Using extensive synthetic data experiments, we demonstrate that TSNet outperforms a standard graphical model which does not account for TPH. We then apply TSNet to estimate tumor specific gene co-expression networks based on TCGA ovarian cancer RNAseq data. We identify novel co-expression modules and hub structure specific to tumor cells.

Availability and implementation

R code is available at https://github.com/petraf01/TSNet.

Supplementary information

Supplementary data are available at Bioinformatics online.

 
NSF-PAR ID:
10413893
Publisher / Repository:
Oxford University Press
Journal Name:
Bioinformatics
Volume:
34
Issue:
13
ISSN:
1367-4803
Page Range / eLocation ID:
p. i528-i536
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Motivation

    The analysis of spatially resolved transcriptomes enables the understanding of the spatial interactions between the cellular environment and transcriptional regulation. In particular, the characterization of gene–gene co-expression at distinct spatial locations or cell types in the tissue enables delineation of spatial co-regulatory patterns, as opposed to standard differential single-gene analyses. To enhance the ability and potential of spatial transcriptomics technologies to drive biological discovery, we develop a statistical framework to detect gene co-expression patterns in a spatially structured tissue consisting of different clusters in the form of cell classes or tissue domains.

    Results

    We develop SpaceX (spatially dependent gene co-expression network), a Bayesian methodology to identify both shared and cluster-specific co-expression networks across genes. SpaceX uses an over-dispersed spatial Poisson model coupled with a high-dimensional factor model, which is based on a dimension reduction technique for computational efficiency. We show via simulations that accounting for (increasing) spatial correlation and appropriate noise distributions yields accuracy gains in co-expression network estimation and structure. In-depth analysis of two spatial transcriptomics datasets, from mouse hypothalamus and human breast cancer, using SpaceX detected multiple hub genes related to cognitive abilities in the hypothalamus data and multiple cancer genes (e.g. collagen family) from the tumor region in the breast cancer data.

    Availability and implementation

    The SpaceX R-package is available at github.com/bayesrx/SpaceX.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     
  2. Abstract

    Motivation

    Gene regulatory networks define regulatory relationships between transcription factors and target genes within a biological system, and reconstructing them is essential for understanding cellular growth and function. Methods for inferring and reconstructing networks from genomics data have evolved rapidly over the last decade in response to advances in sequencing technology and machine learning. The scale of data collection has increased dramatically; the largest genome-wide gene expression datasets have grown from thousands of measurements to millions of single cells, and new technologies are on the horizon to increase to tens of millions of cells and above.

    Results

    In this work, we present the Inferelator 3.0, which has been significantly updated to integrate data from distinct cell types to learn context-specific regulatory networks and aggregate them into a shared regulatory network, while retaining the functionality of the previous versions. The Inferelator is able to integrate the largest single-cell datasets and learn cell-type-specific gene regulatory networks. Compared to other network inference methods, the Inferelator learns new and informative Saccharomyces cerevisiae networks from single-cell gene expression data, measured by recovery of a known gold standard. We demonstrate its scaling capabilities by learning networks for multiple distinct neuronal and glial cell types in the developing Mus musculus brain at E18 from a large (1.3 million) single-cell gene expression dataset with paired single-cell chromatin accessibility data.

    Availability and implementation

    The Inferelator software is available on GitHub (https://github.com/flatironinstitute/inferelator) under the MIT license and has been released as Python packages with associated documentation (https://inferelator.readthedocs.io/).

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     
  3. Abstract

    Intercellular mechanisms by which the stromal microenvironment contributes to solid tumor progression and targeted therapy resistance remain poorly understood, presenting significant clinical hurdles. PEAK1 (Pseudopodium-Enriched Atypical Kinase One) is an actin cytoskeleton- and focal adhesion-associated pseudokinase that promotes cell state plasticity and cancer metastasis by mediating growth factor-integrin signaling crosstalk. Here, we determined that stromal PEAK1 expression predicts poor outcomes in HER2-positive breast cancers high in SNAI2 expression and enriched for MSC content. Specifically, we identified that the fibroblastic stroma in HER2-positive breast cancer patient tissue stains positive for both nuclear SNAI2 and cytoplasmic PEAK1. Furthermore, mesenchymal stem cells (MSCs) and cancer-associated fibroblasts (CAFs) express high PEAK1 protein levels and potentiate tumorigenesis, lapatinib resistance and metastasis of HER2-positive breast cancer cells in a PEAK1-dependent manner. Analysis of PEAK1-dependent secreted factors from MSCs revealed INHBA/activin-A as a necessary factor in the conditioned media of PEAK1-expressing MSCs that promotes lapatinib resistance. Single-cell CycIF analysis of MSC-breast cancer cell co-cultures identified enrichment of p-Akthigh/p-gH2AXlow, MCL1high/p-gH2AXlow and GRP78high/VIMhigh breast cancer cell subpopulations by the presence of PEAK1-expressing MSCs and lapatinib treatment. Bioinformatic analyses on a PEAK1-centric stroma-tumor cell gene set and follow-up immunostaining of co-cultures predict targeting antiapoptotic and stress pathways as a means to improve targeted therapy responses and patient outcomes in HER2-positive breast cancer and other stroma-rich malignancies. These data provide the first evidence that PEAK1 promotes tumorigenic phenotypes through a previously unrecognized SNAI2-PEAK1-INHBA stromal cell axis.

     
  4. Abstract

    Background

    Enhancer of zeste homolog 2 (EZH2) catalyzes the trimethylation of histone H3 at lysine 27 via the polycomb repressive complex 2 (PRC2) and plays a time‐specific role in normal fetal liver development. EZH2 is overexpressed in hepatoblastoma (HB), an embryonal tumor. EZH2 can also promote tumorigenesis through a noncanonical, PRC2‐independent mechanism involving direct, proto‐oncogenic protein interactions, including with β‐catenin. We hypothesize that the pathological activation of EZH2 contributes to HB propagation in a PRC2‐independent manner.

    Methods and results

    We demonstrate that EZH2 promotes proliferation in HB tumor‐derived cell lines through interaction with β‐catenin. Although aberrant EZH2 expression occurs, we determine that both canonical and noncanonical EZH2 signaling occurs based on specific gene‐expression patterns and interaction with SUZ12, a PRC2 component, and β‐catenin. Silencing and inhibition of EZH2 reduce primary HB cell proliferation.

    Conclusions

    EZH2 overexpression promotes HB cell proliferation, with both canonical and noncanonical function detected. However, because EZH2 directly interacts with β‐catenin in human tumors and EZH2 overexpression is not equal to SUZ12, it seems that a noncanonical mechanism is contributing to HB pathogenesis. Further mechanistic studies are necessary to elucidate potential pathogenic downstream mechanisms and translational potential of EZH2 inhibitors for the treatment of HB.

     
  5. Abstract

    Motivation

    Gene regulatory networks (GRNs) in a cell provide the tight feedback needed to synchronize cell actions. However, genes in a cell also take input from, and provide signals to other neighboring cells. These cell–cell interactions (CCIs) and the GRNs deeply influence each other. Many computational methods have been developed for GRN inference in cells. More recently, methods were proposed to infer CCIs using single cell gene expression data with or without cell spatial location information. However, in reality, the two processes do not exist in isolation and are subject to spatial constraints. Despite this rationale, no methods currently exist to infer GRNs and CCIs using the same model.

    Results

    We propose CLARIFY, a tool that takes GRNs as input, uses them and spatially resolved gene expression data to infer CCIs, while simultaneously outputting refined cell-specific GRNs. CLARIFY uses a novel multi-level graph autoencoder, which mimics cellular networks at a higher level and cell-specific GRNs at a deeper level. We applied CLARIFY to two real spatial transcriptomic datasets, one using seqFISH and the other using MERFISH, and also tested on simulated datasets from scMultiSim. We compared the quality of predicted GRNs and CCIs with state-of-the-art baseline methods that inferred either only GRNs or only CCIs. The results show that CLARIFY consistently outperforms the baseline in terms of commonly used evaluation metrics. Our results point to the importance of co-inference of CCIs and GRNs and to the use of layered graph neural networks as an inference tool for biological networks.

    Availability and implementation

    The source code and data are available at https://github.com/MihirBafna/CLARIFY.

     