Title: ClonArch: visualizing the spatial clonal architecture of tumors
Abstract
Motivation: Cancer is caused by the accumulation of somatic mutations that lead to the formation of distinct populations of cells, called clones. The resulting clonal architecture is the main cause of relapse and resistance to treatment. With decreasing costs of DNA sequencing, rich cancer genomics datasets with many spatial sequencing samples are becoming increasingly available, enabling the inference of high-resolution tumor clones and their prevalences across different spatial coordinates. While temporal and phylogenetic aspects of tumor evolution, such as clonal evolution over time and clonal response to treatment, are commonly visualized in various clonal evolution diagrams, visual analytics methods that reveal the spatial clonal architecture are missing.
Results: This article introduces ClonArch, a web-based tool to interactively visualize the phylogenetic tree and spatial distribution of clones in a single tumor mass. ClonArch uses the marching squares algorithm to draw closed boundaries representing the presence of clones in a real or simulated tumor. ClonArch enables researchers to examine the spatial clonal architecture of a subset of relevant mutations at different prevalence thresholds and across multiple phylogenetic trees. In addition to simulated tumors with varying numbers of biopsies, we demonstrate the use of ClonArch on a hepatocellular carcinoma tumor with ∼280 sequencing biopsies. ClonArch provides an automated way to interactively examine the spatial clonal architecture of a tumor, facilitating clinical and biological interpretations of the spatial aspects of intra-tumor heterogeneity.
Availability and implementation: https://github.com/elkebir-group/ClonArch.
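The abstract names marching squares as the technique for drawing clone boundaries at a prevalence threshold. As a rough illustration of that general algorithm, and not of ClonArch's actual code, the following Python sketch extracts boundary segments from a small grid of values. The grid layout, the function names `interpolate` and `marching_squares`, and the rule used for ambiguous saddle cells are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def interpolate(p: Point, q: Point, vp: float, vq: float, level: float) -> Point:
    """Return the point on edge p-q where the linearly interpolated value equals level."""
    frac = (level - vp) / (vq - vp)  # vp != vq because the edge crosses the level
    return (p[0] + frac * (q[0] - p[0]), p[1] + frac * (q[1] - p[1]))


def marching_squares(grid: Sequence[Sequence[float]], level: float) -> List[Segment]:
    """Extract line segments separating grid points with value >= level from the rest.

    Points are (x, y) = (column, row). The grid is a hypothetical stand-in for
    clone prevalence sampled on a regular lattice.
    """
    segments: List[Segment] = []
    for r in range(len(grid) - 1):
        for c in range(len(grid[0]) - 1):
            # Cell corners in cyclic order: top-left, top-right, bottom-right, bottom-left.
            corners = [(c, r), (c + 1, r), (c + 1, r + 1), (c, r + 1)]
            values = [grid[r][c], grid[r][c + 1], grid[r + 1][c + 1], grid[r + 1][c]]
            inside = [v >= level for v in values]

            # Edge i joins corner i to corner (i + 1) % 4: top, right, bottom, left.
            crossings = {}
            for i in range(4):
                j = (i + 1) % 4
                if inside[i] != inside[j]:
                    crossings[i] = interpolate(
                        corners[i], corners[j], values[i], values[j], level
                    )

            if len(crossings) == 2:
                a, b = crossings.values()
                segments.append((a, b))
            elif len(crossings) == 4:
                # Saddle cell: two diagonal corners are inside. Assumed rule for
                # this sketch: cut off each inside corner separately.
                pairs = [(0, 3), (1, 2)] if inside[0] else [(0, 1), (2, 3)]
                for a, b in pairs:
                    segments.append((crossings[a], crossings[b]))
    return segments


if __name__ == "__main__":
    demo = [
        [0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0],
    ]
    for segment in marching_squares(demo, 0.5):
        print(segment)
```

On the demo grid, a single high-prevalence point at the center with a threshold of 0.5 yields four segments that join into one closed diamond around that point. Raising or lowering `level` shrinks or grows the enclosed region, which is the behavior one would expect when varying a prevalence threshold.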
Award ID(s): 1850502
PAR ID: 10289252
Journal Name: Bioinformatics
Volume: 36
Issue: Supplement_1
ISSN: 1367-4803
Page Range / eLocation ID: i161 to i168
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract. Background: Every tumor is composed of heterogeneous clones, each corresponding to a distinct subpopulation of cells that accumulated different types of somatic mutations, ranging from single-nucleotide variants (SNVs) to copy-number aberrations (CNAs). As the analysis of this intra-tumor heterogeneity has important clinical applications, several computational methods have been introduced to identify clones from DNA sequencing data. However, due to technological and methodological limitations, current analyses are restricted to identifying tumor clones based on either SNVs or CNAs alone, preventing a comprehensive characterization of a tumor's clonal composition. Results: To overcome these challenges, we formulate the identification of clones in terms of both SNVs and CNAs as an integration problem while accounting for uncertainty in the input SNV and CNA proportions. We characterize the computational complexity of this problem and introduce PACTION (PArsimonious Clone Tree integratION), an algorithm that solves the problem using a mixed integer linear programming formulation. On simulated data, we show that tumor clones can be identified reliably, especially when further taking into account the ancestral relationships that can be inferred from the input SNVs and CNAs. On 49 tumor samples from 10 prostate cancer patients, our integration approach provides a higher-resolution view of tumor evolution than previous studies. Conclusion: PACTION is an accurate and fast method that reconstructs the clonal architecture of tumors by integrating SNV and CNA clones inferred using existing methods.
  2. Cancer results from an evolutionary process that typically yields multiple clones with varying sets of mutations within the same tumor. Accurately modeling this process is key to understanding and predicting cancer evolution. Here, we introduce clone to mutation (CloMu), a flexible and low-parameter tree generative model of cancer evolution. CloMu uses a two-layer neural network trained via reinforcement learning to determine the probability of new mutations based on the existing mutations on a clone. CloMu supports several prediction tasks, including the determination of evolutionary trajectories, tree selection, causality and interchangeability between mutations, and mutation fitness. Importantly, previous methods support only some of these tasks, and many suffer from overfitting on data sets with a large number of mutations. Using simulations, we show that CloMu either matches or outperforms current methods on a wide variety of prediction tasks. In particular, for simulated data with interchangeable mutations, current methods are unable to uncover causal relationships as effectively as CloMu. On breast cancer and leukemia cohorts, we show that CloMu determines similarities and causal relationships between mutations as well as the fitness of mutations. We validate CloMu's inferred mutation fitness values for the leukemia cohort by comparing them to clonal proportion data not used during training, showing high concordance. In summary, CloMu's low-parameter model facilitates a wide range of prediction tasks regarding cancer evolution on increasingly available cohort-level data sets. 
  3. Abstract. Motivation: Clinical sequencing aims to identify somatic mutations in cancer cells for accurate diagnosis and treatment. However, most widely used clinical assays lack patient-matched control DNA, and additional analysis is needed to distinguish somatic variants from unfiltered germline variants. Such computational analyses require accurate assessment of tumor cell content in individual specimens. Histological estimates often disagree with results from computational methods, which are primarily designed for normal-tumor matched data and can be confounded by genomic heterogeneity and the presence of sub-clonal mutations. Methods: All-FIT is an iterative weighted least squares method to estimate specimen tumor purity based on the allele frequencies of variants detected in high-depth, targeted, clinical sequencing data. Results: Using simulated and clinical data, we demonstrate All-FIT's accuracy and improved performance against leading computational approaches, highlighting the importance of interpreting purity estimates based on the expected biology of tumors. Availability and implementation: Freely available at http://software.khiabanian-lab.org. Supplementary information: Supplementary data are available at Bioinformatics online.
  4. Inspired by recent efforts to model cancer evolution with phylogenetic trees, we consider the problem of finding a consensus tumor evolution tree from a set of conflicting input trees. In contrast to traditional phylogenetic trees, the tumor trees we consider contain features such as mutation labels on internal vertices (in addition to the leaves) and allow multiple mutations to label a single vertex. We describe several distance measures between these tumor trees and present an algorithm, GraPhyC, to solve the consensus problem. Our approach uses a weighted directed graph where vertices are sets of mutations and edges are weighted using a function that depends on the number of times a parental relationship is observed between their constituent mutations in the set of input trees. We find a minimum-weight spanning arborescence in this graph and prove that the resulting tree minimizes the total distance to all input trees for one of our presented distance measures. We evaluate GraPhyC using both simulated and real data. On simulated data, we show that our method outperforms a baseline method at finding an appropriate representative tree. Using a set of tumor trees derived from both whole-genome and deep sequencing data from a chronic lymphocytic leukemia patient, we find that our approach identifies a tree that is not included in the set of input trees but that contains characteristics supported by other reported evolutionary reconstructions of this tumor.
  5. With the advent of single-cell DNA sequencing, it is now possible to infer the evolutionary history of thousands of tumor cells obtained from a single patient. This evolutionary history, which takes the shape of a tree, reveals the mode of evolution of the specific cancer under study and, in turn, helps with clinical diagnosis, prognosis, and therapeutic treatment. In this study we focus on the question of determining the mode of evolution of tumor cells from their inferred evolutionary history. In particular, we employ recursive neural networks that capture tree structures to classify the evolutionary history of tumor cells into one of four modes—linear, branching, neutral, and punctuated. We trained our model, MoTERNN, using simulated data in a supervised fashion and applied it to a real phylogenetic tree obtained from single-cell DNA sequencing data. MoTERNN is implemented in Python and is publicly available at https://github.com/NakhlehLab/MoTERNN. 