Search for: All records

Award ID contains: 1657380


  1. Abstract Motivation

    The traditional view of cancer evolution states that a cancer genome accumulates mutations sequentially over a long period of time. However, in recent years it has been suggested that a cancer genome may instead undergo a one-time catastrophic event, such as chromothripsis, in which a large number of mutations occur simultaneously. A number of potential signatures of chromothripsis have been proposed. In this work, we provide a rigorous formulation and analysis of the ‘ability to walk the derivative chromosome’ signature originally proposed by Korbel and Campbell. In particular, we show that this signature, as originally envisioned, may not always be present in a chromothripsis genome, and we quantify precisely the circumstances under which it would be present. We also propose a variation on this signature, the H/T alternating fraction, which allows us to overcome some of the limitations of the original signature.

    Results

    We apply our measure to both simulated data and a previously analyzed real cancer dataset and find that the H/T alternating fraction may provide a useful signal for distinguishing genomes that acquired mutations simultaneously from those that acquired them sequentially.

    Availability and implementation

    An implementation of the H/T alternating fraction is available at https://bitbucket.org/oesperlab/ht-altfrac.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     
  2. Abstract Motivation

    There has been recent increased interest in using algorithmic methods to infer the evolutionary tree underlying the developmental history of a tumor. Quantitative measures that compare such trees are vital to a number of different applications including benchmarking tree inference methods and evaluating common inheritance patterns across patients. However, few appropriate distance measures exist, and those that do have low resolution for differentiating trees or do not fully account for the complex relationship between tree topology and the inheritance of the mutations labeling that topology.

    Results

    Here we present two novel distance measures, Common Ancestor Set distance (CASet) and Distinctly Inherited Set Comparison distance (DISC), that are specifically designed to account for the subclonal mutation inheritance patterns characteristic of tumor evolutionary trees. We apply CASet and DISC to multiple simulated datasets and two breast cancer datasets and show that our distance measures allow for more nuanced and accurate delineation between tumor evolutionary trees than existing distance measures.

    Availability and implementation

    Implementations of CASet and DISC are freely available at: https://bitbucket.org/oesperlab/stereodist.

    Supplementary information

    Supplementary data are available at Bioinformatics online.
  3. Inspired by recent efforts to model cancer evolution with phylogenetic trees, we consider the problem of finding a consensus tumor evolution tree from a set of conflicting input trees. In contrast to traditional phylogenetic trees, the tumor trees we consider contain features such as mutation labels on internal vertices (in addition to the leaves) and allow multiple mutations to label a single vertex. We describe several distance measures between these tumor trees and present an algorithm, GraPhyC, that solves the consensus problem. Our approach uses a weighted directed graph in which vertices are sets of mutations and each edge is weighted by a function of the number of times a parental relationship is observed between its endpoints' constituent mutations in the set of input trees. We find a minimum weight spanning arborescence in this graph and prove that the resulting tree minimizes the total distance to all input trees for one of our presented distance measures. We evaluate our GraPhyC method using both simulated and real data. On simulated data, we show that our method outperforms a baseline method at finding an appropriate representative tree. Using a set of tumor trees derived from both whole-genome and deep sequencing data from a chronic lymphocytic leukemia patient, we find that our approach identifies a tree that is not included in the set of input trees but that contains characteristics supported by other reported evolutionary reconstructions of this tumor.
  4. A number of methods have recently been proposed to reconstruct the evolutionary history of a tumor from noisy DNA sequencing data. We investigate when and how well these histories can be reconstructed from multi-sample bulk sequencing data when considering only single nucleotide variants (SNVs). We formalize this as the Enumeration Variant Allele Frequency Factorization Problem and provide a novel proof for an upper bound on the number of possible phylogenies consistent with a given dataset. In addition, we propose and assess two methods for increasing the robustness and performance of an existing graph-based phylogenetic inference method. We apply our approaches to noisy simulated data and find that low coverage and high noise make it more difficult to identify phylogenies. We also apply our methods to both chronic lymphocytic leukemia and clear cell renal cell carcinoma datasets.
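The graph construction described in item 3 (count parental relationships across the input trees, weight the edges, then take a minimum weight spanning arborescence) can be illustrated with a small sketch. This is not the GraPhyC implementation. It makes several simplifying assumptions that the abstract does not confirm: every vertex carries a single mutation, all trees share one root label, the edge weight is the hypothetical choice "number of input trees minus the number of trees containing that parent-child pair", and the arborescence is found by exhaustive search, which is only practical for a handful of mutations. The names `parent_counts`, `is_arborescence`, and `consensus_tree` are illustrative.

```python
from collections import Counter
from itertools import product


def parent_counts(trees):
    """Count how often each (parent, child) pair occurs across the input trees.

    Each tree is a dict mapping child -> parent; the root maps to None.
    """
    counts = Counter()
    for tree in trees:
        for child, parent in tree.items():
            if parent is not None:
                counts[(parent, child)] += 1
    return counts


def is_arborescence(parent_of, root):
    """Return True if every vertex reaches the root by following parents."""
    for v in parent_of:
        seen = set()
        while v != root:
            if v in seen:  # a cycle that never reaches the root
                return False
            seen.add(v)
            v = parent_of[v]
    return True


def consensus_tree(trees, root):
    """Brute-force minimum weight spanning arborescence rooted at `root`.

    Assumed weight (not taken from the paper): len(trees) minus the number
    of input trees in which the parent-child pair is observed.
    """
    counts = parent_counts(trees)
    vertices = sorted({v for tree in trees for v in tree})
    non_root = [v for v in vertices if v != root]
    n = len(trees)

    best, best_cost = None, None
    # Try every assignment of a parent to each non-root vertex.
    for choice in product(vertices, repeat=len(non_root)):
        parent_of = dict(zip(non_root, choice))
        if not is_arborescence(parent_of, root):
            continue
        cost = sum(n - counts[(p, c)] for c, p in parent_of.items())
        if best_cost is None or cost < best_cost:
            best, best_cost = parent_of, cost
    return best, best_cost


# Three conflicting toy trees over mutations a, b, c with root "r".
trees = [
    {"r": None, "a": "r", "b": "a", "c": "a"},
    {"r": None, "a": "r", "b": "a", "c": "b"},
    {"r": None, "a": "r", "b": "r", "c": "a"},
]
tree, cost = consensus_tree(trees, "r")
print(tree, cost)
```

For anything beyond toy inputs, a polynomial-time arborescence algorithm such as Chu-Liu/Edmonds would replace the exhaustive search; the brute-force loop is used here only to keep the sketch short and easy to check by hand.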