Title: A weighted exact test for mutually exclusive mutations in cancer
Abstract Motivation

The somatic mutations in the pathways that drive cancer development tend to be mutually exclusive across tumors, providing a signal for distinguishing driver mutations from a larger number of random passenger mutations. This mutual exclusivity signal can be confounded by high and highly variable mutation rates across a cohort of samples. Current statistical tests for exclusivity that incorporate both per-gene and per-sample mutational frequencies are computationally expensive and have limited precision.

Results

We formulate a weighted exact test for assessing the significance of mutual exclusivity in an arbitrary number of mutational events. Our test conditions on the number of samples with a mutation as well as per-event, per-sample mutation probabilities. We provide a recursive formula to compute P-values for the weighted test exactly as well as a highly accurate and efficient saddlepoint approximation of the test. We use our test to approximate a commonly used permutation test for exclusivity that conditions on per-event, per-sample mutation frequencies. However, our test is more efficient and it recovers more significant results than the permutation test. We use our Weighted Exclusivity Test (WExT) software to analyze hundreds of colorectal and endometrial samples from The Cancer Genome Atlas, which are two cancer types that often have extremely high mutation rates. On both cancer types, the weighted test identifies sets of mutually exclusive mutations in cancer genes with fewer false positives than earlier approaches.
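
To make the role of the per-event, per-sample mutation probabilities concrete, the following is a minimal sketch of a simplified exclusivity tail probability. It is not the recursive formula or the saddlepoint approximation described above, and it is not taken from the WExT software: it assumes that events are independent within each sample and it omits the conditioning on the number of mutated samples per event. The names `exclusive_prob`, `exclusivity_tail`, `W` and `t_obs` are hypothetical. Here `W[i][j]` is the assumed probability that event `i` is mutated in sample `j`, and the statistic is the number of samples with exactly one of the events mutated.

```python
from typing import Sequence


def exclusive_prob(weights: Sequence[float]) -> float:
    """Probability that exactly one of several independent events occurs."""
    total = 0.0
    for i, w_i in enumerate(weights):
        term = w_i
        for l, w_l in enumerate(weights):
            if l != i:
                term *= 1.0 - w_l  # every other event must be absent
        total += term
    return total


def exclusivity_tail(W: Sequence[Sequence[float]], t_obs: int) -> float:
    """P(T >= t_obs), where T counts samples with exactly one mutated event.

    Simplifying assumption (not the paper's test): samples are independent
    and mutation counts per event are NOT conditioned on.
    """
    n_samples = len(W[0])
    # dist[t] = probability that t of the samples seen so far are exclusive
    dist = [1.0]
    for j in range(n_samples):
        p = exclusive_prob([row[j] for row in W])
        new = [0.0] * (len(dist) + 1)
        for t, mass in enumerate(dist):
            new[t] += mass * (1.0 - p)  # sample j is not exclusive
            new[t + 1] += mass * p      # sample j is exclusive
        dist = new
    return sum(dist[max(t_obs, 0):])


# Two events, two samples, every mutation probability 0.5 (made-up values).
W = [[0.5, 0.5], [0.5, 0.5]]
print(exclusivity_tail(W, 2))  # 0.25
```

Because each sample carries its own probability, a hypermutated sample with large weights contributes less surprise to an exclusive pattern than a sample with low weights, which is the intuition behind weighting the test.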

Availability and Implementation

See http://compbio.cs.brown.edu/projects/wext for software.

Contact

braphael@cs.brown.edu

Supplementary information

Supplementary data are available at Bioinformatics online.

 
NSF-PAR ID:
10394828
Author(s) / Creator(s):
; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Bioinformatics
Volume:
32
Issue:
17
ISSN:
1367-4803
Page Range / eLocation ID:
p. i736-i745
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Mutual exclusivity of cancer-driving mutations is frequently observed in the mutational landscape of cancer. The long tail of rare mutations complicates the discovery of mutually exclusive driver modules. Existing methods often identify modules in which only a few genes cover most of the cancer samples. To overcome this hurdle, UniCovEx, an efficient method for identifying mutually exclusive driver modules with balanced exclusive coverage, is presented. UniCovEx first searches for candidate driver modules with a strong topological relationship in signaling networks using a greedy strategy. It then evaluates the candidate modules by considering their coverage, exclusivity, and balance of coverage, using a novel metric termed exclusive entropy of modules, which measures how balanced the modules are. Finally, UniCovEx predicts sample‐specific driver modules by solving a minimum set cover problem using a greedy strategy. When tested on 12 datasets of different cancer types from The Cancer Genome Atlas, UniCovEx performs significantly better than previous methods. The software is available at: https://sourceforge.net/projects/cancer‐pathway/files/.

     
  2. Abstract

    Motivation: Cancer is the process of accumulating genetic alterations that confer selective advantages to tumor cells. The order in which aberrations occur is not arbitrary, and inferring the order of events is challenging due to the lack of longitudinal samples from tumors. Moreover, a network model of oncogenesis should capture biological facts such as distinct progression trajectories of cancer subtypes and patterns of mutual exclusivity of alterations in the same pathways.

    In this paper, we present the disjunctive Bayesian network (DBN), a novel oncogenetic model with a phylogenetic interpretation. DBN is expressive enough to capture cancer subtypes' trajectories and mutually exclusive relations between alterations from unstratified data.

    Results: In cases where the number of studied alterations is small (), we provide an efficient dynamic programming implementation of an exact structure learning method that finds a best DBN in the superexponential search space of networks. For the rare cases in which the number of alterations is large, we provide an efficient genetic algorithm in our software package, OncoBN. Through numerous experiments on synthetic and real data, we show OncoBN's ability to infer ground truth networks and recover biologically meaningful progression networks.

    Availability: OncoBN is implemented in R and is available at https://github.com/phillipnicol/OncoBN.

     
  3. Abstract Background/objectives

    While outcomes for pediatric T‐cell acute lymphoblastic leukemia (T‐ALL) are favorable, there are few widely accepted prognostic factors, limiting the ability to risk stratify therapy.

    Design/methods

    Dana‐Farber Cancer Institute (DFCI) Protocols 05‐001 and 11‐001 enrolled pediatric patients with newly diagnosed B‐ or T‐ALL from 2005 to 2011 and from 2012 to 2015, respectively. Protocol therapy was nearly identical for patients with T‐ALL (N = 123), who were all initially assigned to the high‐risk arm. End‐induction minimal residual disease (MRD) was assessed by reverse transcription polymerase chain reaction (RT‐PCR) or next‐generation sequencing (NGS), but was not used to modify postinduction therapy. Early T‐cell precursor (ETP) status was determined by flow cytometry. Cases with sufficient diagnostic DNA were retrospectively evaluated by targeted NGS of known genetic drivers of T‐ALL, including Notch, PI3K, and Ras pathway genes.

    Results

    The 5‐year event‐free survival (EFS) and overall survival (OS) for patients with T‐ALL were 81% (95% CI, 73‐87%) and 90% (95% CI, 83‐94%), respectively. ETP phenotype was associated with failure to achieve complete remission, but not with inferior OS. Low end‐induction MRD (<10⁻⁴) was associated with superior disease‐free survival (DFS). Pathogenic mutations of the PI3K pathway were mutually exclusive of ETP phenotype and were associated with inferior 5‐year DFS and OS.

    Conclusions

    Together, our findings demonstrate that ETP phenotype, end‐induction MRD, and PI3K pathway mutation status are prognostically relevant in pediatric T‐ALL and should be considered for risk classification in future trials. DFCI Protocols 05‐001 and 11‐001 are registered at www.clinicaltrials.gov as NCT00165087 and NCT01574274, respectively.

     
  4. Abstract

    A major challenge in cancer genomics is to identify genes with functional roles in cancer and uncover their mechanisms of action. We introduce an integrative framework that identifies cancer-relevant genes by pinpointing those whose interaction or other functional sites are enriched in somatic mutations across tumors. We derive analytical calculations that enable us to avoid time-prohibitive permutation-based significance tests, making it computationally feasible to simultaneously consider multiple measures of protein site functionality. Our accompanying software, PertInInt, combines knowledge about sites participating in interactions with DNA, RNA, peptides, ions, or small molecules with domain, evolutionary conservation, and gene-level mutation data. When applied to 10,037 tumor samples, PertInInt uncovers both known and newly predicted cancer genes, while additionally revealing what types of interactions or other functionalities are disrupted. PertInInt’s analysis demonstrates that somatic mutations are frequently enriched in interaction sites and domains and implicates interaction perturbation as a pervasive cancer-driving event. 
  5. Abstract Motivation

    The traditional view of cancer evolution states that a cancer genome accumulates a sequential ordering of mutations over a long period of time. However, in recent years it has been suggested that a cancer genome may instead undergo a one-time catastrophic event, such as chromothripsis, where a large number of mutations instead occur simultaneously. A number of potential signatures of chromothripsis have been proposed. In this work, we provide a rigorous formulation and analysis of the ‘ability to walk the derivative chromosome’ signature originally proposed by Korbel and Campbell. In particular, we show that this signature, as originally envisioned, may not always be present in a chromothripsis genome and we provide a precise quantification of under what circumstances it would be present. We also propose a variation on this signature, the H/T alternating fraction, which allows us to overcome some of the limitations of the original signature.

    Results

    We apply our measure to both simulated data and a previously analyzed real cancer dataset and find that the H/T alternating fraction may provide useful signal for distinguishing genomes having acquired mutations simultaneously from those acquired in a sequential fashion.

    Availability and implementation

    An implementation of the H/T alternating fraction is available at https://bitbucket.org/oesperlab/ht-altfrac.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     