Statistical summaries of unlabelled evolutionary trees

Samyak, Rajanala; Palacios, Julia A

doi:10.1093/biomet/asad025

Summary: Rooted and ranked phylogenetic trees are mathematical objects used to model hierarchical data and evolutionary relationships, with applications in fields such as evolutionary biology and genetic epidemiology. Bayesian phylogenetic inference usually explores the posterior distribution of trees via Markov chain Monte Carlo methods. However, assessing uncertainty and summarizing distributions remain challenging for these types of structures. While labelled phylogenetic trees have been studied extensively, comparatively little literature exists on unlabelled trees, which are increasingly useful, for example when one seeks to summarize samples of trees obtained with different methods, or from different samples and environments, and wishes to assess the stability and generalizability of these summaries. In our paper, we exploit recently proposed distance metrics on unlabelled ranked binary trees and unlabelled ranked genealogies, or trees equipped with branch lengths, to define the Fréchet mean, variance and interquartile sets as summaries of these tree distributions. We provide an efficient combinatorial optimization algorithm for computing the Fréchet mean of a sample or of distributions on unlabelled ranked tree shapes and unlabelled ranked genealogies. We show the applicability of our summary statistics for studying popular tree distributions and for comparing the SARS-CoV-2 evolutionary trees across different locations during the COVID-19 epidemic in 2020. Our current implementations are publicly available at https://github.com/RSamyak/fmatrix.
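As a rough illustration of the Fréchet mean and variance named in the summary, the sketch below picks, from a finite candidate set, the element that minimizes the average squared distance to a sample, and reports that minimum as the Fréchet variance. It is not the authors' algorithm: the paper searches the full tree space by combinatorial optimization, whereas this sketch only scans a supplied candidate list. The function names `frechet_mean` and `l1`, and the integer tuples standing in for tree encodings, are hypothetical and chosen only for the example.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def frechet_mean(
    candidates: Sequence[T],
    sample: Sequence[T],
    dist: Callable[[T, T], float],
) -> Tuple[T, float]:
    """Return (minimizer, Frechet variance) over a finite candidate set.

    Assumption: the search is restricted to `candidates`; this is a
    brute-force scan, not the paper's optimization algorithm.
    """
    if not candidates or not sample:
        raise ValueError("candidates and sample must be non-empty")

    def mean_squared_distance(c: T) -> float:
        # Average squared distance from candidate c to every sampled object.
        return sum(dist(c, s) ** 2 for s in sample) / len(sample)

    best = min(candidates, key=mean_squared_distance)
    return best, mean_squared_distance(best)


def l1(a: Sequence[int], b: Sequence[int]) -> int:
    """Entrywise L1 distance between two equal-length integer encodings."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical integer encodings standing in for trees (not the paper's data).
sample = [(2, 1, 3), (2, 1, 2), (2, 2, 3), (2, 1, 3)]

# Restrict the search to the sampled objects themselves.
mean, variance = frechet_mean(sample, sample, l1)
print(mean, variance)
```

Restricting the candidates to the sample keeps the example short; any other finite candidate list and any other metric could be passed in the same way.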

Award ID(s): 2143242

PAR ID: 10496156

Author(s) / Creator(s): Samyak, Rajanala; Palacios, Julia A

Publisher / Repository: Oxford University Press

Date Published: 2023-04-26

Journal Name: Biometrika

Edition / Version: 1

Volume: 111

Issue: 1

ISSN: 0006-3444

Page Range / eLocation ID: 171-193

Subject(s) / Keyword(s): Binary tree; Combinatorial optimization; Evolutionary tree; Frechet mean; Summarizing tree; Unlabeled tree

Format(s): Medium: X; Size: 1MB; Other: pdf

Size(s): 1MB

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1093/biomet/asad025
