Phylogenetic network is an evolutionary model that uses a rooted directed acyclic graph (instead of a tree) to model an evolutionary history of species in which reticulate events (e.g., hybrid speciation or horizontal gene transfer) occurred. Tree-child network is a kind of phylogenetic network with structural constraints. Existing approaches for tree-child network reconstruction can be slow for large data. In this study, we present several computational approaches for bounding from below the number of reticulations in a tree-child network that displays a given set of rooted binary phylogenetic trees. In addition, we also present some theoretical results on bounding from above the number of reticulations. Through simulation, we demonstrate that the new lower bounds on the reticulation number for tree-child networks can practically be computed for large tree data. The bounds can provide estimates of reticulation for relatively large data.
more »
« less
All galls are divided into three or more parts: recursive enumeration of labeled histories for galled trees
Abstract Objective In mathematical phylogenetics, a labeled rooted binary tree topology can possess any of a number of labeled histories, each of which represents a possible temporal ordering of its coalescences. Labeled histories appear frequently in calculations that describe the combinatorics of phylogenetic trees. Here, we generalize the concept of labeled histories from rooted phylogenetic trees to rooted phylogenetic networks, specifically for the class of rooted phylogenetic networks known as rooted galled trees . Results Extending a recursive algorithm for enumerating the labeled histories of a labeled tree topology, we present a method to enumerate the labeled histories associated with a labeled rooted galled tree. The method relies on a recursive decomposition by which each gall in a galled tree possesses three or more descendant subtrees. We exhaustively provide the numbers of labeled histories for all small galled trees, finding that each gall reduces the number of labeled histories relative to a specified galled tree that does not contain it. Conclusion The results expand the set of structures for which labeled histories can be enumerated, extending a well-known calculation for phylogenetic trees to a class of phylogenetic networks.
more »
« less
- Award ID(s):
- 2116322
- PAR ID:
- 10412088
- Date Published:
- Journal Name:
- Algorithms for Molecular Biology
- Volume:
- 18
- Issue:
- 1
- ISSN:
- 1748-7188
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
null (Ed.)Tree-based phylogenetic networks, which may be roughly defined as leaf-labeled networks built by adding arcs only between the original tree edges, have elegant properties for modeling evolutionary histories. We answer an open question of Francis, Semple, and Steel about the complexity of determining how far a phylogenetic network is from being tree-based, including non-binary phylogenetic networks. We show that finding a phylogenetic tree covering the maximum number of nodes in a phylogenetic network can be computed in polynomial time via an encoding into a minimum-cost flow problem.more » « less
-
null (Ed.)Abstract We introduce a non-increasing tree growth process $$((T_n,{\sigma}_n),\, n\ge 1)$$ , where T n is a rooted labelled tree on n vertices and σ n is a permutation of the vertex labels. The construction of ( T n , σ n ) from ( T n −1 , σ n −1 ) involves rewiring a random (possibly empty) subset of edges in T n −1 towards the newly added vertex; as a consequence T n −1 ⊄ T n with positive probability. The key feature of the process is that the shape of T n has the same law as that of a random recursive tree, while the degree distribution of any given vertex is not monotone in the process. We present two applications. First, while couplings between Kingman’s coalescent and random recursive trees were known for any fixed n , this new process provides a non-standard coupling of all finite Kingman’s coalescents. Second, we use the new process and the Chen–Stein method to extend the well-understood properties of degree distribution of random recursive trees to extremal-range cases. Namely, we obtain convergence rates on the number of vertices with degree at least $$c\ln n$$ , c ∈ (1, 2), in trees with n vertices. Further avenues of research are discussed.more » « less
-
Abstract Rooted binarygalledtrees generalize rooted binary trees to allow a restricted class of cycles, known asgalls. We build upon the Wedderburn-Etherington enumeration of rooted binary unlabeled trees withnleaves to enumerate rooted binary unlabeled galled trees withnleaves, also enumerating rooted binary unlabeled galled trees withnleaves andggalls,$$0 \leqslant g \leqslant \lfloor \frac{n-1}{2} \rfloor $$ . The enumerations rely on a recursive decomposition that considers subtrees descended from the nodes of a gall, adopting a restriction on galls that amounts to considering only the rooted binarynormalunlabeled galled trees in our enumeration. We write an implicit expression for the generating function encoding the numbers of trees for alln. We show that the number of rooted binary unlabeled galled trees grows with$$0.0779(4.8230^n)n^{-\frac{3}{2}}$$ , exceeding the growth$$0.3188(2.4833^n)n^{-\frac{3}{2}}$$ of the number of rooted binary unlabeled trees without galls. However, the growth of the number of galled trees with only one gall has the same exponential order 2.4833 as the number with no galls, exceeding it only in the subexponential term,$$0.3910n^{\frac{1}{2}}$$ compared to$$0.3188n^{-\frac{3}{2}}$$ . For a fixed number of leavesn, the number of gallsgthat produces the largest number of rooted binary unlabeled galled trees lies intermediate between the minimum of$$g=0$$ and the maximum of$$g=\lfloor \frac{n-1}{2} \rfloor $$ . We discuss implications in mathematical phylogenetics.more » « less
-
The reconstruction of phylogenetic networks is an important but challenging problem in phylogenetics and genome evolution, as the space of phylogenetic networks is vast and cannot be sampled well. One approach to the problem is to solve the minimum phylogenetic network problem, in which phylogenetic trees are first inferred, and then the smallest phylogenetic network that displays all the trees is computed. The approach takes advantage of the fact that the theory of phylogenetic trees is mature, and there are excellent tools available for inferring phylogenetic trees from a large number of biomolecular sequences. A tree–child network is a phylogenetic network satisfying the condition that every nonleaf node has at least one child that is of indegree one. Here, we develop a new method that infers the minimum tree–child network by aligning lineage taxon strings in the phylogenetic trees. This algorithmic innovation enables us to get around the limitations of the existing programs for phylogenetic network inference. Our new program, named ALTS, is fast enough to infer a tree–child network with a large number of reticulations for a set of up to 50 phylogenetic trees with 50 taxa that have only trivial common clusters in about a quarter of an hour on average.more » « less
An official website of the United States government

