-
Abstract. Motivation: The standard bootstrap method is used throughout science and engineering to perform general-purpose non-parametric resampling and re-estimation. Among the most widely cited and widely used such applications is the phylogenetic bootstrap method, which Felsenstein proposed in 1985 as a means to place statistical confidence intervals on an estimated phylogeny (or estimate ‘phylogenetic support’). A key simplifying assumption of the bootstrap method is that input data are independent and identically distributed (i.i.d.). However, the i.i.d. assumption is an over-simplification for biomolecular sequence analysis, as Felsenstein noted. Results: In this study, we introduce a new sequence-aware non-parametric resampling technique, which we refer to as RAWR (‘RAndom Walk Resampling’). RAWR consists of random walks that synthesize and extend the standard bootstrap method and the ‘mirrored inputs’ idea of Landan and Graur. We apply RAWR to the task of phylogenetic support estimation. RAWR’s performance is compared to the state of the art using synthetic and empirical data that span a range of dataset sizes and evolutionary divergence. We show that RAWR support estimates offer comparable or typically superior type I and type II error compared to phylogenetic bootstrap support. We also conduct a re-analysis of large-scale genomic sequence data from a recent study of Darwin’s finches. Our findings clarify phylogenetic uncertainty in a charismatic clade that serves as an important model for complex adaptive evolution. Availability and implementation: Data and software are publicly available under open-source software and open data licenses at: https://gitlab.msu.edu/liulab/RAWR-study-datasets-and-scripts.
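For readers unfamiliar with the baseline that the abstract above compares against, the following is a minimal sketch of the standard phylogenetic bootstrap resampling step: alignment columns (sites) are drawn with replacement, which is where the i.i.d. assumption enters. It does not implement RAWR, whose random-walk procedure is not specified in the abstract. The function name `bootstrap_alignment`, the toy alignment, and the taxon labels are hypothetical and chosen only for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of a multiple sequence alignment.

    `alignment` maps taxon name -> aligned sequence (all the same length).
    Columns are sampled independently and uniformly with replacement, so
    any dependence between neighboring sites is ignored by construction.
    """
    n_sites = len(next(iter(alignment.values())))
    if any(len(seq) != n_sites for seq in alignment.values()):
        raise ValueError("all sequences must have the same aligned length")
    # Draw n_sites column indices with replacement.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    # Apply the same column choice to every taxon so columns stay intact.
    return {taxon: "".join(seq[c] for c in cols)
            for taxon, seq in alignment.items()}


# Toy alignment (hypothetical data, for illustration only).
alignment = {
    "taxonA": "ACGTAC",
    "taxonB": "ACGTTC",
    "taxonC": "ACCTAC",
}
rng = random.Random(0)  # fixed seed so the replicate is reproducible
replicate = bootstrap_alignment(alignment, rng)
```

In a full support analysis, a tree would be re-estimated on each of many such replicates, and the support for a branch would be the fraction of replicate trees that contain it; that re-estimation step is outside the scope of this sketch.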
-
Abstract. Understanding the origin and maintenance of adaptive phenotypic novelty is a central goal of evolutionary biology. However, both hybridization and incomplete lineage sorting can lead to genealogical discordance between the regions of the genome underlying adaptive traits and the remainder of the genome, decoupling inferences about character evolution from population history. Here, to disentangle these effects, we investigated the evolutionary origins and maintenance of Batesian mimicry between North American admiral butterflies (Limenitis arthemis) and their chemically defended model (Battus philenor) using a combination of de novo genome sequencing, whole-genome resequencing, and statistical introgression mapping. Our results suggest that balancing selection, arising from geographic variation in the presence or absence of the unpalatable model, has maintained two deeply divergent color patterning haplotypes that have been repeatedly sieved among distinct mimetic and nonmimetic lineages of Limenitis via introgressive hybridization.
-
An emerging discovery in phylogenomics is that interspecific gene flow has played a major role in the evolution of many different organisms. To what extent is the Tree of Life not truly a tree reflecting strict “vertical” divergence, but rather a more general graph structure known as a phylogenetic network, which also captures “horizontal” gene flow? The answer to this fundamental question depends not only upon densely sampled and divergent genomic sequence data, but also upon computational methods that are capable of accurately and efficiently inferring phylogenetic networks from large-scale genomic sequence datasets. Recent methodological advances have attempted to address this gap. However, in the 2016 performance study of Hejase and Liu, state-of-the-art methods fell well short of the scalability requirements of existing phylogenomic studies. The methodological gap remains: how can phylogenetic networks be accurately and efficiently inferred using genomic sequence data involving many dozens or hundreds of taxa? In this study, we address this gap by proposing a new phylogenetic divide-and-conquer method which we call FastNet. We conduct a performance study involving a range of evolutionary scenarios, and we demonstrate that FastNet outperforms state-of-the-art methods in terms of computational efficiency and topological accuracy.
