NSF PAR Search | NSF Public Access Repository

Selection leads to false inferences of introgression using popular methods

https://doi.org/10.1093/genetics/iyae089

Smith, Megan L; Hahn, Matthew W (May 2024, GENETICS)
Ralph, P (Ed.)
Detecting introgression between closely related populations or species is a fundamental objective in evolutionary biology. Existing methods for detecting migration and inferring migration rates from population genetic data often assume a neutral model of evolution. Growing evidence of the pervasive impact of selection on large portions of the genome across diverse taxa suggests that this assumption is unrealistic in most empirical systems. Further, ignoring selection has previously been shown to negatively impact demographic inferences (e.g. of population size histories). However, the impacts of biologically realistic selection on inferences of migration remain poorly explored. Here, we simulate data under models of background selection, selective sweeps, balancing selection, and adaptive introgression. We show that ignoring selection sometimes leads to false inferences of migration in popularly used methods that rely on the site frequency spectrum. Specifically, balancing selection and some models of background selection result in the rejection of isolation-only models in favor of isolation-with-migration models and lead to elevated estimates of migration rates. BPP, a method that analyzes sequence data directly, showed false positives for all conditions at recent divergence times, but balancing selection also led to false positives at medium-divergence times. Our results suggest that such methods may be unreliable in some empirical systems, such that new methods that are robust to selection need to be developed.
Full Text Available
Applications of machine learning in phylogenetics

https://doi.org/10.1016/j.ympev.2024.108066

Mo, Yu K; Hahn, Matthew W; Smith, Megan L (July 2024, Molecular Phylogenetics and Evolution)

Machine learning has increasingly been applied to a wide range of questions in phylogenetic inference. Supervised machine learning approaches that rely on simulated training data have been used to infer tree topologies and branch lengths, to select substitution models, and to perform downstream inferences of introgression and diversification. Here, we review how researchers have used several promising machine learning approaches to make phylogenetic inferences. Despite the promise of these methods, several barriers prevent supervised machine learning from reaching its full potential in phylogenetics. We discuss these barriers and potential paths forward. In the future, we expect that the application of careful network designs and data encodings will allow supervised machine learning to accommodate the complex processes that continue to confound traditional phylogenetic methods.
Full Text Available
Unprecedented female mutation bias in the aye-aye, a highly unusual lemur from Madagascar

https://doi.org/10.1371/journal.pbio.3003015

Wang, Richard J; Peña-García, Yadira; Raveendran, Muthuswamy; Harris, R Alan; Nguyen, Thuy-Trang; Gingras, Marie-Claude; Wu, Yifan; Perez, Lesette; Yoder, Anne D; Simmons, Joe H; et al (February 2025, PLOS Biology)
Hurst, Laurence D (Ed.)
Every mammal studied to date has been found to have a male mutation bias: male parents transmit more de novo mutations to offspring than female parents, contributing increasingly more mutations with age. Although male-biased mutation has been studied for more than 75 years, its causes are still debated. One obstacle to understanding this pattern is its near universality—without variation in mutation bias, it is difficult to find an underlying cause. Here, we present new data on multiple pedigrees from two primate species: aye-ayes (Daubentonia madagascariensis), a member of the strepsirrhine primates, and olive baboons (Papio anubis). In stark contrast to the pattern found across mammals, we find a much larger effect of maternal age than paternal age on mutation rates in the aye-aye. In addition, older aye-aye mothers transmit substantially more mutations than older fathers. We carry out both computational and experimental validation of our results, contrasting them with results from baboons and other primates using the same methodologies. Further, we analyze a set of DNA repair and replication genes to identify candidate mutations that may be responsible for the change in mutation bias observed in aye-ayes. Our results demonstrate that mutation bias is not an immutable trait, but rather one that can evolve between closely related species. Further work on aye-ayes (and possibly other lemuriform primates) should help to explain the molecular basis for sex-biased mutation.
Free, publicly-accessible full text available February 7, 2026
Repeated behavioural evolution is associated with convergence of gene expression in cavity-nesting songbirds

https://doi.org/10.1038/s41559-025-02675-x

Lipshutz, Sara E; Hibbins, Mark S; Bentz, Alexandra B; Buechlein, Aaron M; Empson, Tara A; George, Elizabeth M; Hauber, Mark E; Rusch, Douglas B; Schelsky, Wendy M; Thomas, Quinn K; et al (May 2025, Nature Ecology & Evolution)

Free, publicly-accessible full text available May 1, 2026
Phylogenetic inference using generative adversarial networks

https://doi.org/10.1093/bioinformatics/btad543

Smith, Megan L.; Hahn, Matthew W. (September 2023, Bioinformatics)
Schwartz, Russell (Ed.)

Motivation: The application of machine learning approaches in phylogenetics has been impeded by the vast model space associated with inference. Supervised machine learning approaches require data from across this space to train models. Because of this, previous approaches have typically been limited to inferring relationships among unrooted quartets of taxa, where there are only three possible topologies. Here, we explore the potential of generative adversarial networks (GANs) to address this limitation. GANs consist of a generator and a discriminator: at each step, the generator aims to create data that is similar to real data, while the discriminator attempts to distinguish generated and real data. By using an evolutionary model as the generator, we use GANs to make evolutionary inferences. Since a new model can be considered at each iteration, heuristic searches of complex model spaces are possible. Thus, GANs offer a potential solution to the challenges of applying machine learning in phylogenetics. Results: We developed phyloGAN, a GAN that infers phylogenetic relationships among species. phyloGAN takes as input a concatenated alignment, or a set of gene alignments, and infers a phylogenetic tree either considering or ignoring gene tree heterogeneity. We explored the performance of phyloGAN for up to 15 taxa in the concatenation case and 6 taxa when considering gene tree heterogeneity. Error rates are relatively low in these simple cases. However, run times are slow and performance metrics suggest issues during training. Future work should explore novel architectures that may result in more stable and efficient GANs for phylogenetics. Availability and implementation: phyloGAN is available on GitHub: https://github.com/meganlsmith/phyloGAN/.
Phylogenomic comparative methods: Accurate evolutionary inferences in the presence of gene tree discordance

https://doi.org/10.1073/pnas.2220389120

Hibbins, Mark S; Breithaupt, Lara C; Hahn, Matthew W (May 2023, Proceedings of the National Academy of Sciences)

Phylogenetic comparative methods have long been a mainstay of evolutionary biology, allowing for the study of trait evolution across species while accounting for their common ancestry. These analyses typically assume a single, bifurcating phylogenetic tree describing the shared history among species. However, modern phylogenomic analyses have shown that genomes are often composed of mosaic histories that can disagree both with the species tree and with each other—so-called discordant gene trees. These gene trees describe shared histories that are not captured by the species tree, and therefore that are unaccounted for in classic comparative approaches. The application of standard comparative methods to species histories containing discordance leads to incorrect inferences about the timing, direction, and rate of evolution. Here, we develop two approaches for incorporating gene tree histories into comparative methods: one that constructs an updated phylogenetic variance–covariance matrix from gene trees, and another that applies Felsenstein's pruning algorithm over a set of gene trees to calculate trait histories and likelihoods. Using simulation, we demonstrate that our approaches generate much more accurate estimates of tree-wide rates of trait evolution than standard methods. We apply our methods to two clades of the wild tomato genus Solanum with varying rates of discordance, demonstrating the contribution of gene tree discordance to variation in a set of floral traits. Our approaches have the potential to be applied to a broad range of classic inference problems in phylogenetics, including ancestral state reconstruction and the inference of lineage-specific rate shifts.
Full Text Available
Ecological and anthropogenic effects on the genomic diversity of lemurs in Madagascar

https://doi.org/10.1038/s41559-024-02596-1

Orkin, Joseph D; Kuderna, Lukas F K; Hermosilla-Albala, Núria; Fontsere, Claudia; Aylward, Megan L; Janiak, Mareike C; Andriaholinirina, Nicole; Balaresque, Patricia; Blair, Mary E; Fausser, Jean-Luc; et al (January 2025, Nature Ecology & Evolution)

Full Text Available
CAGEE: Computational Analysis of Gene Expression Evolution

https://doi.org/10.1093/molbev/msad106

Bertram, Jason; Fulton, Ben; Tourigny, Jason P; Peña-Garcia, Yadira; Moyle, Leonie C; Hahn, Matthew W (May 2023, Molecular Biology and Evolution)
Nowick, Katja (Ed.)
Despite the increasing abundance of whole transcriptome data, few methods are available to analyze global gene expression across phylogenies. Here, we present a new software package (CAGEE) for inferring patterns of increases and decreases in gene expression across a phylogenetic tree, as well as the rate at which these changes occur. In contrast to previous methods that treat each gene independently, CAGEE can calculate genome-wide rates of gene expression, along with ancestral states for each gene. The statistical approach developed here makes it possible to infer lineage-specific shifts in rates of evolution across the genome, in addition to possible differences in rates among multiple tissues sampled from the same species. We demonstrate the accuracy and robustness of our method on simulated data, and apply it to a data set of ovule gene expression collected from multiple self-compatible and self-incompatible species in the genus Solanum to test hypotheses about the evolutionary forces acting during mating system shifts. These comparisons allow us to highlight the power of CAGEE, demonstrating its utility for use in any empirical system and for the analysis of most morphological traits. Our software is available at https://github.com/hahnlab/CAGEE/.
Full Text Available
Human generation times across the past 250,000 years

https://doi.org/10.1126/sciadv.abm7047

Wang, Richard J.; Al-Saffar, Samer I.; Rogers, Jeffrey; Hahn, Matthew W. (January 2023, Science Advances)

Men have always had children at an older age than women, even among diverse populations, but this age gap has recently shrunk.
Full Text Available
Repeated behavioral evolution is associated with targeted convergence of gene expression in cavity-nesting songbirds

https://doi.org/10.1101/2024.02.13.580205

Lipshutz, Sara E; Hibbins, Mark S; Bentz, Alexandra B; Buechlein, Aaron M; Empson, Tara A; George, Elizabeth M; Hauber, Mark E; Rusch, Douglas B; Schelsky, Wendy M; Thomas, Quinn K; et al (February 2024, bioRxiv)

Uncovering the genomic bases of phenotypic adaptation is a major goal in biology, but this has been hard to achieve for complex behavioral traits. Here, we leverage the repeated, independent evolution of obligate cavity-nesting in birds to test the hypothesis that pressure to compete for a limited breeding resource has facilitated convergent evolution in behavior, hormones, and gene expression. We used an integrative approach, combining aggression assays in the field, testosterone measures, and transcriptome-wide analyses of the brain in wild-captured females and males. Our experimental design compared species pairs across five avian families, each including one obligate cavity-nesting species and a related species with a more flexible nest strategy. We find behavioral convergence, with higher levels of territorial aggression in obligate cavity-nesters, particularly among females. Across species, levels of testosterone in circulation were not associated with nest strategy, nor aggression. Phylogenetic analyses of individual genes and co-regulated gene networks revealed more shared patterns of brain gene expression than expected by drift, but the scope of convergent gene expression evolution was limited to a small percent of the genome. When comparing our results to other studies that did not use phylogenetic methods, we suggest that accounting for shared evolutionary history may reduce the number of genes inferred as convergently evolving. Altogether, we find that behavioral convergence in response to shared ecological pressures is associated with largely independent gene expression evolution across different avian families, punctuated by a narrow set of convergently evolving genes.
Full Text Available
