Title: Phylogeographic model selection using convolutional neural networks
Abstract The discipline of phylogeography has evolved rapidly in terms of the analytical toolkit used to analyse large genomic data sets. Despite substantial advances, analytical tools that could potentially address the challenges posed by increased model complexity have not been fully explored. For example, deep learning techniques are underutilized for phylogeographic model selection. In non‐model organisms, the lack of information about their ecology and evolution can lead to uncertainty about which demographic models are appropriate. Here, we assess the utility of convolutional neural networks (CNNs) for assessing demographic models in South American lizards in the genus Norops. Three demographic scenarios (constant, expansion, and bottleneck) were considered for each of four inferred population‐level lineages, and we found that the overall model accuracy was higher than 98% for all lineages. We then evaluated a set of 26 models that accounted for evolutionary relationships, gene flow, and changes in effective population size among the four lineages, identifying a single model with an estimated overall accuracy of 87% when using CNNs. The inferred demography of the lizard system suggests that gene flow between non‐sister populations and changes in effective population sizes through time, probably in response to Pleistocene climatic oscillations, have shaped genetic diversity in this system. Approximate Bayesian computation (ABC) was applied to provide a comparison to the performance of CNNs. ABC was unable to identify a single model among the larger set of 26 models in the subsequent analysis. Our results demonstrate that CNNs can be easily and usefully incorporated into the phylogeographer's toolkit.
Award ID(s):
1831319
PAR ID:
10566027
Author(s) / Creator(s):
; ; ;
Editor(s):
Fountain-Jones, Nicholas M; Smith, Megan L; Austerlitz, Frédéric
Publisher / Repository:
Molecular Ecology Resources
Date Published:
Journal Name:
Molecular Ecology Resources
Volume:
21
Issue:
8
ISSN:
1755-098X
Page Range / eLocation ID:
2661 to 2675
Subject(s) / Keyword(s):
convolutional neural networks, deep learning, machine learning, Norops spp., phylogeography
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Climate change poses a threat to biodiversity, and it is unclear whether species can adapt to or tolerate new conditions, or migrate to areas with suitable habitats. Reconstructions of range shifts that occurred in response to environmental changes since the last glacial maximum (LGM) from species distribution models (SDMs) can provide useful data to inform conservation efforts. However, different SDM algorithms and climate reconstructions often produce contrasting patterns, and validation methods typically focus on accuracy in recreating current distributions, limiting their relevance for assessing predictions to the past or future. We modeled historically suitable habitat for the threatened North American tree green ash Fraxinus pennsylvanica using 24 SDMs built using two climate models, three calibration regions, and four modeling algorithms. We evaluated the SDMs using contemporary data with spatial block cross‐validation and compared the relative support for alternative models using a novel integrative method based on coupled demographic‐genetic simulations. We simulated genomic datasets using habitat suitability of each of the 24 SDMs in a spatially‐explicit model. Approximate Bayesian computation (ABC) was then used to evaluate the support for alternative SDMs through comparisons to an empirical population genomic dataset. Models had very similar performance when assessed with contemporary occurrences using spatial cross‐validation, but ABC model selection analyses consistently supported SDMs based on the CCSM climate model, an intermediate calibration extent, and the generalized linear modeling algorithm. Finally, we projected the future range of green ash under four climate change scenarios. Future projections using the SDMs selected via ABC suggest only minor shifts in suitable habitat for this species, while some of those that were rejected predicted dramatic changes. Our results highlight the different inferences that may result from the application of alternative distribution modeling algorithms and provide a novel approach for selecting among a set of competing SDMs with independent data.
  2. Abstract Population demographic changes, alongside landscape, geographic and climate heterogeneity, can influence the timing, stability and extent of introgression where species hybridise. Thus, quantifying interactions across diverged lineages, and the relative contributions of interspecific genetic exchange and selection to divergence at the genome‐wide level is needed to better understand the drivers of hybrid zone formation and maintenance. We used seven latitudinally arrayed transects to quantify the contributions of climate, geography and landscape features to broad patterns of genetic structure across the hybrid zone of Populus trichocarpa and P. balsamifera and evaluated the demographic context of hybridisation over time. We found genetic structure differed among the seven transects. While ancestry was structured by climate, landscape features influenced gene flow dynamics. Demographic models indicated a secondary contact event may have influenced contemporary hybrid zone formation with the origin of a putative hybrid lineage that inhabits regions with higher aridity than either of the ancestral groups. Phylogenetic relationships based on chloroplast genomes support the origin of this hybrid lineage inferred from demographic models based on the nuclear data. Our results point towards the importance of climate and landscape patterns in structuring the contact zones between P. trichocarpa and P. balsamifera and emphasise the value whole genome sequencing can have to advancing our understanding of how neutral processes influence divergence across space and time.
  3. Due to pervasive gene flow and admixture, simple bifurcating trees often do not provide an accurate representation of relationships among diverging lineages, but limited resolution in the available genomic data and the spatial distribution of samples has hindered detailed insights regarding the evolutionary and demographic history of many species and populations. In this issue of Molecular Ecology, Foote et al. (2019) combine a powerful sampling design with novel analytical methods adopted from human genetics to describe previously unrecognized patterns of recurrent vicariance and admixture among lineages in the globally distributed killer whale (Orcinus orca). Based on sequence data from modern samples alone, they discover clear signatures of ancient admixture with a now extinct “ghost” lineage, providing one of the first accounts of archaic introgression in a nonhominid species. Coupling a cost‐effective sequencing strategy with novel analytical approaches, their paper provides a roadmap for advancing inference of evolutionary history in other nonmodel species, promising exciting times ahead for our field.
  4. Admixture appears increasingly ubiquitous in the evolutionary history of various taxa, including humans. Such gene flow likely also occurred among our closest living relatives: bonobos (Pan paniscus) and chimpanzees (Pan troglodytes). However, our understanding of their evolutionary history has been limited by studies that do not consider all Pan lineages or do not analyze all lineages simultaneously, resulting in conflicting demographic models. Here, we investigate this gap in knowledge using nucleotide site patterns calculated from whole-genome sequences from the autosomes of 71 bonobos and chimpanzees, representing all five extant Pan lineages. We estimated demographic parameters and compared all previously proposed demographic models for this clade. We further considered sex bias in Pan evolutionary history by analyzing the site patterns from the X chromosome. We show that 1) 21% of autosomal DNA in eastern chimpanzees derives from western chimpanzee introgression and that 2) all four chimpanzee lineages share a common ancestor about 987,000 y ago, much earlier than previous estimates. In addition, we suggest that 3) there was male reproductive skew throughout Pan evolutionary history and find evidence of 4) male-biased dispersal from western to eastern chimpanzees. Collectively, these results offer insight into bonobo and chimpanzee evolutionary history and suggest considerable differences between current and historic chimpanzee biogeography.
  5. Abstract New study systems and tools are needed to understand how divergence and speciation occur between lineages with gene flow. Migratory birds often exhibit divergence despite seasonal migration, which brings populations into contact with one another. We studied divergence between 2 subspecies of Northern Saw-whet Owl (Aegolius acadicus), in which a sedentary population on the islands of Haida Gwaii, British Columbia (A. a. brooksi), exists in the presence of the other form (A. a. acadicus) during migration but not during the breeding season. Prior research showed fixed mtDNA divergence but left open the question of nuclear gene flow. We used 2,517 ultraconserved element loci to examine the demographic history of this young taxon pair. Although we did not observe fixed single nucleotide polymorphism differences between populations among our genotyped individuals, 100% of the birds were diagnosable and δaδI analyses suggested the demographic model best fitting the data was one of split-bidirectional-migration (i.e. speciation with gene flow). We dated the split between brooksi and acadicus to ~278 Kya, and our analyses suggested gene flow between groups was skewed, with ~0.7 individuals per generation coming from acadicus into brooksi and ~4.4 going the opposite direction. Coupled with an absence of evidence of phenotypic hybrids and the birds’ natural history, these data suggest brooksi may be a young biological species arising despite historic gene flow. 