Title: Fast and flexible estimation of effective migration surfaces
Spatial population genetic data often exhibit ‘isolation-by-distance,’ where genetic similarity tends to decrease as individuals become more geographically distant. The rate at which genetic similarity decays with distance is often spatially heterogeneous due to variable population processes like genetic drift, gene flow, and natural selection. Petkova et al. (2016) developed a statistical method called Estimating Effective Migration Surfaces (EEMS) for visualizing spatially heterogeneous isolation-by-distance on a geographic map. While EEMS is a powerful tool for depicting spatial population structure, it can suffer from slow runtimes. Here, we develop a related method called Fast Estimation of Effective Migration Surfaces (FEEMS). FEEMS uses a Gaussian Markov Random Field model in a penalized likelihood framework that allows for efficient optimization and output of effective migration surfaces. Further, the efficient optimization facilitates the inference of migration parameters per edge in the graph, rather than per node (as in EEMS). With simulations, we show conditions under which FEEMS can accurately recover effective migration surfaces with complex gene-flow histories, including those with anisotropy. We apply FEEMS to population genetic data from North American gray wolves and show it performs favorably in comparison to EEMS, with solutions obtained orders of magnitude faster. Overall, FEEMS expands the ability of users to quickly visualize and interpret spatial structure in their data.
Award ID(s):
1654076
PAR ID:
10331198
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
eLife
Volume:
10
ISSN:
2050-084X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The estimation of malaria parasite migration can play a vital role in informing elimination strategies by pinpointing regions with higher parasite migration that act as transmission sources, and that could be the focus of elimination interventions. Gene flow simulation methods such as Estimating Effective Migration Surfaces (EEMS) and Migration and Population-Size Surfaces (MAPS) use a Markov Chain Monte Carlo simulation-based approach to visualize a species' migration and diversity. These methods utilize georeferenced genomic data and present output in the form of migration contour maps. Despite their potential, there is uncertainty in EEMS and MAPS outputs when sampling locations are sparse, an aspect that remains under-explored in current research. We present a framework designed to systematically assess the impact of sample locations and sample size on migration contours in gene flow simulations that goes beyond the posterior probability map available in EEMS. We test our framework using publicly available genomic data collected from Cambodia and border regions of Thailand, Vietnam, and Laos during 2008-2013. The methodology leverages kernel density estimation and topological skeletons in conjunction with other spatial analysis methods to quantify the impact of sparse sample locations on gene flow simulations. Multiple sample resolutions were tested against a baseline resolution, and the findings highlight how migration contours vary with sampling resolution and how our approach can be applied to guide the production and mapping of reliable migration contours. Our research provides valuable insights about both the reliability and precision of model outputs when employing gene flow simulation techniques, such as EEMS and MAPS, to estimate malaria parasite migration. 
The findings revealed that by employing our approach, we were able to maintain approximately 67% consistency between the contours and the reference dataset, even when utilizing only half of the sample locations. This knowledge will improve both the reliability and precision of these model outputs in future studies. 
  2. Abstract Pouched lamprey (Geotria australis) or kanakana/piharau is a culturally and ecologically significant jawless fish that is distributed throughout Aotearoa New Zealand. Despite its importance, much remains unknown about historical relationships and gene flow between populations of this enigmatic species within New Zealand. To help inform management, we assembled a draft Geotria australis genome and completed the first comprehensive population genomics analysis of pouched lamprey within New Zealand using targeted gene sequencing (Cyt-b and COI) and restriction site-associated DNA sequencing (RADSeq) methods. Employing 16,000 genome-wide single nucleotide polymorphisms (SNPs) derived from RADSeq (n=186) and sequence data from Cyt-b (766 bp, n=94) and COI (589 bp, n=20), we reveal low levels of structure across 10 sampling locations spanning the species range within New Zealand. F-statistics, outlier analyses, and STRUCTURE suggest a single panmictic population, and Mantel and EEMS tests reveal no significant isolation by distance. This implies either ongoing gene flow among populations or recent shared ancestry among New Zealand pouched lamprey. We can now use the information gained from these genetic tools to assist managers with monitoring effective population size, managing potential diseases, and conservation measures such as artificial propagation programs. We further demonstrate the general utility of these genetic tools for acquiring information about elusive species. 
  3. Abstract In high-latitude species with high dispersal ability, such as long-distance migratory birds, populations are often assumed to exhibit little genetic structure due to high gene flow or recent postglacial expansion. We sequenced over 120 low-coverage whole genomes from across the breeding range of a long-distance migratory bird, the Veery (Catharus fuscescens), revealing strong evidence for isolation by distance. Additionally, we found distinct genetic structure between boreal, western montane U.S., and southern Appalachian sampling regions. We suggest that population genetic structure in this highly migratory species is detectable with the high resolution afforded by whole-genomic data because, similar to many migratory birds, the Veery exhibits high breeding-site fidelity, which likely limits gene flow. Resolution of isolation by distance across the breeding range was sufficient to assign likely breeding origins of individuals sampled in this species’ poorly understood South American nonbreeding range, demonstrating the potential to assess migratory connectivity in this species using genomic data. As the Veery’s breeding range extends across both historically glaciated and unglaciated regions in North America, we also evaluated whether contemporary patterns of structure and genetic diversity are consistent with historical population isolation in glacial refugia. We found that patterns of genetic diversity did not support southern montane regions (southern Appalachians or western U.S. mountains) as glacial refugia. Overall, our findings suggest that isolation by distance yields subtle associations between genetic structure and geography across the breeding range of this highly vagile species even in the absence of obvious historical vicariance or contemporary barriers to dispersal. 
  4. Abstract Genetic connectivity lies at the heart of evolutionary theory, and landscape genetics has rapidly advanced to understand how gene flow can be impacted by the environment. Isolation by landscape resistance, often inferred through the use of circuit theory, is increasingly identified as being critical for predicting genetic connectivity across complex landscapes. Yet landscape impediments to migration can arise from fundamentally different processes, such as landscape gradients causing directional migration and mortality during migration, which can be challenging to address. Spatial absorbing Markov chains (SAMC) have been introduced to understand and predict these (and other) processes affecting connectivity in ecological settings, but the relationship of this framework to landscape genetics remains unclear. Here, we relate the SAMC to population genetics theory, provide simulations to interpret the extent to which the SAMC can predict genetic metrics, and demonstrate how the SAMC can be applied to genomic data using an example with an endangered species, the Panama City crayfish Procambarus econfinae, where directional migration is hypothesized to occur. The use of the SAMC for landscape genetics can be justified based on similar grounds to using circuit theory, as we show how circuit theory is a special case of this framework. The SAMC can extend circuit‐theoretic connectivity modelling by quantifying both directional resistance to migration and acknowledging the difference between migration mortality and resistance to migration. Our empirical example highlights that the SAMC better predicts population structure than circuit theory and least‐cost analysis by acknowledging asymmetric environmental gradients (i.e. slope) and migration mortality in this species. These results provide a foundation for applying the SAMC to landscape genetics. 
This framework extends isolation‐by‐resistance modelling to account for some common processes that can impact gene flow, which can improve predicting genetic connectivity across complex landscapes. 
  5. Abstract One key research goal of evolutionary biology is to understand the origin and maintenance of genetic variation. In the Cerrado, the South American savanna located primarily in the Central Brazilian Plateau, many hypotheses have been proposed to explain how landscape features (e.g., geographic distance, river barriers, topographic compartmentalization, and historical climatic fluctuations) have promoted genetic structure by mediating gene flow. Here, we asked whether these landscape features have influenced the genetic structure and differentiation in the lizard species Norops brasiliensis (Squamata: Dactyloidae). To achieve our goal, we used a genetic clustering analysis and estimated an effective migration surface to assess genetic structure in the focal species. Optimized isolation-by-resistance models and a simulation-based approach combined with machine learning (convolutional neural network; CNN) were then used to infer current and historical effects on population genetic structure through 12 unique landscape models. We recovered five geographically distributed populations that are separated by regions of lower-than-expected gene flow. The results of the CNN showed that geographic distance is the sole predictor of genetic variation in N. brasiliensis, and that slope, rivers, and historical climate had no discernible influence on gene flow. Our novel CNN approach was accurate (89.5%) in differentiating each landscape model. CNN and other machine learning approaches are still largely unexplored in landscape genetics studies, representing promising avenues for future research with increasingly accessible genomic datasets. 