Abstract Low‐coverage whole‐genome sequencing (WGS) is increasingly used for the study of evolution and ecology in both model and non‐model organisms; however, effective application of low‐coverage WGS data requires the implementation of probabilistic frameworks to account for the uncertainties in genotype likelihoods. Here, we present a probabilistic framework for using genotype likelihoods for standard population assignment applications. Additionally, we derive the Fisher information for allele frequency from genotype likelihoods and use that to describe a novel metric, the effective sample size, which figures heavily in assignment accuracy. We make these developments available for application through WGSassign, an open‐source software package that is computationally efficient for working with whole‐genome data. Using simulated and empirical data sets, we demonstrate the behaviour of our assignment method across a range of population structures, sample sizes and read depths. Through these results, we show that WGSassign can provide highly accurate assignment, even for samples with low average read depths (<0.01X) and among weakly differentiated populations. Our simulation results highlight the importance of equalizing the effective sample sizes among source populations in order to achieve accurate population assignment with low‐coverage WGS data. We further provide study design recommendations for population assignment studies and discuss the broad utility of effective sample size for studies using low‐coverage WGS data. -
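As a rough illustration of the two ideas in this abstract, the sketch below computes assignment log-likelihoods from genotype likelihoods and an information-based effective sample size for a single locus. It is not WGSassign's code or API: the function names are hypothetical, loci are assumed biallelic and independent, genotype priors are assumed to follow Hardy-Weinberg proportions, and the effective sample size is assumed to be the observed Fisher information for the allele frequency rescaled by f(1 - f)/2, which may differ in detail from the definition used in the paper.

```python
import math

def hwe_probs(f):
    """P(0, 1, 2 copies of the allele | frequency f), assuming Hardy-Weinberg."""
    f = min(max(f, 1e-6), 1.0 - 1e-6)  # clamp so logs stay finite at fixed alleles
    return ((1.0 - f) ** 2, 2.0 * f * (1.0 - f), f ** 2)

def assignment_logliks(gls, pop_freqs):
    """Log-likelihood of one individual under each source population.

    gls:       list of (GL0, GL1, GL2) tuples, one per locus (hypothetical input layout)
    pop_freqs: list of per-population allele-frequency lists, same locus order
    """
    logliks = []
    for freqs in pop_freqs:
        total = 0.0
        for gl, f in zip(gls, freqs):
            # Sum over the unknown genotype, weighting each genotype
            # likelihood by its Hardy-Weinberg probability.
            total += math.log(sum(l * p for l, p in zip(gl, hwe_probs(f))))
        logliks.append(total)
    return logliks

def posterior(logliks):
    """Posterior assignment probabilities, assuming equal priors across populations."""
    m = max(logliks)
    weights = [math.exp(x - m) for x in logliks]  # subtract max for numerical stability
    s = sum(weights)
    return [w / s for w in weights]

def effective_sample_size(gls_at_locus, f):
    """Assumed definition: observed Fisher information for f times f(1 - f) / 2.

    With perfectly known genotypes and f at its maximum-likelihood estimate
    this equals the number of diploid individuals; uninformative (flat)
    genotype likelihoods contribute zero.
    """
    info = 0.0
    for a, b, c in gls_at_locus:
        p = a * (1 - f) ** 2 + 2 * b * f * (1 - f) + c * f ** 2
        dp = -2 * a * (1 - f) + 2 * b * (1 - 2 * f) + 2 * c * f
        d2p = 2 * a - 4 * b + 2 * c
        info -= d2p / p - (dp / p) ** 2  # minus the second derivative of log p
    return info * f * (1 - f) / 2

# Example: an individual whose reads favour the homozygous-alternate genotype
# at three loci, scored against two hypothetical source populations.
gls = [(0.01, 0.09, 0.90)] * 3
print(posterior(assignment_logliks(gls, [[0.1] * 3, [0.9] * 3])))
```

Under these assumptions, a locus with no reads (flat genotype likelihoods) adds the same constant to every population's log-likelihood and so does not affect the assignment, which is one way to see why very low read depths can still be usable when many loci are combined.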
Abstract Migration is driven by a combination of environmental and genetic factors, but many questions remain about those drivers. Potential interactions between genetic and environmental variants associated with different migratory phenotypes are rarely the focus of study. We pair low coverage whole genome resequencing with a de novo genome assembly to examine population structure, inbreeding, and the environmental factors associated with genetic differentiation between migratory and resident breeding phenotypes in a species of conservation concern, the western burrowing owl (Athene cunicularia hypugaea). Our analyses reveal a dichotomy in gene flow depending on whether the population is resident or migratory, with the former being genetically structured and the latter exhibiting no signs of structure. Among resident populations, we observed significantly higher genetic differentiation, significant isolation‐by‐distance, and significantly elevated inbreeding. Among migratory breeding groups, on the other hand, we observed lower genetic differentiation, no isolation‐by‐distance, and substantially lower inbreeding. Using genotype–environment association analysis, we find significant evidence for relationships between migratory phenotypes (i.e., migrant versus resident) and environmental variation associated with cold temperatures during the winter and barren, open habitats. In the regions of the genome most differentiated between migrants and residents, we find significant enrichment for genes associated with the metabolism of fats. This may be linked to the increased pressure on migrants to process and store fats more efficiently in preparation for and during migration. Our results provide a significant contribution toward understanding the evolution of migratory behavior and vital insight into ongoing conservation and management efforts for the western burrowing owl. -
Abstract Understanding the geographic linkages among populations across the annual cycle is an essential component for understanding the ecology and evolution of migratory species and for facilitating their effective conservation. While genetic markers have been widely applied to describe migratory connections, the rapid development of new sequencing methods, such as low‐coverage whole genome sequencing (lcWGS), provides new opportunities for improved estimates of migratory connectivity. Here, we use lcWGS to identify fine‐scale population structure in a widespread songbird, the American Redstart (Setophaga ruticilla), and accurately assign individuals to genetically distinct breeding populations. Assignment of individuals from the nonbreeding range reveals population‐specific patterns of varying migratory connectivity. By combining migratory connectivity results with demographic analysis of population abundance and trends, we consider full annual cycle conservation strategies for preserving numbers of individuals and genetic diversity. Notably, we highlight the importance of the Northern Temperate‐Greater Antilles migratory population as containing the largest proportion of individuals in the species. Finally, we highlight valuable considerations for other population assignment studies aimed at using lcWGS. Our results have broad implications for improving our understanding of the ecology and evolution of migratory species through conservation genomics approaches. -
Abstract Global loss of biodiversity has placed new urgency on the need to understand factors regulating species response to rapid environmental change. While specialists are often less resilient to rapid environmental change than generalists, species‐level analyses may obscure the extent of specialization when locally adapted populations vary in climate tolerances. Until recently, quantification of the degree of climate specialization in migratory birds below the species level was hindered by a lack of genomic and tracking information, but recent technological advances have helped to overcome these barriers. Here we take a genome‐wide genetic approach to mapping population‐specific migratory routes and quantifying niche breadth within genetically distinct populations of a migratory bird, the willow flycatcher (Empidonax traillii), which exhibits variation in the severity of population declines across its breeding range. While our sample size is restricted to the number of genetically distinct populations within the species, our results support the idea that locally adapted populations of the willow flycatcher with narrow climatic niches across seasons are already federally listed as endangered or in steep decline, while populations with broader climatic niches have remained stable in recent decades. Overall, this work highlights the value of quantifying niche breadth within genetically distinct groups across time and space when attempting to understand the factors that facilitate or constrain the response of locally adapted populations to rapid environmental change. -
Abstract Marine species with pelagic larvae typically exhibit little population structure, suggesting long‐distance dispersal and high gene flow. Directly quantifying dispersal of marine fishes is challenging but important, particularly for the design of marine protected areas (MPAs). Here, we studied kelp rockfish (Sebastes atrovirens) sampled along ~25 km of coastline in a boundary current‐dominated ecosystem and used genetic parentage analysis to identify dispersal events and characterize them, because the distance between sedentary parents and their settled offspring is the lifetime dispersal distance. Large sample sizes and intensive sampling are critical for increasing the likelihood of detecting parent–offspring matches in such systems and we sampled more than 6,000 kelp rockfish and analysed them with a powerful set of 96 microhaplotype markers. We identified eight parent–offspring pairs with high confidence, including two juvenile fish that were born inside MPAs and dispersed to areas outside MPAs, and four fish born in MPAs that dispersed to nearby MPAs. Additionally, we identified 25 full‐sibling pairs, which occurred throughout the sampling area and included all possible combinations of inferred dispersal trajectories. Intriguingly, these included two pairs of young‐of‐the‐year siblings with one member each sampled in consecutive years. These sibling pairs suggest monogamy, either intentional or accidental, which has not been previously demonstrated in rockfishes. This study provides the first direct observation of larval dispersal events in a current‐dominated ecosystem and direct evidence that larvae produced within MPAs are exported both to neighbouring MPAs and to proximate areas where harvest is allowed. -
Abstract The accelerating rate at which DNA sequence data are now generated by high‐throughput sequencing instruments provides both opportunities and challenges for population genetic and ecological investigations of animals and plants. We show here how the common practice of calling genotypes from a single SNP per sequenced region ignores substantial additional information in the phased short‐read sequences that are provided by these sequencing instruments. We target sequenced regions with multiple SNPs in kelp rockfish (Sebastes atrovirens) to determine “microhaplotypes” and then call these microhaplotypes as alleles at each locus. We then demonstrate how these multi‐allelic marker data from such loci dramatically increase power for relationship inference. The microhaplotype approach decreases false‐positive rates by several orders of magnitude, relative to calling bi‐allelic SNPs, for two challenging analytical procedures, full‐sibling and single parent–offspring pair identification. We also show how the identification of half‐sibling pairs requires so much data that physical linkage becomes a consideration, and that most published studies that attempt to do so are dramatically underpowered. The advent of phased short‐read DNA sequence data, in conjunction with emerging analytical tools for their analysis, promises to improve efficiency by reducing the number of loci necessary for a particular level of statistical confidence, thereby lowering the cost of data collection and reducing the degree of physical linkage amongst markers used for relationship estimation. Such advances will facilitate collaborative research and management for migratory and other widespread species. -
Abstract New computational methods and next‐generation sequencing (NGS) approaches have enabled the use of thousands or hundreds of thousands of genetic markers to address previously intractable questions. The methods and massive marker sets present both new data analysis challenges and opportunities to visualize, understand, and apply population and conservation genomic data in novel ways. The large scale and complexity of NGS data also increase the expertise and effort required to thoroughly and thoughtfully analyze and interpret data. To aid in this endeavor, a recent workshop entitled “Population Genomic Data Analysis,” also known as “ConGen 2017,” was held at the University of Montana. The ConGen workshop brought together 15 instructors with knowledge in a wide range of topics including NGS data filtering, genome assembly, genomic monitoring of effective population size, migration modeling, detecting adaptive genomic variation, genomewide association analysis, inbreeding depression, and landscape genomics. Here, we summarize the major themes of the workshop and the important take‐home points that were offered to students throughout. We emphasize increasing participation by women in population and conservation genomics as a vital step for the advancement of science. Some important themes that emerged during the workshop included the need for data visualization and its importance in finding problematic data, the effects of data filtering choices on downstream population genomic analyses, the increasing availability of whole‐genome sequencing, and the new challenges it presents. Our goal here is to help motivate and educate a worldwide audience to improve population genomic data analysis and interpretation, and thereby advance the contribution of genomics to molecular ecology, evolutionary biology, and especially to the conservation of biodiversity.