Abstract A prominent challenge for managing migratory species is the development of conservation plans that accommodate spatiotemporally varying distributions throughout the year. Migratory networks are spatially‐explicit models that incorporate migratory assignment and seasonal abundance data to define patterns of connectivity between stages of the annual cycle. These models are particularly useful for widespread application because different types of migratory data can be used to quantify individual and population‐level movement across the annual cycle of migratory species. While there are clear benefits of combining migratory assignment and abundance data for the development of conservation strategies, there is a concurrent need for corresponding user‐friendly software to facilitate the integration of these data for conservation. Here, we present mignette (migratory network tools ensemble), an R package for developing migratory network models to estimate network connectivity among migratory populations. We demonstrate the functionality of mignette with three empirical examples that highlight the use of different types of tracking data for migratory assignment. mignette facilitates the modelling of migratory networks by providing R functions to: (1) define breeding and nonbreeding nodes, (2) assemble abundance and assignment data and (3) model the migratory network. Additionally, mignette provides R functions to visualize modelled migratory networks. With increasing availability of migratory assignment and abundance data, mignette represents a valuable tool for developing effective conservation strategies for migratory species.
Population assignment from genotype likelihoods for low‐coverage whole‐genome sequencing data
Abstract Low‐coverage whole‐genome sequencing (WGS) is increasingly used for the study of evolution and ecology in both model and non‐model organisms; however, effective application of low‐coverage WGS data requires the implementation of probabilistic frameworks to account for the uncertainties in genotype likelihoods. Here, we present a probabilistic framework for using genotype likelihoods for standard population assignment applications. Additionally, we derive the Fisher information for allele frequency from genotype likelihoods and use that to describe a novel metric, the effective sample size, which figures heavily in assignment accuracy. We make these developments available for application through WGSassign, an open‐source software package that is computationally efficient for working with whole‐genome data. Using simulated and empirical data sets, we demonstrate the behaviour of our assignment method across a range of population structures, sample sizes and read depths. Through these results, we show that WGSassign can provide highly accurate assignment, even for samples with low average read depths (<0.01X) and among weakly differentiated populations. Our simulation results highlight the importance of equalizing the effective sample sizes among source populations in order to achieve accurate population assignment with low‐coverage WGS data. We further provide study design recommendations for population assignment studies and discuss the broad utility of effective sample size for studies using low‐coverage WGS data.
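To make the idea of assignment from genotype likelihoods concrete, the following is a minimal sketch and not the WGSassign implementation. It assumes biallelic loci, Hardy-Weinberg proportions within each source population, independent loci, and known source allele frequencies strictly between 0 and 1. The function names, population labels and numbers are all hypothetical.

```python
import math


def hwe_genotype_probs(p):
    """Probabilities of carrying 0, 1 or 2 copies of an allele with
    frequency p, assuming Hardy-Weinberg proportions (an assumption)."""
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)


def log_likelihood(genotype_likelihoods, allele_freqs):
    """Log-likelihood that an individual comes from one population.

    genotype_likelihoods: one (GL0, GL1, GL2) tuple per locus.
    allele_freqs: that population's allele frequency per locus, in (0, 1).
    Each locus contributes the genotype likelihoods averaged over the
    genotype probabilities; loci are treated as independent.
    """
    total = 0.0
    for gl, p in zip(genotype_likelihoods, allele_freqs):
        probs = hwe_genotype_probs(p)
        total += math.log(sum(l * pr for l, pr in zip(gl, probs)))
    return total


def assign(genotype_likelihoods, pop_freqs):
    """Return the best-scoring population and all log-likelihoods."""
    scores = {
        pop: log_likelihood(genotype_likelihoods, freqs)
        for pop, freqs in pop_freqs.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical allele frequencies for two source populations at three loci.
pop_freqs = {
    "north": [0.9, 0.8, 0.1],
    "south": [0.2, 0.3, 0.7],
}
# Hypothetical genotype likelihoods for one low-coverage individual.
individual = [(0.05, 0.25, 0.70), (0.10, 0.30, 0.60), (0.60, 0.30, 0.10)]

best, scores = assign(individual, pop_freqs)
print(best, scores)
```

Because the score sums over all three genotypes rather than committing to a called genotype, uncertainty from low read depth is carried through to the assignment.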
- Award ID(s): 1942313
- PAR ID: 10510114
- Publisher / Repository: Methods in Ecology and Evolution
- Date Published:
- Journal Name: Methods in Ecology and Evolution
- Volume: 15
- Issue: 3
- ISSN: 2041-210X
- Page Range / eLocation ID: 493 to 510
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Stamatakis, Alexandros (Ed.) Abstract Motivation: Comparative genome analysis of two or more whole-genome sequenced (WGS) samples is at the core of most applications in genomics. These include the discovery of genomic differences segregating in populations, case-control analysis in common diseases and diagnosing rare disorders. With the current progress of accurate long-read sequencing technologies (e.g. circular consensus sequencing from PacBio sequencers), we can dive into studying repeat regions of the genome (e.g. segmental duplications) and hard-to-detect variants (e.g. complex structural variants). Results: We propose a novel framework for comparative genome analysis through the discovery of strings that are specific to one genome (‘sample-specific’ strings). We have developed a novel, accurate and efficient computational method for the discovery of sample-specific strings between two groups of WGS samples. The proposed approach will give us the ability to perform comparative genome analysis without the need to map the reads and is not hindered by shortcomings of the reference genome and mapping algorithms. We show that the proposed approach is capable of accurately finding sample-specific strings representing nearly all variation (>98%) reported across pairs or trios of WGS samples using accurate long reads (e.g. PacBio HiFi data). Availability and implementation: Data, code and instructions for reproducing the results presented in this manuscript are publicly available at https://github.com/Parsoa/PingPong. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Abstract Objective: Whole genome sequencing (WGS) can help identify transmission of pathogens causing healthcare-associated infections (HAIs). However, the current gold standard of short-read, Illumina-based WGS is labor and time intensive. Given recent improvements in long-read Oxford Nanopore Technologies (ONT) sequencing, we sought to establish a low resource approach providing accurate WGS-pathogen comparison within a time frame allowing for infection prevention and control (IPC) interventions. Methods: WGS was prospectively performed on pathogens at increased risk of potential healthcare transmission using the ONT MinION sequencer with R10.4.1 flow cells and Dorado basecaller. Potential transmission was assessed via Ridom SeqSphere+ for core genome multilocus sequence typing and MINTyper for reference-based core genome single nucleotide polymorphisms using previously published cutoff values. The accuracy of our ONT pipeline was determined relative to Illumina. Results: Over a six-month period, 242 bacterial isolates from 216 patients were sequenced by a single operator. Compared to the Illumina gold standard, our ONT pipeline achieved a mean identity score of Q60 for assembled genomes, even with a coverage rate as low as 40×. The mean time from initiating DNA extraction to complete analysis was 2 days (IQR 2–3.25 days). We identified five potential transmission clusters comprising 21 isolates (8.7% of sequenced strains). Integrating ONT with epidemiological data, >70% (15/21) of putative transmission cluster isolates originated from patients with potential healthcare transmission links. Conclusions: Via a stand-alone ONT pipeline, we detected potentially transmitted HAI pathogens rapidly and accurately, aligning closely with epidemiological data. Our low-resource method has the potential to assist in IPC efforts.
Abstract Global change is impacting biodiversity across all habitats on earth. New selection pressures from changing climatic conditions and other anthropogenic activities are creating heterogeneous ecological and evolutionary responses across many species' geographic ranges. Yet we currently lack standardised and reproducible tools to effectively predict the resulting patterns in species vulnerability to declines or range changes. We developed an informatic toolbox that integrates ecological, environmental and genomic data and analyses (environmental dissimilarity, species distribution models, landscape connectivity, neutral and adaptive genetic diversity, genotype‐environment associations and genomic offset) to estimate population vulnerability. In our toolbox, functions and data structures are coded in a standardised way so that it is applicable to any species or geographic region where appropriate data are available, for example individual or population sampling and genomic datasets (e.g. RAD‐seq, ddRAD‐seq, whole genome sequencing data) representing environmental variation across the species geographic range. To demonstrate multi‐species applicability, we apply our toolbox to three georeferenced genomic datasets for co‐occurring East African spiny reed frogs (Afrixalus fornasini, A. delicatus and A. sylvaticus) to predict their population vulnerability, as well as demonstrating that range loss projections based on adaptive variation can be accurately reproduced from a previous study using data for two European bat species (Myotis escalerai and M. crypticus). Our framework sets the stage for large scale, multi‐species genomic datasets to be leveraged in a novel climate change vulnerability framework to quantify intraspecific differences in genetic diversity, local adaptation, range shifts and population vulnerability based on exposure, sensitivity and landscape barriers.
Abstract Understanding migratory connectivity, or the linkage of populations between seasons, is critical for effective conservation and management of migratory wildlife. A growing number of tools are available for understanding where migratory individuals and populations occur throughout the annual cycle. Integration of the diverse measures of migratory movements can help elucidate migratory connectivity patterns with methodology that accounts for differences in sampling design, directionality, effort, precision and bias inherent to each data type. The R package MigConnectivity was developed to estimate population‐specific connectivity and the range‐wide strength of those connections. New functions allow users to integrate intrinsic markers, tracking and long‐distance reencounter data, collected from the same or different individuals, to estimate population‐specific transition probabilities (estTransition) and the range‐wide strength of those transition probabilities (estStrength). We used simulation and real‐world case studies to explore the challenges and limitations of data integration based on data from three migratory bird species, Painted Bunting (Passerina ciris), Yellow Warbler (Setophaga petechia) and Bald Eagle (Haliaeetus leucocephalus), two of which had bidirectional data. We found data integration is useful for quantifying migratory connectivity, as single data sources are less likely to be available across the species range. Furthermore, accurate strength estimates can be obtained from either breeding‐to‐nonbreeding or nonbreeding‐to‐breeding data. For bidirectional data, integration can lead to more accurate estimates when data are available from all regions in at least one season. The ability to conduct combined analyses that account for the unique limitations and biases of each data type is a promising possibility for overcoming the challenge of range‐wide coverage that has been hard to achieve using single data types. The best‐case scenario for data integration is to have data from all regions, especially if the question is range‐wide or data are bidirectional. Multiple data types on animal movements are becoming increasingly available and integration of these growing datasets will lead to a better understanding of the full annual cycle of migratory animals.
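As a rough illustration of what a population‐specific transition probability is, the sketch below row‐normalizes counts of individuals moving from each origin population to each target region. This is a naive proportion estimator written for illustration only; it is not how estTransition works, and it ignores the differences in sampling design, effort, precision and bias that the abstract says the package accounts for. The function name, region labels and counts are hypothetical.

```python
def transition_probabilities(counts):
    """Convert counts[origin][target] into per-origin proportions.

    Each origin's row is divided by its total, so every row of the
    result sums to 1. Raises ValueError for an origin with no data.
    """
    psi = {}
    for origin, row in counts.items():
        total = sum(row.values())
        if total == 0:
            raise ValueError(f"no observations for origin {origin!r}")
        psi[origin] = {target: n / total for target, n in row.items()}
    return psi


# Hypothetical counts of tracked individuals by origin and target region.
counts = {
    "breeding_A": {"winter_1": 8, "winter_2": 2},
    "breeding_B": {"winter_1": 1, "winter_2": 9},
}

psi = transition_probabilities(counts)
print(psi)
```

An origin with no observed individuals has no defined row here, which is one simple way to see why single data sources struggle with range‐wide coverage.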

