Title: Recent Climate Change and Historical Population Structure Predict Spatial Patterns of Admixture Between Two Host‐Specialised Pine Sawfly Species
ABSTRACT Human disturbance can have profound effects on biodiversity, including increasing hybridization between reproductively isolated species. One approach for understanding how human activity affects hybridization dynamics is to evaluate correlations between disturbance (e.g., urbanisation, temperature change) and hybridization. Because variation in hybridization can also arise from historical factors unrelated to recent human disturbance, it is essential to account for population structure to avoid spurious correlations. Here, we combine environmental and high‐coverage whole‐genome resequencing data to investigate how human disturbance and population structure affect hybridization dynamics between a pair of pine sawflies adapted to different pines, Neodiprion lecontei and Neodiprion pinetum. We find that N. lecontei and N. pinetum exhibit strikingly different patterns of population structure, which we hypothesise stem from differences in host use. We also find that recent admixture is both asymmetric and geographically variable. Linear regression analyses reveal that admixture proportion is predicted by indirect human disturbance (i.e., climate change) and not direct human disturbance (e.g., urbanisation) in both N. lecontei and N. pinetum. Lastly, in N. pinetum, we find evidence of a spurious association between admixture and direct human disturbance that disappears when regression models account for population structure via inclusion of genetic principal component scores as covariates. Together, our data suggest that indirect human disturbance and population structure both contribute to geographic variation in admixture between N. lecontei and N. pinetum. Our study also highlights the importance of adequately controlling for population structure when attempting to identify environmental predictors (human disturbance‐related or not) of hybridization.
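The abstract's point about spurious associations can be illustrated with a small simulation. The sketch below is not the authors' analysis: the data are synthetic, and the names `pc1`, `urban`, `admix` and the helper `ols` are hypothetical. It assumes a case where population structure (one genetic PC) drives both an urbanisation-like predictor and admixture proportion, and compares a regression without the PC covariate to one that includes it.

```python
import random

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.
    X is a list of predictor rows; an intercept column is prepended here.
    Returns [intercept, slope_1, slope_2, ...]."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        piv = max(range(col, k), key=lambda i: abs(A[i][col]))
        A[col], A[piv] = A[piv], A[col]
        for i in range(k):
            if i != col:
                f = A[i][col] / A[col][col]
                A[i] = [a - f * b for a, b in zip(A[i], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Synthetic data (assumption): structure drives both variables, and the
# urbanisation-like predictor has NO direct effect on admixture.
rng = random.Random(1)
n = 1000
pc1 = [rng.gauss(0, 1) for _ in range(n)]             # genetic PC score
urban = [p + rng.gauss(0, 0.5) for p in pc1]          # correlated with structure
admix = [0.05 * p + rng.gauss(0, 0.02) for p in pc1]  # depends on structure only

naive = ols([[u] for u in urban], admix)                     # admix ~ urban
adjusted = ols([[u, p] for u, p in zip(urban, pc1)], admix)  # admix ~ urban + pc1

print("urban slope, no covariate:  ", round(naive[1], 4))
print("urban slope, with PC1 added:", round(adjusted[1], 4))
```

Under these assumptions the slope on `urban` is clearly positive in the first model and shrinks toward zero once `pc1` is included, which is the pattern the abstract describes for direct human disturbance in N. pinetum.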
Award ID(s):
2348574 1750946
PAR ID:
10668763
Author(s) / Creator(s):
 ;  
Publisher / Repository:
Wiley
Date Published:
Journal Name:
Molecular Ecology
Volume:
34
Issue:
24
ISSN:
0962-1083
Page Range / eLocation ID:
e70183
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. ABSTRACT. Urbanisation has led to increasing homogenization of plant communities across cities. However, it is unclear whether these patterns extend to cosmopolitan plant species at the genetic level. We examined genome‐wide genetic patterns in six widespread plant species (three Poaceae and three Asteraceae) across five cities in the USA (Boston, Baltimore, Minneapolis‐St. Paul, Phoenix, and Los Angeles) using reduced‐representation sequencing. We assessed genetic structure, differentiation, and patterns of isolation by distance (IBD) and environment (IBE) to determine if species were genetically homogeneous or differentiated by city, percentage of impervious surface, or both. Most species exhibited limited population structure overall, with Poa annua (annual bluegrass), Taraxacum officinale (dandelion), and Cynodon dactylon (Bermuda grass) showing no significant genetic differentiation among cities, a pattern consistent with high gene flow mediated by human activity. Notable exceptions included city‐level differences in Erigeron canadensis (horseweed) and Lactuca serriola (prickly lettuce), especially in Phoenix. We also observed low genetic diversity in Digitaria sanguinalis (crabgrass) from Phoenix, suggesting recent founder effects or selection via environmental filtering. Erigeron canadensis, the only native species studied, displayed stronger differentiation by city, along with significant isolation by temperature and distance. Among all species, we found no evidence for population structure by impervious surface. Our findings indicate that widespread population genetic structure patterns of cosmopolitan plants are likely to depend more on species attributes (e.g., self‐compatibility) and human‐mediated dispersal than on urbanisation per se.
  2. Abstract: Human disturbance can have profound effects on biodiversity, including increasing hybridization between reproductively isolated species. One approach for understanding how human activity affects hybridization dynamics is to evaluate correlations between disturbance (e.g., urbanization, temperature change) and hybridization. Because variation in hybridization can also arise from historical factors unrelated to recent human disturbance, it is essential to account for population structure to avoid spurious correlations. Here, we combine environmental and high-coverage whole-genome resequencing data to investigate how human disturbance and population structure affect hybridization dynamics between a pair of pine sawflies adapted to different pines, Neodiprion lecontei and Neodiprion pinetum. We find that N. lecontei and N. pinetum exhibit strikingly different patterns of population structure, which we hypothesize stem from differences in host. We also find that recent admixture is both asymmetric and geographically variable. Linear regression analyses reveal that admixture proportion is predicted by indirect human disturbance (i.e., climate change) and not direct human disturbance (e.g., urbanization) in both N. lecontei and N. pinetum. Lastly, in N. pinetum, we find evidence of a spurious association between admixture and direct human disturbance that disappears when regression models account for population structure via inclusion of genetic principal component scores as covariates. Together, our data suggest that indirect human disturbance and population structure both contribute to geographic variation in admixture between N. lecontei and N. pinetum. Our study also highlights the importance of adequately controlling for population structure when attempting to identify environmental predictors (human disturbance-related or not) of hybridization.

     Technical info:

     # Recent climate change and historical population structure predict spatial patterns of admixture between two host-specialized pine sawfly species

     Dataset DOI: [10.5061/dryad.kh18932jw](10.5061/dryad.kh18932jw)

     ## Description of the data and file structure

     To perform all analyses, run the scripts provided in the order indicated in the folder/script name (i.e., "01*" to "07*"). NOTE: all bash scripts (*.sh) will have to be modified to run on your cluster. However, the program commands should work as long as you are using the same version of the program (program versions are indicated in the Materials and Methods section of the manuscript).

     ### Files and variables

     #### File: Glover_Linnen_dryad.zip

     **Description:** Input files required to run analyses are provided:

     1. "allsites_LecPineHetr_filtered_ExclIndvs_minQ20_AllImbHom_refilter_bcftools_norm.vcf.gz" - required for ABBA-BABA analysis ("06_ABBA-BABA" folder); vcf file contains variant and invariant sites for all *N. lecontei*, *N. pinetum*, and *N. hetricki* individuals included in analysis
     2. "snps_LecPine_filtered_ExclIndvs_MAF05_minQ20_AllImbHom_refilter_pruned_bcftools_norm.vcf.gz" - required for ADMIXTURE ("03_ADMIXTURE" folder) and principal component analysis (PCA) for both species combined ("04_run_pcas.sh" script) analyses; vcf file contains linkage-pruned SNPs for all *N. lecontei* and *N. pinetum* individuals included in analyses (i.e., contaminated/putative haploid male individuals removed)
     3. "snps_LecOnly_filtered_ExclIndvs_MAF05_minQ20_AllImbHom_refilter_pruned_bcftools_norm.vcf.gz" - required for PCA for *N. lecontei* only analysis ("04_run_pcas.sh" script); vcf file contains linkage-pruned SNPs for all *N. lecontei* individuals included in analysis (i.e., contaminated/putative haploid male individuals removed)
     4. "snps_PineOnly_noHybrid_filtered_ExclIndvs_MAF05_minQ20_AllImbHom_refilter_pruned_bcftools_norm.vcf.gz" - required for PCA for *N. pinetum* only analysis ("04_run_pcas.sh" script); vcf file contains linkage-pruned SNPs for all *N. pinetum* individuals included in analysis (i.e., contaminated/putative haploid male individuals and F1 hybrid called *N. pinetum* at time of collection due to being collected on white pine removed)
     5. "fulldata_LecPine.csv", "fulldata_LecOnly.csv", "fulldata_PineOnly.csv" - full/compiled data for all *N. lecontei* and *N. pinetum* individuals together, all *N. lecontei* individuals, and all *N. pinetum* individuals included in downstream analyses, respectively. These files are required to run the "05_plot_PCAs_ADMIXTURE.R" and "07_run_models.R" scripts.

     Description of columns in the *.csv files (#5) above (each row represents an individual sawfly):

     1. "ind" = unique individual identifier assigned by the sequencing center (Admera Health)
     2. "genPC1_*" = genetic PC1 score from PCA of both species combined ("genPC1_LecPine" in "fulldata_LecPine.csv" file), *N. lecontei* only ("genPC1_LecOnly" in "fulldata_LecOnly.csv" file), or *N. pinetum* only ("genPC1_PineOnly" in "fulldata_PineOnly.csv" file)
     3. "genPC2_*" = genetic PC2 score from PCA of both species combined ("genPC2_LecPine" in "fulldata_LecPine.csv" file), *N. lecontei* only ("genPC2_LecOnly" in "fulldata_LecOnly.csv" file), or *N. pinetum* only ("genPC2_PineOnly" in "fulldata_PineOnly.csv" file)
     4. "genPC3_*" = genetic PC3 score from PCA of both species combined ("genPC3_LecPine" in "fulldata_LecPine.csv" file), *N. lecontei* only ("genPC3_LecOnly" in "fulldata_LecOnly.csv" file), or *N. pinetum* only ("genPC3_PineOnly" in "fulldata_PineOnly.csv" file)
     5. "AG_ID" = unique individual identifier assigned by the Linnen lab
     6. "avg_perc_PP_reads_mapped" = percentage of properly paired reads mapped; averaged between the two sequencing batches
     7. "MERGED_FINAL_read_count" = total read count across two sequencing batches
     8. "species" = species identity
     9. "city" = city individual was collected in
     10. "state" = state individual was collected in
     11. "area" = geographic area ("North" or "Central") that individual was assigned to based on ADMIXTURE ("03_ADMIXTURE" folder) and PCA ("04_run_pcas.sh" script) analyses
     12. "latitude" = latitude individual was collected at
     13. "longitude" = longitude individual was collected at
     14. "host" = pine host individual was collected on
     15. "stage" = life stage of individual at time of DNA extraction, either "larva" or adult female ("adultF")
     16. "lcPC1" = PC1 score from PCA of land cover proportions (from analyses in "01_LandCover_data" folder)
     17. "lcPC2" = PC2 score from PCA of land cover proportions (from analyses in "01_LandCover_data" folder)
     18. "change_tmax" = change in average daily maximum temperature from the 1980's to the 2010's in degrees Celsius (from analyses in "02_temperature_data" folder)
     19. "site_number" = arbitrary number assigned to sampling site; each unique sampling site (i.e., unique latitude/longitude GPS coordinates) was assigned an arbitrary number (1-189)
     20. "grouped_site_number_5km" = grouped site number, where all sampling sites that were within 5km of each other were considered a “grouped site” and were assigned an arbitrary grouped site number
     21. "allele_imbalance" = percentage of heterozygote calls displaying allele balance ratios less than 0.3
     22. "avg_depth_coverage" = average depth coverage across all linkage-pruned SNPs
     23. "missingness" = proportion of sites with a missing genotype
     24. "heterozygosity" = proportion of sites with a heterozygous genotype
     25. "pop" = population label for making ADMIXTURE plots ("05_plot_PCAs_ADMIXTURE.R" script); [species]_[sampling state]
     26. "admix_prop_k2" = admixture proportion for K=2 (i.e., species-level admixture)
     27. "lec_blue_k2" = final assignment probability for "blue" genetic group (color assigned by CLUMPAK) that corresponds to *N. lecontei* ancestry across all 100 ADMIXTURE runs for K=2
     28. "pine_orange_k2" = final assignment probability for "orange" genetic group (color assigned by CLUMPAK) that corresponds to *N. pinetum* ancestry across all 100 ADMIXTURE runs for K=2
     29. "lec_blue_k3" = final assignment probability for "blue" genetic group (color assigned by CLUMPAK) that corresponds to *N. lecontei* ancestry across all 100 ADMIXTURE runs for K=3
     30. "pine_orange_k3" = final assignment probability for "orange" genetic group (color assigned by CLUMPAK) that corresponds to *N. pinetum* ancestry across all 100 ADMIXTURE runs for K=3
     31. "lec_purple_k3" = final assignment probability for "purple" genetic group (color assigned by CLUMPAK) that corresponds to *N. lecontei* ancestry across all 100 ADMIXTURE runs for K=3
     32. "lec_blue_k4" = final assignment probability for "blue" genetic group (color assigned by CLUMPAK) that corresponds to *N. lecontei* ancestry across all 100 ADMIXTURE runs for K=4
     33. "pine_orange_k4" = final assignment probability for "orange" genetic group (color assigned by CLUMPAK) that corresponds to *N. pinetum* ancestry across all 100 ADMIXTURE runs for K=4
     34. "lec_purple_k4" = final assignment probability for "purple" genetic group (color assigned by CLUMPAK) that corresponds to *N. lecontei* ancestry across all 100 ADMIXTURE runs for K=4
     35. "lec_green_k4" = final assignment probability for "green" genetic group (color assigned by CLUMPAK) that corresponds to *N. lecontei* ancestry across all 100 ADMIXTURE runs for K=4

     Perform analyses:

     The first folder ("01_LandCover_data") contains the R script to calculate land cover proportions within a 5km-radius buffer around each site's GPS coordinates ("get_LandCover_props.R"). The required input land cover raster files to run this script are available in the public domain at ([https://www.mrlc.gov/viewer/](https://www.mrlc.gov/viewer/)) for the United States and ([http://www.cec.org/north-american-environmental-atlas/land-cover-30m-2020/](http://www.cec.org/north-american-environmental-atlas/land-cover-30m-2020/)) for Canada. This folder also contains the input files ("for_lcPCA.csv", "latlong.csv") and script ("LandCover_PCA.R") necessary to complete the PCA of land cover proportions for each of the 189 sampling sites. Specifically, the "for_lcPCA.csv" file contains, for each sampling site, the proportions of each land cover class within a 5km radius buffer around the GPS coordinates of the sampling site; the "latlong.csv" file contains the GPS coordinates (latitude and longitude) of each sampling site.

     The second folder ("02_temperature_data") contains the input file ("LatLong.csv") and script ("get_tmax.R") necessary to extract the average of daily maximum temperature for each month within each year (1980-1989 and 2010-2019) for the latitude/longitude that each individual sawfly was collected at. The required input temperature raster files to run this script are available in the public domain at ([https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=2131](https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=2131)) for the United States and Canada.

     The third folder ("03_ADMIXTURE") contains the script ("run_admixture.sh") to perform 100 ADMIXTURE runs for each K (1-10). The output file from CLUMPAK for K=2 is also provided ("ClumppIndFile.output"). In this text file, each row represents an individual; the proportions of the individual's ancestry originating from each of the two inferred ancestral populations are provided (last two columns), from which admixture proportions can be extracted.

     The script to perform the next analysis (PCA) is provided ("04_run_pcas.sh"). Next, the script to plot the results from the ADMIXTURE and PCA analyses above is provided ("05_plot_PCAs_ADMIXTURE.R").

     The next folder ("06_ABBA-BABA") contains the scripts (".sh") and input files (".txt" and ".nwk") to perform the ABBA-BABA analysis for the "North" and "Central" geographic regions separately. For each region, the "*.txt" file indicates which of the four taxa each individual belongs to (i.e., P~1~, P~2~, P~3~, or Outgroup) - individuals that are in the vcf file but excluded from the analysis are indicated with an "xxx". The ".nwk" file is a text file that represents the phylogenetic tree of the four taxa above in the Newick format (i.e., uses parentheses and commas to show the relationships between taxa).

     Finally, the script to perform the multiple linear regression models and plot the model results is provided ("07_run_models.R").

     ## Code/software

     All scripts to perform analyses are described above. All software and R package versions are described in the Materials and Methods section of the manuscript.
  3. ABSTRACT Hybridization and interspecific gene flow play a substantial role in the evolution of plant taxa. The eastern North American white oak syngameon, a group of approximately 15 ecologically, morphologically and genomically distinguishable species, has long been recognised as a model system for studying introgressive hybridization in temperate trees. However, the prevalence, genomic context and environmental correlates of introgression in this system remain largely unknown. To assess introgression in the eastern North American white oak syngameon and population structure within the widespread Quercus macrocarpa, we conducted a rangewide survey of Q. macrocarpa and four sympatric eastern North American white oak species. Using a Hyb‐Seq approach, we assembled a dataset of 3412 thinned single‐nucleotide polymorphisms (SNPs) in 445 enriched target loci including 62 genes putatively associated with various ecological functions, as well as associated intronic regions and some off‐target intergenic regions (not associated with the exons). Admixture analysis and hybrid class inference demonstrated species coherence despite hybridization and introgressive gene flow (due to backcrossing of F1s to one or both parents). Additionally, we recovered a genetic structure within Q. macrocarpa associated with latitude. Generalised linear mixed models (GLMMs) indicate that proximity to range edge predicts interspecific admixture, but rates of genetic differentiation do not appear to vary between putative functional gene classes. Our study suggests that gene flow between eastern North American white oak species may not be as rampant as previously assumed and that hybridization is most strongly predicted by proximity to a species' range margin.
  4. ABSTRACT Objectives: Most human brains exhibit left hemisphere asymmetry for planum temporale (PT) surface area and gray matter volume, which is interpreted as cerebral lateralization for language. Once considered a uniquely human feature, PT asymmetries have now been documented in chimpanzees and olive baboons. The goal of the current study was to further investigate the evolution of PT asymmetries in nonhuman primates. Materials and Methods: We measured PT surface area in chimpanzees (Pan troglodytes, n = 90), bonobos (Pan paniscus, n = 21), gorillas (Gorilla gorilla, n = 34), orangutans (Pongo spp., n = 33), olive baboons (Papio anubis, n = 105), rhesus macaques (Macaca mulatta, n = 144), and tufted capuchins (Sapajus apella, n = 29) from magnetic resonance imaging scans. Results: Our findings reveal significant leftward biases in PT surface area among chimpanzees, gorillas, olive baboons, rhesus macaques, and capuchins. We did not find significant population‐level asymmetries among orangutans and bonobos, which could be due, in part, to small sample sizes. We also detected significant age effects for rhesus macaques only, and no significant sex effects for any species. Discussion: The observation of a population‐level leftward bias for PT surface area among not only hominids (chimpanzees and gorillas), but also two cercopithecoids (olive baboons and rhesus macaques) and one platyrrhine (tufted capuchins) suggests that PT lateralization was likely present in some early anthropoid primate ancestors and relatives. This provides further evidence that human brains have since undergone changes to the size and connectivity of the PT in response to selection for the cognitive processes needed to support the evolution of language and speech.
  5. ABSTRACT Coalescent modelling of hybrid zones can provide novel insights into the historical demography of populations, including divergence times, population sizes, introgression proportions, migration rates and the timing of hybrid zone formation. We used coalescent analysis to determine whether the hybrid zone between phylogeographic lineages of the Plateau Fence Lizard (Sceloporus tristichus) in Arizona formed recently due to human‐induced landscape changes, or if it originated during Pleistocene climatic shifts. Given the presence of mitochondrial DNA from another species in the hybrid zone (Southwestern Fence Lizard, S. cowlesi), we tested for the presence of S. cowlesi nuclear DNA in the hybrid zone as well as reassessed the species boundary between S. tristichus and S. cowlesi. No evidence of S. cowlesi nuclear DNA is found in the hybrid zone, and the paraphyly of both species raises concerns about their taxonomic validity. Introgression analysis placed the divergence time between the parental hybrid zone populations at approximately 140 kya and their secondary contact and hybridization at approximately 11 kya at the end of the Pleistocene. Introgression proportions estimated for hybrid populations are correlated with their geographic distance from parental populations. The multispecies coalescent with migration provided significant support for unidirectional migration moving from south to north, which is consistent with spatial cline analyses that suggest a slow but steady northward shift of the centre of the hybrid zone over the last two decades. When analysing hybrid populations sampled along a linear transect, coalescent methods can provide novel insights into hybrid zone dynamics.
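The dataset record in item 2 refers to an ABBA-BABA analysis over four taxa (P1, P2, P3, Outgroup). As a rough illustration of the statistic behind that kind of test, the sketch below computes Patterson's D from site-pattern counts, assuming the usual topology (((P1,P2),P3),Outgroup) and one haploid allele per taxon per site. The function name `d_statistic` and the toy `sites` are hypothetical; the deposited scripts work from a vcf file and are not reproduced here.

```python
def d_statistic(sites):
    """Patterson's D = (ABBA - BABA) / (ABBA + BABA).

    Each site is a tuple (p1, p2, p3, outgroup) of allele codes, one haploid
    allele per taxon. The outgroup allele is treated as ancestral ("A") and
    any other allele as derived ("B"). Returns (D, n_abba, n_baba).
    """
    abba = baba = 0
    for p1, p2, p3, out in sites:
        if p1 == out and p2 == p3 and p2 != out:
            abba += 1  # P2 and P3 share the derived allele
        elif p2 == out and p1 == p3 and p1 != out:
            baba += 1  # P1 and P3 share the derived allele
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba), abba, baba

# Toy input (assumption): six sites, alleles given as bases.
sites = [
    ("A", "G", "G", "A"),  # ABBA
    ("A", "G", "G", "A"),  # ABBA
    ("A", "G", "G", "A"),  # ABBA
    ("G", "A", "G", "A"),  # BABA
    ("A", "A", "G", "A"),  # derived only in P3: uninformative
    ("G", "G", "A", "A"),  # BBAA: uninformative
]
d, n_abba, n_baba = d_statistic(sites)
print(d, n_abba, n_baba)
```

A positive D indicates an excess of derived alleles shared between P2 and P3, and a negative D an excess shared between P1 and P3. Analyses on real data typically use population allele frequencies and a resampling scheme such as a block jackknife to judge significance, which this sketch omits.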