<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Recent Climate Change and Historical Population Structure Predict Spatial Patterns of Admixture Between Two Host‐Specialised Pine Sawfly Species</title></titleStmt>
			<publicationStmt>
				<publisher>Wiley</publisher>
				<date>12/01/2025</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10668763</idno>
					<idno type="doi">10.1111/mec.70183</idno>
					<title level='j'>Molecular Ecology</title>
<idno>0962-1083</idno>
<biblScope unit="volume">34</biblScope>
<biblScope unit="issue">24</biblScope>					

					<author>Ashleigh N Glover</author><author>Catherine R Linnen</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<title>ABSTRACT</title> <p>Human disturbance can have profound effects on biodiversity, including increasing hybridization between reproductively isolated species. One approach for understanding how human activity affects hybridization dynamics is to evaluate correlations between disturbance (e.g., urbanisation, temperature change) and hybridization. Because variation in hybridization can also arise from historical factors unrelated to recent human disturbance, it is essential to account for population structure to avoid spurious correlations. Here, we combine environmental and high‐coverage whole‐genome resequencing data to investigate how human disturbance and population structure affect hybridization dynamics between a pair of pine sawflies adapted to different pines,<styled-content style='fixed-case'><italic>Neodiprion lecontei</italic></styled-content>and<styled-content style='fixed-case'><italic>Neodiprion pinetum</italic></styled-content>. We find that<styled-content style='fixed-case'><italic>N. lecontei</italic></styled-content>and<styled-content style='fixed-case'><italic>N. pinetum</italic></styled-content>exhibit strikingly different patterns of population structure, which we hypothesise stemfrom differences in host use. We also find that recent admixture is both asymmetric and geographically variable. Linear regression analyses reveal that admixture proportion is predicted by indirect human disturbance (i.e., climate change) and not direct human disturbance (e.g., urbanisation) in both<styled-content style='fixed-case'><italic>N. lecontei</italic></styled-content>and<styled-content style='fixed-case'><italic>N. pinetum</italic></styled-content>. Lastly, in<styled-content style='fixed-case'><italic>N. pinetum</italic></styled-content>, we find evidence of a spurious association between admixture and direct human disturbance that disappears when regression models account for population structure via inclusion of genetic principal component scores as covariates. Together, our data suggest that indirect human disturbance and population structure both contribute to geographic variation in admixture between<styled-content style='fixed-case'><italic>N. lecontei</italic></styled-content>and<styled-content style='fixed-case'><italic>N. pinetum</italic></styled-content>. Our study also highlights the importance of adequately controlling for population structure when attempting to identify environmental predictors (human disturbance‐related or not) of hybridization.</p>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>INTRODUCTION</head><p>Humans are rapidly modifying the planet, both directly (e.g., construction, cropland expansion, atmospheric light pollution; <ref type="bibr">Wilson et al. 2021;</ref><ref type="bibr">Hernandez and Suni 2022;</ref><ref type="bibr">Kreiner et al. 2022;</ref><ref type="bibr">Grabenstein and Taylor 2018)</ref> and indirectly (i.e., climate change; <ref type="bibr">Garroway et al. 2010;</ref><ref type="bibr">IPCC, 2007)</ref>. Both direct and indirect human disturbance can have profound eRects on genetic and phenotypic variation within and between species. For example, novel selection pressures via urbanization have resulted in phenotypic and genomic changes across taxa <ref type="bibr">(Benmazouz et al. 2023;</ref><ref type="bibr">Winchell et al. 2023;</ref><ref type="bibr">Johnson and Munshi-South 2017;</ref><ref type="bibr">Kreiner et al. 2022)</ref> while climate change has been linked to higher mortality in sheep and fishes <ref type="bibr">(Hallett et al. 2004;</ref><ref type="bibr">Finney et al. 2000)</ref> and variation in reproductive success in marine birds <ref type="bibr">(Croxall et al. 2002;</ref><ref type="bibr">Horreo et al. 2011)</ref>. Humanmediated landscape changes and climate change can also alter species' abundances and distributions and facilitate hybridization between reproductively isolated species <ref type="bibr">(Garroway et al. 2010;</ref><ref type="bibr">De Meester et al. 2018;</ref><ref type="bibr">Grabenstein and Taylor 2018;</ref><ref type="bibr">Grabenstein et al. 2023)</ref>. For species that naturally co-occur, human disturbance can promote gene exchange via three non-mutually exclusive mechanisms: (1) breaking down physical or ecological barriers to gene flow by altering habitat structure and/or phenology <ref type="bibr">(Grabenstein and Taylor 2018;</ref><ref type="bibr">Grabenstein et al. 2023</ref>); (2) disrupting assortative mating by interfering with visual, chemical, and acoustic mate-recognition cues <ref type="bibr">(Fisher et al. 2006;</ref><ref type="bibr">Grabenstein and Taylor 2018)</ref>; and (3) creating novel environments with relaxed selection against hybrids, increasing hybrid survival <ref type="bibr">(Seifert et al. 2010;</ref><ref type="bibr">Hasselman et al. 2014;</ref><ref type="bibr">Grabenstein et al. 2023)</ref>. Overall, human disturbance can promote gene exchange by reducing the strength of prezygotic and/or postzygotic isolation, although disturbances that reduce prezygotic reproductive isolating barriers are most likely to promote hybridization <ref type="bibr">(Grabenstein and Taylor 2018)</ref>. Importantly, any localized changes in selection pressures and/or expression of traits facilitating prezygotic isolation (e.g., via these mechanisms) can cause variation in hybridization across geographic space and time <ref type="bibr">(Hasselman et al. 2014;</ref><ref type="bibr">Sianta et al. 2024)</ref>.</p><p>Understanding hybridization dynamics between taxa is critical because hybridization, which occurs in an estimated 10% of animal species and 25% of plant species, can have important evolutionary consequences <ref type="bibr">(Walsh et al. 2023;</ref><ref type="bibr">Aguillon et al. 2022)</ref>. When introgression occurs via hybridization and backcrossing, the evolutionary trajectories of the hybridizing taxa can be altered drastically. For example, extensive admixture can result in the formation of a new hybrid species, thereby increasing biodiversity, or in the extinction of one of the parental species via genetic swamping, thereby decreasing biodiversity <ref type="bibr">(DeVos et al. 2023;</ref><ref type="bibr">Aguillon et al. 2022)</ref>. Introgression can also increase genetic diversity within the recipient species, potentially increasing its adaptive potential <ref type="bibr">(Hasselman et al. 2014;</ref><ref type="bibr">Norris et al. 2015;</ref><ref type="bibr">Oziolor et al. 2019)</ref>.</p><p>Despite the increased recognition that the environment can play a critical role in shaping geographical patterns of hybridization, the mechanisms responsible for generating variation in hybridization dynamics remain unknown in most hybridizing taxa. Additionally, our understanding of the consequences of such hybridization is largely incomplete <ref type="bibr">(Mandeville et al. 2022;</ref><ref type="bibr">Grabenstein et al. 2023;</ref><ref type="bibr">Sianta et al. 2024)</ref>. Characterizing patterns and establishing correlations between human disturbance and hybridization is a necessary first step for generating hypotheses to guide future work that explores the mechanisms that promote or prevent hybridization <ref type="bibr">(Grabenstein et al. 2023)</ref>. Relationships between indirect human disturbance (climate change) and hybridization have been recovered in a variety of taxa (e.g., <ref type="bibr">Garroway et al. 2010;</ref><ref type="bibr">Muhlfeld et al. 2014</ref><ref type="bibr">Muhlfeld et al. , 2017;;</ref><ref type="bibr">Sianta et al. 2024)</ref>, but relatively few studies have been explicitly designed to explore relationships between direct human disturbance and hybridization <ref type="bibr">(Grabenstein et al. 2023)</ref>. Even fewer studies test for relationships between hybridization and both direct and indirect human disturbance simultaneously (but see <ref type="bibr">Ortego et al. 2017;</ref><ref type="bibr">Muhlfeld et al. 2017)</ref>.</p><p>When describing geographical patterns of hybridization and introgression, it is critical to account for population structure. In addition to the environment, population structure within species can produce geographic variation in hybridization and introgression. For example, local adaptation and/or geographic variation in historical overlap/isolation can cause geographic variation in the strength of reproductive isolating barriers <ref type="bibr">(Borge et al. 2005;</ref><ref type="bibr">Poikela et al. 2019;</ref><ref type="bibr">Glover et al. 2023)</ref>, producing variable patterns of hybridization and introgression <ref type="bibr">(Walsh et al. 2016;</ref><ref type="bibr">Sianta et al. 2024;</ref><ref type="bibr">Faske et al. 2024)</ref>. Any recent changes in hybridization due to human disturbance are overlaid on these historical processes. Therefore, not accounting for population structure can produce spurious associations between measures of hybridization and human disturbance. The confounding eRect of population structure is well recognized in genome-wide association studies (GWAS; Young 2024) and genome-environment association (GEA) studies <ref type="bibr">(Forester et al. 2018)</ref>. A common approach to account for population structure in these studies is to quantify population structure using principal component analysis (PCA) and include genetic PC scores as covariates in subsequent models testing for associations between genotype and phenotype/environment <ref type="bibr">(Price et al. 2006;</ref><ref type="bibr">Forester et al. 2018;</ref><ref type="bibr">Priv&#233; et al. 2020)</ref>. Another approach used to account for population structure is to include a proxy for population structure (e.g., sampling site; kinship matrix) as a random eRect in models <ref type="bibr">(Liu et al. 2011;</ref><ref type="bibr">Reutimann et al. 2023;</ref><ref type="bibr">Fachrul et al. 2023)</ref>.</p><p>These approaches can be extended to studies testing for relationships between hybridization and human disturbance, although few of these existing studies account for population structure within species (but see <ref type="bibr">Ortego et al. 2017;</ref><ref type="bibr">Muhlfeld et al. 2017;</ref><ref type="bibr">Grabenstein et al. 2023;</ref><ref type="bibr">DeVos et al. 2023)</ref>. For example, to investigate the impact of human disturbance on hybridization between Californian Quercus oak species, <ref type="bibr">Ortego et al. (2017)</ref> included latitude and longitude as model covariates while <ref type="bibr">Grabenstein et al. (2023)</ref> included sampling site as a random eRect in their model to test for a correlation between hybridization and human disturbance in Poecile chickadees. Notably, the eRicacy of diRerent models (i.e., covariates versus random eRects) at preventing spurious associations between hybridization and human disturbance remains unexplored (but see <ref type="bibr">Ortego et al. 2017</ref>). Overall, investigating population structure provides important context for interpreting patterns of hybridization and introgression <ref type="bibr">(Sianta et al. 2024)</ref>.</p><p>Disentangling the relative roles of recent human disturbance and historical population structure in shaping patterns of hybridization and introgression will enhance our understanding of how taxa have and will continue to respond to human disturbance and shed light on how humans impact biodiversity. To fill this knowledge gap, we investigated the influence of human disturbance and population structure on hybridization dynamics between a sister-species pair of pine sawflies, Neodiprion lecontei and Neodiprion pinetum.</p><p>Like all Neodiprion (Order: Hymenoptera; Family: Diprionidae), N. lecontei and N. pinetum feed on conifers and are dependent upon their host plant for all stages of their life cycle <ref type="bibr">(Coppel and Benjamin 1965;</ref><ref type="bibr">Knerer and Atwood 1973;</ref><ref type="bibr">Davis et al. 2023)</ref>. These sister species have large, mostly overlapping ranges in eastern North America but are adapted to Pinus species with diRerent needle morphologies <ref type="bibr">(Linnen and</ref><ref type="bibr">Farrell 2008, 2010;</ref><ref type="bibr">Glover et al. 2023)</ref>. Neodiprion pinetum specializes on the thin-needled white pine (Pinus strobus) and is the only species in the genus to prefer this thin-needled host while N. lecontei avoids white pine and uses other Pinus species that have thicker needles <ref type="bibr">(Wilson et al. 1992;</ref><ref type="bibr">Linnen and Farrell 2010;</ref><ref type="bibr">Bendall et al. 2017)</ref>. In addition to this divergent host preference, N. lecontei and N. pinetum also diRer in adult body size, female ovipositor morphology and other egg-laying traits <ref type="bibr">(Bendall et al. 2017;</ref><ref type="bibr">Glover et al. 2023</ref>). These divergent host-use traits contribute to both prezygotic and postzygotic reproductive isolation <ref type="bibr">(Bendall et al. 2017</ref><ref type="bibr">(Bendall et al. , 2023;;</ref><ref type="bibr">Glover et al. 2023)</ref>. Notably, no-choice mating assays suggest that the strength of prezygotic isolation varies across geographic space <ref type="bibr">(Glover et al. 2023)</ref>. Despite strong reproductive isolation between N. lecontei and N. pinetum, these species occasionally hybridize in nature and viable, fertile hybrids for both sexes and cross directions can be produced in the lab <ref type="bibr">(Bendall et al. 2017</ref><ref type="bibr">(Bendall et al. , 2022</ref><ref type="bibr">(Bendall et al. , 2023))</ref>. Finally, historical hybridization has probably occurred between N. lecontei and N. pinetum: a recent demographic analysis suggests that they diverged in sympatry with continuous and asymmetric gene flow <ref type="bibr">(Bendall et al. 2022</ref>).</p><p>Here, we combine environmental and high-coverage whole-genome resequencing data to describe patterns of hybridization between N. lecontei and N. pinetum and explore relationships between human disturbance (direct and indirect) and hybridization while investigating the confounding eRect of population structure. If disturbance alters aspects of a species' environment that are important for maintaining reproductive isolating barriers, then individuals in more disturbed areas will have a higher proportion of their genome introgressed from the species with which they hybridize (e.g., <ref type="bibr">Ortego et al. 2017;</ref><ref type="bibr">Coppi et al. 2020)</ref>. For example, direct human disturbance (e.g., urbanization) creates patchy host plant distributions in areas where the hosts are native, potentially making it more diRicult for insects with limited dispersal capabilities to find a conspecific mate <ref type="bibr">(Lopez et al. 2018)</ref>.</p><p>Urbanization also often involves ornamental planting of non-native host plants, potentially bringing insect species specialized on these host plants together in an area where they would not naturally co-occur <ref type="bibr">(Lopez et al. 2018)</ref>. Notably, white pine (N. pinetum's host) is one of the most widely planted pines in the eastern United States, often ornamentally planted around homes and businesses <ref type="bibr">(Wendel and Smith 1990;</ref><ref type="bibr">Dickerson 2002)</ref>. Both habitat fragmentation and ornamental planting could increase hybridization rates.</p><p>Therefore, we predict that N. lecontei and N. pinetum individuals in areas that have more urban and agricultural land cover will have a higher proportion of their genome introgressed. Additionally, indirect human disturbance (i.e., climate change) includes temperature changes. As temperature is important for insect development, faster development time in one or both species can decrease temporal isolation, allowing more opportunities to hybridize <ref type="bibr">(Larson et al. 2019)</ref>. Therefore, we predict that N. lecontei and N. pinetum individuals in areas that have experienced a higher increase in temperature will have a higher proportion of their genome introgressed.</p><p>Finally, to investigate the confounding eRect of population structure, we first describe population structure in N. lecontei and N. pinetum. Previous work with ddRAD data in N. lecontei revealed distinct genetic clusters that were dubbed "North", "Central", and "South" based on corresponding geographic regions in eastern North America <ref type="bibr">(Bagley et al. 2017)</ref>. Demographic modeling suggests these cluster were produced by historical isolation in three diRerent pine refugia, with the North cluster particularly distinct (i.e., lowest heterozygosity; <ref type="bibr">Bagley et al. 2017)</ref>. Subsequent work in N. lecontei detected an additional East/West genetic break in the Central cluster <ref type="bibr">(Lindstedt et al. 2022)</ref>. Here, we report the first analysis of population structure in N. pinetum and the first analysis of population structure in N. lecontei with whole-genome resequencing (WGS) data. Importantly, to compare population structure between N. lecontei and N. pinetum, having similar data (i.e., WGS) is ideal. Using results from these analyses, we fit models that diRer in the approach used to account for population structure and assess how these approaches aRect inferred relationships between human disturbance and hybridization.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>MATERIALS AND METHODS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Sampling, DNA extraction and sequencing</head><p>We sampled Neodiprion lecontei and N. pinetum mid-to late-instar larval colonies from regions where the species co-occur across eastern North America (Figure <ref type="figure">1</ref>; Table <ref type="table">S1</ref>). Because we required an allopatric population of N. lecontei for an analysis of historical introgression (see below), we also sampled a Florida population of N. lecontei. Larvae were either immediately preserved in 100% ethanol (stored at -20&#176;C) or reared to adults in the lab using standard lab protocols <ref type="bibr">(Bendall et al. 2017;</ref><ref type="bibr">Harper et al. 2016</ref>) and subsequently preserved in 100% ethanol (stored at -20&#176;C). To avoid sampling closely related individuals, we selected one individual from each colony, as each colony typically represents a group of siblings. Neodiprion pine sawflies are haplodiploid: males develop from unfertilized eggs and are haploid across their entire genome while females develop from fertilized eggs and are diploid across their entire genome <ref type="bibr">(Bendall et al. 2022;</ref><ref type="bibr">Glover et al. 2024)</ref>. Thus, to ensure that our samples were all diploid, we selected one adult female from each colony when available. From colonies where no adult females were available, we selected the largest larva, as relatively larger larvae are usually diploid <ref type="bibr">(Bagley et al. 2017)</ref>. Due to sample availability and to ensure we sampled a broad range of intraspecific variation in the more variable species (N. lecontei), we sampled 314 N. lecontei individuals (78 adult females and 236 larvae) and 94 N. pinetum individuals (56 adult females and 38 larvae).</p><p>Notably, although sample size aRects power in regression analyses, N. lecontei and N. pinetum exhibited similar regression patterns (see Results below), suggesting that sample size diRerences between the species did not aRect our main conclusions.</p><p>We extracted DNA from either head and thorax tissue (adult females) or head and thoracic leg tissue (larvae) with a Qiagen DNeasy Blood &amp; Tissue Kit following the standard manufacturer protocol, including an optional RNase A step. We measured DNA concentration with a Quant-iT dsDNA High-Sensitivity fluorescence assay (Invitrogen).</p><p>Extracted DNA was sent to Admera Health (Plainfield, NJ) for library preparation and wholegenome resequencing. A separate library was prepared for each individual with a KAPA Hyper Prep Min PCR kit and whole-genome resequencing was performed with 150 bp paired-end sequencing technology in an Illumina NovaSeq X plus sequencer. To reach approximately 20x coverage per individual, all libraries were sequenced in two batches.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Quality filtering and variant calling</head><p>Demultiplexed reads were cleaned using trimmomatic v0.39 <ref type="bibr">(Bolger et al. 2014)</ref> to remove Illumina adapters, low quality bases (quality score &lt; 3) from the beginning and end of each read, any window (width = 4 bp) that had an average quality score below 15, and any individual read with a total length of less than 36 bp. For each batch separately, we mapped the cleaned reads to the chromosome-level N. lecontei reference genome (iyNeoLeco1.1, GCA_021901455.1; <ref type="bibr">Herrig et al. 2024</ref>) using the BWA-MEM algorithm in bwa v0.7.17 <ref type="bibr">(Li and Durbin 2009)</ref>. We also included previously generated whole-genome resequencing reads from an individual of an outgroup species (Neodiprion hetricki; <ref type="bibr">Herrig et al. 2024</ref>) so that we could quantify historical introgression (see below). We then used samtools v1.13 <ref type="bibr">(Danecek et al. 2021;</ref><ref type="bibr">Li et al. 2009</ref>) to mark and remove PCR duplicates and remove ambiguously mapped reads/secondary alignments. Finally, we used samtools to merge the filtered reads from the two sequencing batches (for N. lecontei and N. pinetum) and index the resulting bam files.</p><p>We called all sites (variant and invariant) using bcftools mpileup v1.13 <ref type="bibr">(Danecek et al. 2021)</ref>. We chose bcftools mpileup because it has been shown to outperform GATK for non-human data <ref type="bibr">(Lefouili and Nam 2022)</ref>. We then used a custom script from Mark Ravinet (<ref type="url">https://github.com/markravinet/genotyping_pipeline/blob/main/3_filter_variants.nf</ref>) to remove indels. We performed additional filtering with vcftools v1.16 <ref type="bibr">(Danecek et al. 2011)</ref> to create two high-quality datasets for downstream analyses. The first dataset ("SNPs only") was restricted to N. lecontei and N. pinetum only and included only biallelic SNPs present in at least 90% of individuals. We also required retained SNPs to have a minimum site quality score of 20, mean site depth between 10-40X, genotype depth between 10-60X, and a minor allele frequency (MAF) above 0.05. The second dataset ("all sites") included N. lecontei, N. pinetum, and N. hetricki. This dataset was filtered with the same missingness, site quality, and depth criteria as above but included all sites (i.e., invariant and variant) with no MAF filtering.</p><p>To check for irregularities indicative of sample contamination (as part of our standard data quality control pipeline) or putative haploid males (for larvae), we used a custom script from David Marques and Joana Meier (<ref type="url">https://github.com/speciationgenomics/scripts/blob/master/checkHetsIndvVCF.sh</ref>) to calculate and visualize allelic imbalance for all heterozygote calls (i.e., allele skew) within the "SNPs only" dataset. In a diploid individual with minimal contamination, heterozygous positions contain roughly equal numbers of reads for both alleles (Figure <ref type="figure">S1a</ref>). However, diploid individuals with high levels of contamination and putative haploid males will exhibit strong allelic imbalance (Figures <ref type="figure">S1b-S1v</ref>)-males should exhibit zero heterozygosity as they are haploid across their entire genome. We excluded any sample for which more than 49% of heterozygote calls displayed allele balance ratios less than 0.3. This cutoR was determined based on data visualization and a threshold that struck a balance between stringency and retaining adequate data. This resulted in exclusion of 12 larvae that were putatively haploid males and 9 individuals with high levels of contamination (Figures <ref type="figure">S1b-S1v</ref>; Table <ref type="table">S1</ref>). For both datasets ("SNPs only" and "all sites"), we filtered out any low levels of contamination with a custom script by Joana Meier (<ref type="url">https://github.com/joanam/scripts/blob/master/allelicBalance.py</ref>) by converting heterozygote genotypes with allele balance ratios less than 0.2 to homozygous for the more common allele. Notably, our results were not impacted by converting imbalanced heterozygote genotypes to homozygous for the more common allele rather than to missing (Figure <ref type="figure">S2</ref>). Also, our measure of hybridization (admixture proportions, see below) did not correlate with allele skew or any other genotype quality metrics (Figure <ref type="figure">S3</ref>; Table <ref type="table">S2</ref>). We then used vcftools v1.16 <ref type="bibr">(Danecek et al. 2011)</ref> to refilter the resulting vcf files with the same criteria above while simultaneously excluding the 21 putative haploid male/highly contaminated individuals. This resulted in a final dataset of 387 individuals (299 N. lecontei and 88 N. pinetum; Table <ref type="table">S1</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Population structure and genomic introgression</head><p>Prior to population structure analyses, we performed linkage pruning on the "SNPs only" dataset using vcftools v1.16 <ref type="bibr">(Danecek et al. 2011)</ref>. We required SNPs to be at least 1,000 bp apart because previous analyses indicate linkage disequilibrium decays to approximately zero at this distance <ref type="bibr">(Glover et al. 2024)</ref>. We then used PLINK v1.9 <ref type="bibr">(Chang et al. 2015)</ref> to perform three principal component analyses (PCAs): both species combined, within N. lecontei only, and within N. pinetum. These scripts and scripts for all subsequent analyses are deposited on DRYAD <ref type="bibr">(Glover and Linnen 2025)</ref>. Next, we used ADMIXTURE v1.3 <ref type="bibr">(Alexander et al. 2009</ref>) with both species combined to further investigate population structure using a model-based approach and to estimate individual admixture proportions.</p><p>We set the number of assumed populations (K) between 1 and 10 and ran 100 iterations for each value of K. The best supported number of populations (K) was based on the lowest averaged five-fold cross-validation (CV) error across all 100 iterations per K. Because we were interested in species-level admixture for downstream analyses, we estimated admixture proportions using assignment probabilities for K=2 even though this was not the best supported K (see below). Final assignment probabilities (i.e., admixture proportion) for each individual at K=2 were estimated across all 100 iterations using the LargeKGreedy algorithm for 2,000 repetitions as implemented in CLUMPAK <ref type="bibr">(Kopelman et al. 2015)</ref>.</p><p>We consider estimated admixture proportions to be a measure of recent introgression because the ADMIXTURE model assumes modern individuals were produced by recent admixture between K ancestral populations <ref type="bibr">(Lawson et al. 2018</ref>). However, a critical consideration when interpreting admixture proportions is that some demographic scenarios-such as isolation by distance (IBD)-can give rise to spurious evidence of discrete population structure with "admixed" individuals in the center of the range <ref type="bibr">(Meirmans 2012;</ref><ref type="bibr">Wiens and Colella 2025)</ref>. For this reason, it is important to rule out IBD before interpreting admixture proportions as evidence of hybridization <ref type="bibr">(Wiens and Colella 2025)</ref>. Fortunately, several observations from prior studies reject a model of IBD for N. lecontei and N. pinetum: (1) these two species have largely overlapping ranges and can often be found in the same locations (vs. living at opposite ends of a continuous geographic range as expected under IBD) (Linnen and Farrell 2010); (2) they form discrete and highly diRerentiated clusters even in tight sympatry (e.g., FST = 0.61 in Lexington, KY; Glover et al.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>2024); and</head><p>(3) demographic modeling suggests these species diverged with continuous gene exchange, with an estimated divergence time of ~1.5 million generations ago <ref type="bibr">(Bendall et al. 2022)</ref>. Additionally, preliminary analyses with hybrid indices <ref type="bibr">(Wiens et al. 2025</ref>), an alternate metric for hybridization, indicated they are highly sensitive to the choice of reference populations for N. lecontei and N. pinetum and to genotyping error. In contrast, admixture proportions did not require specification of reference populations and were not correlated with any genotyping error metrics, including allele skew, read count, coverage depth, and missing data (Figure <ref type="figure">S3</ref>; Table <ref type="table">S2</ref>). For these reasons, we used assignment probabilities (i.e., admixture proportion) as our estimate of species-level admixture for downstream analyses.</p><p>To further investigate regional patterns of introgression, we used Dsuite v0.5 <ref type="bibr">(Malinsky et al. 2021)</ref> to calculate Patterson's D (ABBA-BABA) statistic <ref type="bibr">(Green et al. 2010;</ref><ref type="bibr">Durand et al. 2011</ref>). This analysis considers ancestral ("A") and derived ("B") alleles at each site for three populations and an outgroup defined as (((P1, P2), P3), O) and tests for an excess of "ABBA" or "BABA" patterns <ref type="bibr">(Durand et al. 2011;</ref><ref type="bibr">Martin et al. 2014)</ref>. A D-statistic that is positive and significantly diRerent from zero (i.e., excess of "ABBA") indicates introgression between P2 and P3. Conversely, a D-statistic that is negative and significantly diRerent from zero (i.e., excess of "BABA") indicates introgression between P1 and P3.</p><p>Notably, the D-statistic is well documented to detect ancient introgression (McFarlane and Pemberton 2019; <ref type="bibr">Dagilis et al. 2022</ref>). However, because the D-statistic also captures recent introgression <ref type="bibr">(Dagilis et al. 2022)</ref>, additional analyses are required to date the timing of introgression (e.g., <ref type="bibr">Hibbins and Hahn 2019)</ref>. This analysis was performed with the "all sites" dataset, which included allopatric N. lecontei (P1) and N. hetricki (the outgroup).</p><p>Because our PCA and ADMIXTURE analyses revealed a distinct genetic break within N. lecontei that corresponded to "North" and "Central" geographic regions (see below; Bagley </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Quantifying human disturbance</head><p>To quantify direct human disturbance, we first downloaded 2021 land cover GIS layers for the eastern United States produced by the National Land Cover Database <ref type="bibr">(Dewitz 2021</ref>; <ref type="url">https://www.mrlc.gov/viewer/</ref>) and a 2020 land cover GIS layer for Canada produced by the Canada Centre for Remote Sensing (Commission for Environmental Cooperation (CEC) 2023; <ref type="url">http://www.cec.org/north-american-environmental-atlas/land-  cover-30m-2020/</ref>). These layers provide land cover data at a 30m x 30m spatial resolution.</p><p>Next, for each sampling site, we used the sf v1.0-16 package <ref type="bibr">(Pebesma 2018;</ref><ref type="bibr">Pebesma and Bivand 2023</ref>) in R v4.3.3 (R Core Team 2024) to create a 5km-radius buRer around the GPS coordinates. We chose this distance because it encompasses the estimated maximum dispersal distance of individual Neodiprion males <ref type="bibr">(&#214;strand et al. 2001)</ref>. Then, we used the terra v1.7-71 package <ref type="bibr">(Hijmans 2024)</ref> to calculate the proportion of each land cover class within each 5km-radius buRer. Finally, to reduce all land cover classes-which are correlated to varying degrees-to a composite measure of disturbance at each sampling site, we used the princomp() function in the stats v4.3.3 package (R Core Team 2024) to perform a PCA of the land cover proportion data for each sampling site. Prior to performing the PCA, for the United States sampling sites, some land cover classes were grouped together (proportions added together) to be comparable with the Canadian land cover classes: all "Developed" land cover classes (i.e., "Developed Open", "Developed Low Intensity", "Developed Medium Intensity", and "Developed High Intensity") were grouped as "Urban", "Pasture and Hay" and "Cultured Crop" were grouped as "Cropland", and "Woody Wetland" and "Emergent Herbaceous Wetland" were grouped as "Wetland". PC1 separated disturbed sites (i.e., higher urban and agricultural land cover; negative PC values) from "natural" sites (i.e., more forest, shrubland and grassland land cover; positive PC values) while PC2 separated low vegetation sites (i.e., more barren, water and wetland land cover; positive PC values) from high vegetation (of various types) sites (i.e., negative PC values; Figure <ref type="figure">2</ref>).</p><p>To quantify indirect human disturbance (i.e., climate change), we first downloaded monthly average daily maximum temperature (&#176;C) GIS layers for North America produced by NASA <ref type="bibr">(Thornton et al. 2022</ref>; <ref type="url">https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=2131</ref>) for the years 1980-1989 and 2010-2019. These layers provide temperature data at a 1km x 1km spatial resolution. Next, for each sampling site, we used the terra v1.7-71 package </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Statistical analyses</head><p>To explore relationships between admixture proportion and human disturbance (direct and indirect) and investigate how the approach used to account for population structure within species aRects these relationships, we performed multiple linear regression with the glmmTMB v1.1.9 package <ref type="bibr">(Brooks et al. 2017</ref>) in R v4.3.3 (R Core Team 2024). Due to asymmetric gene flow between N. lecontei and N. pinetum <ref type="bibr">(Bendall et al. 2022)</ref>, we modeled each species separately. For N. pinetum, we excluded one individual that was originally classified as N. pinetum at the time of collection (because it was collected on white pine) but was later identified by PCA and ADMIXTURE analyses as an F1 hybrid (see below). For all models, we used the beta family with a logit link and a zeroinflation formula to model the distribution of a response variable that was proportion data with zeros present in the dataset. Prior to running each model, all predictor variables were normal-quantile transformed to ensure that they were on the same scale.</p><p>Because we were interested in testing for relationships between admixture proportion and human disturbance while simultaneously exploring the potentially confounding eRect of population structure, we fit three diRerent models for each species separately. For all models, admixture proportion was the response variable and land cover PC1, land cover PC2 and change in average daily maximum temperature (Tmax) were included as predictor variables. However, each model diRered in the approach used to account for population structure. In the first model, we included quantitative measures of population structure-specifically PC1 and PC2 scores from within-species genetic PCAs As a second approach to account for population structure, we included "grouped site number" as a random eRect, where all sampling sites that were within 5km of each other were considered a "grouped site" and were assigned an arbitrary grouped site number. With this approach, population structure was not directly quantified. Instead, grouped sites were assumed to capture the genetic variation across geographic space (i.e., correspond to population structure) as individuals sampled in close geographic proximity tend to be more genetically similar <ref type="bibr">(Bradburd et al. 2016</ref>). Thus, including grouped site number as a random eRect allowed us to test for relationships between admixture proportion and measures of human disturbance while accounting for variability across sites (i.e., population structure). Specifically, for each species, the model took the following form: (M2) Admixture proportion ~ land cover PC1 + land cover PC2 + change in Tmax + (1|grouped site number) Finally, we tested for relationships between admixture proportion and measures of human disturbance without including a control for population structure (i.e., no covariates or random eRects were included). For each species, this third model took the form: (M3) Admixture proportion ~ land cover PC1 + land cover PC2 + change in Tmax</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RESULTS</head><p>After mapping trimmed reads to the N. lecontei reference genome and filtering, we retained 46,957,406 variant and invariant sites ("all sites" dataset) and 207,049 linkagepruned SNPs (from the "SNPs only" dataset). Average depth coverage across all retained individuals (299 N. lecontei, 87 N. pinetum, 1 F1 hybrid, and 1 N. hetricki) was 20.8x (range: 12.5x -29.2x).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Patterns of population structure diQer between Neodiprion lecontei and N. pinetum</head><p>When both species were included, the PCA clearly separated N. lecontei and N. pinetum and revealed diRerent patterns of population structure within N. lecontei and N. pinetum (Figure <ref type="figure">3a</ref>). PC1 separated N. lecontei and N. pinetum and identified one F1 hybrid (PC1 = 61.4% variance explained). While N. pinetum individuals were more tightly clustered together, PC2 separated N. lecontei into distinct clusters (PC2 = 17.4% variance explained). The PCA of only N. lecontei individuals recapitulated the distinct genetic breaks within N. lecontei that generally corresponded to geography (Figure <ref type="figure">3b</ref>). PC1 separated N. lecontei from northern US states and Canada ("North"; positive side of PC1 axis) and N. lecontei from central US states ("Central"; negative side of PC1 axis) (PC1 = 59.4% variance explained). PC2 further separated Central N. lecontei into two groups: individuals collected in northeastern US states and states east of the Appalachian Mountains (positive side of PC2 axis) and individuals collected in states west of the Appalachian Mountains (negative side of PC2 axis), although this split was much less pronounced than the North/Central split (PC2 = 9.76% variance explained). In contrast to N. lecontei, the PCA of only N. pinetum individuals revealed less discrete genetic clustering (Figure <ref type="figure">3c</ref>). Rather, individuals were distributed more continuously across both PC1 (56.4% variance explained) and PC2 (12.2% variance explained).</p><p>The ADMIXTURE analysis revealed similar patterns of population structure as the PCAs. The best supported number of populations was K=3, although K=4 was also a reasonable solution (Figure <ref type="figure">S4</ref>). Regardless of K, all N. pinetum individuals formed one genetic group (Figures <ref type="figure">3d-e</ref>, <ref type="figure">S5</ref>). At K=3, the remaining two genetic groups represented North and Central N. lecontei (Figure <ref type="figure">3e</ref>). At K=4, the Central N. lecontei group was further split into eastern and western groups with extensive admixture between these groups (Figure <ref type="figure">S5</ref>). This analysis also detected the single F1 hybrid that was originally classified as N. pinetum at the time of collection (Figures <ref type="figure">3d-e</ref>, <ref type="figure">S5</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Patterns of admixture diQer between geographic regions</head><p>Admixture proportions were low overall in both species (Figure <ref type="figure">3d</ref>). Admixture proportions ranged from 0.0 to 0.019 in N. lecontei and from 0.0 to 0.05 in N. pinetum. For both species, admixture proportions varied among geographic regions (Figure <ref type="figure">4</ref>). Admixture proportions were lower in the North (N. lecontei: mean 0.0 &#177; 0.0 SE; N. pinetum: mean 0.0 &#177; 0.0 SE) compared to the Central region (N. lecontei: mean 2.92 x 10 -3 &#177; 3.45 x 10 -4 SE; N. pinetum: mean 6.37 x 10 -3 &#177; 1.50 x 10 -3 SE). In contrast, estimates of Patterson's D statistic revealed evidence of introgression between N. lecontei and N. pinetum in both the North and Central regions (Figure <ref type="figure">4</ref>). Patterson's D statistic (D) was significantly diRerent from zero and positive for both regions, being lower in the Central (D = 0.019, p = 2.35 x 10 -11 ) compared to the North (D = 0.046, p = 7.61 x 10 -4 ).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Hybridization consistently correlates with indirect human disturbance</head><p>We used multiple linear regression to explore whether admixture proportion correlated with direct and indirect human disturbance and whether these correlations were aRected by the approach used to account for population structure. When genetic PC scores were included as model covariates (M1), admixture proportion significantly correlated with at least one genetic PC axis in each species (Table <ref type="table">1</ref>). Additionally, for both species, change in average daily maximum temperature (indirect human disturbance) was the only measure of human disturbance to significantly correlate with admixture proportion; the relationship was positive in both species (Figures <ref type="figure">5a</ref>, <ref type="figure">b</ref>; Table <ref type="table">1</ref>). When we included grouped site as a random eRect to account for population structure (M2) or did not include a control for population structure (M3), only change in average daily maximum temperature significantly and positively correlated with admixture proportion in N. lecontei (Figures <ref type="figure">5c-f</ref>; Table <ref type="table">1</ref>). In N. pinetum, both change in average daily maximum temperature and land cover PC1 (direct human disturbance) significantly correlated with admixture proportion, with all relationships being positive (Figures <ref type="figure">5c-f</ref>; Table <ref type="table">1</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>DISCUSSION</head><p>In this study, we used environmental data and high-coverage whole-genome resequencing data to describe patterns of hybridization between N. lecontei and N. pinetum and explore relationships between human disturbance and admixture while investigating how the approach used to account for population structure aRected these relationships. We find that N. lecontei and N. pinetum exhibit strikingly diRerent patterns of population structure (Figure <ref type="figure">3</ref>) and that patterns of admixture vary among geographic regions and, potentially, between recent and historical time points (Figure <ref type="figure">4</ref>). Additionally, we find that admixture proportion is consistently predicted by indirect human disturbance (i.e., change in average daily maximum temperature) and not direct human disturbance in both N. lecontei and N. pinetum (Figure <ref type="figure">5</ref>; Table <ref type="table">1</ref>). Lastly, in N. pinetum, we find evidence of a spurious association between admixture proportion and direct human disturbance (i.e., land cover PC1). Here, we first discuss patterns of population structure in N. lecontei and N. pinetum. Then, we discuss two non-mutually exclusive mechanisms that could explain how increasing temperatures have aRected hybridization dynamics between N. lecontei and N. pinetum. Finally, we highlight how knowledge of population structure provides important context for interpreting patterns of admixture, oRering a more complete picture of hybridization dynamics between taxa, and discuss the implications of model structure when testing for relationships between admixture and human disturbance.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Patterns of population structure diQer between N. lecontei and N. pinetum</head><p>Neodiprion lecontei and N. pinetum exhibit strikingly diRerent patterns of population structure (Figure <ref type="figure">3</ref>). Our whole-genome resequencing analysis of population structure within N. lecontei recapitulates patterns previously described using ddRAD data <ref type="bibr">(Bagley et al. 2017;</ref><ref type="bibr">Lindstedt et al. 2022)</ref>. Demographic modeling suggests that the three distinct genetic clusters within N. lecontei (North, Central, and South) likely formed when populations were isolated in three glacial Pinus refugia within the US: one southwestern refugium, one mid-Atlantic refugium, and one northeastern refugium <ref type="bibr">(Bagley et al. 2017)</ref>.</p><p>Additionally, each of the three primary N. lecontei lineages is currently associated with a diRerent assemblage of Pinus species that diRer in needle morphology <ref type="bibr">(Bagley et al. 2017;</ref><ref type="bibr">Glover et al. 2023)</ref>. Notably, adaptation to diRerent Pinus species has been shown to produce rapid genetic and phenotypic diRerentiation in N. lecontei, even at a single site with no geographic isolation <ref type="bibr">(Bagley et al. 2023)</ref>. Thus, genetic divergence within N. lecontei was probably the result of genetic drift due to historical isolation (e.g., <ref type="bibr">Qvarnstr&#246;m et al. 2016;</ref><ref type="bibr">Lozier et al. 2023</ref>) and adaptation to diRerent local hosts and climates (e.g., <ref type="bibr">Nelson et al. 2022;</ref><ref type="bibr">MacDonald et al. 2022)</ref>.</p><p>In contrast to N. lecontei, N. pinetum does not show any evidence of discrete genetic breaks that might be attributable to historical isolation (Figures <ref type="figure">3c-e</ref>, <ref type="figure">S5</ref>). This finding is consistent with analyses of genetic data for Pinus strobus (the only host of N. pinetum) that suggest that this pine species occurred in a single mid-Atlantic pine refugium (overlapping with N. lecontei) during the last glacial maximum <ref type="bibr">(Zinck and Rajora 2016)</ref>.</p><p>Overall, we hypothesize that observed diRerences in patterns of population structure between N. lecontei and N. pinetum are likely attributable to (1) diRerences in available Pinus refugia during the Pleistocene, and (2) diRerent opportunities for divergent selection and host-associated diRerentiation for generalist (N. lecontei) vs. specialist (N. pinetum) Pinus feeders.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Potential mechanisms for correlations between increased temperature and hybridization</head><p>DiRerences in population structure aside, we found that admixture proportions for both species were positively correlated with temperature change, and this finding was robust to the approach used to account for population structure (Table <ref type="table">1</ref>). One potential mechanism for increased hybridization due to increasing temperatures is via changes in adult sawfly body size. In general, as ecotherms, temperature has a strong eRect on insect growth rate and on the rate of biochemical reactions <ref type="bibr">(Davidowitz and Nijhout 2004;</ref><ref type="bibr">Chown and Gaston 2010)</ref>. Higher temperatures often reduce total development time such that, even within a single species, individuals in warmer areas are often smaller than individuals in colder areas (reviewed in <ref type="bibr">Chown and Gaston 2010)</ref>. Within Neodiprion, females tend to be larger than males. Additionally, previous work demonstrated that N. lecontei females and males tend to be larger than N. pinetum females and males and that both species exhibit size-assortative mating: pairs that are more similar in size are more likely to mate <ref type="bibr">(Glover et al. 2023</ref>). Thus, it is possible that in areas that have experienced a larger increase in temperature, N. lecontei and/or N. pinetum adult body sizes have decreased, increasing their willingness to hybridize. Due to widespread geographic variation in body size in nature <ref type="bibr">(Ashton 2002;</ref><ref type="bibr">Chown and Gaston 2010;</ref><ref type="bibr">Stillwell 2010;</ref><ref type="bibr">Terribile et al. 2009;</ref><ref type="bibr">Auteri 2022)</ref> and widespread size-assortative mating across taxa (e.g., <ref type="bibr">Bearhop et al. 2005;</ref><ref type="bibr">Jones et al. 2003;</ref><ref type="bibr">Greenway et al. 2016;</ref><ref type="bibr">Rougemont et al. 2015)</ref>, experimental studies investigating the role of increasing temperatures in body size changes and subsequent hybridization dynamics across taxa would be a fruitful avenue for future research. These studies could also shed light on the role of plasticity compared to genetic adaptation to climate change, an understudied topic in global change research <ref type="bibr">(Chown and Gaston 2010;</ref><ref type="bibr">De Meester et al. 2018)</ref>.</p><p>A second potential mechanism for increased hybridization due to increasing temperatures is via changes in phenology. Multiple ecological (e.g., synchronization with host) and physiological (e.g., development time) processes can lead to phenological divergence between naturally co-occurring species, causing temporal isolation between the species <ref type="bibr">(Nelson et al. 2022)</ref>. However, increasing temperatures can change the context of species interactions by potentially eroding temporal isolation: within insects, higher temperatures can accelerate growth rate <ref type="bibr">(Chown and Gaston 2010)</ref> and/or alter seasonal timing of adult emergence (e.g., <ref type="bibr">Buckley et al. 2015)</ref>, thereby increasing overlap of adults from each species and opportunities for hybridization <ref type="bibr">(Larson et al. 2019</ref>). Within N. lecontei, larval hatching and adult emergence from cocoons occur earlier when temperatures are higher <ref type="bibr">(Benjamin 1955</ref>). Thus, it is possible that overlap in N. lecontei and N. pinetum adult emergence has increased in areas that have experienced a higher increase in temperature, resulting in more hybridization and introgression. Indeed, the breakdown of temporal isolation due to climate change has been observed in a variety of taxa, including plants (e.g., <ref type="bibr">Rossetto et al. 2011;</ref><ref type="bibr">Theobald et al. 2017)</ref>, insects <ref type="bibr">(Taylor and</ref> Friesen 2017), and birds (e.g., <ref type="bibr">Bom et al. 2023)</ref>.</p><p>Importantly, increased hybridization due to temperature-related changes in adult body size and changes in phenology are not mutually exclusive processes. For example, increasing temperatures, and subsequently growth rates, can cause insects to increase the number of generations they have per year <ref type="bibr">(Tobin et al. 2008;</ref><ref type="bibr">Altermatt 2010;</ref><ref type="bibr">Larson et al. 2019)</ref>. This can impact both adult body size (e.g., result in smaller body size; <ref type="bibr">Larson et al. 2019</ref>) and phenology (e.g., reduce temporal isolation between species; <ref type="bibr">Dopman et al. 2010</ref>). Thus, it is possible that climate change has resulted in the breakdown of multiple prezygotic barriers to gene flow between N. lecontei and N. pinetum. However, ultimately, our inferences are based on correlative approaches and the specific mechanisms responsible for generating patterns of hybridization between N. lecontei and N. pinetum are unknown. Therefore, future manipulative experiments that investigate the temperature dependence of reproductive traits in Neodiprion and other taxa are necessary to enhance our understanding of the long-term maintenance of species boundaries in the face of rapid global change.</p><p>Our finding that indirect human disturbance, but not direct human disturbance, consistently predicted admixture proportion in N. lecontei and N. pinetum contrasts with other studies testing for relationships between human disturbance (direct and indirect) and hybridization. For example, <ref type="bibr">Ortego et al. (2017)</ref> found that wildfire frequency but not climatic variables (indirect disturbance) nor urbanization/land clearing for agriculture (direct human disturbance) predicted admixture in Californian Quercus oaks. Conversely, <ref type="bibr">Muhlfeld et al. (2017)</ref> found that both mean summer stream temperature (indirect human disturbance) and road density (direct human disturbance) predicted admixture in Oncorhynchus trout. Overall, this highlights the importance of evaluating human-mediated hybridization in diverse taxa.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>The importance of controlling for population structure when evaluating correlates of hybridization</head><p>Although change in average daily maximum temperature from the 1980's to the 2010's consistently predicted admixture proportion in N. lecontei and N. pinetum (Figure <ref type="figure">5</ref>; Table <ref type="table">1</ref>), increasing temperature is not the only factor aRecting hybridization dynamics between these two species. For example, although some sampling sites in the North region (e.g., Canada) have experienced relatively high temperature increases (Figure <ref type="figure">1</ref>), we found no evidence of recent admixture in either species (i.e., admixture proportions of zero for all individuals of both species; Figure <ref type="figure">4</ref>).</p><p>Even before climate change, population structure within species and variation in historical contact between species could give rise to geographic variation in reproductive isolating barriers (e.g., <ref type="bibr">Borge et al. 2005;</ref><ref type="bibr">Hoskin et al. 2005;</ref><ref type="bibr">Martin and Willis 2010;</ref><ref type="bibr">Matute et al. 2014;</ref><ref type="bibr">Poikela et al. 2019)</ref>, and any recent changes in hybridization due to direct and indirect human eRects are overlaid on these historical processes. For example, both species exhibit geographic clines in body size, but in opposite directions: whereas N. lecontei males and females tend to increase in size with latitude, N. pinetum adult body size is negatively correlated with latitude <ref type="bibr">(Glover et al. 2023</ref>). In N. lecontei, latitudinal clines in body size mirror latitudinal clines in host needle width, whereas in N. pinetum, body size clines are more likely driven by abiotic factors <ref type="bibr">(Glover et al. 2023)</ref>. Regardless of their selective causes, these opposing body size clines result in northern populations having greater body size disparities between N. lecontei and N. pinetum compared to populations at lower latitudes (hereafter "central" populations), possibly accounting for pronounced diRerences in admixture proportions between the two regions (Figure <ref type="figure">4</ref>). In further support of this hypothesis, crosses between northern N. pinetum and central N. lecontei populations (bigger body size diRerences) yielded stronger prezygotic isolation than crosses between central populations of each species (smaller body size diRerences) <ref type="bibr">(Glover et al. 2023)</ref>.</p><p>Geographic variation in reproductive isolation could also stem from spatial and temporal variation in range overlap between the two species. For example, temporary isolation of N. lecontei populations from N. pinetum in some glacial refugia (i.e., in the southwestern and northeastern refugia), but not others (e.g., the mid-Atlantic refugium), may have provided spatially variable opportunities to fix alleles contributing to postzygotic isolation <ref type="bibr">(Cutter 2012)</ref>. Likewise, opportunities for hybridization-whether in primary or secondary contact-could have strengthened prezygotic isolation via reinforcement <ref type="bibr">(Servedio and Noor 2003)</ref>. Previous work lends support to this hypothesis: using no-choice mating assays, <ref type="bibr">Glover et al. (2023)</ref> found that prezygotic isolation was stronger in two crosses that included N. lecontei and N. pinetum from areas where they co-occur compared to a cross involving allopatric populations of N. lecontei and N. pinetum, a pattern consistent with reinforcement <ref type="bibr">(Servedio and Noor 2003;</ref><ref type="bibr">Butlin and Smadja 2018)</ref>.</p><p>However, more experimental work is needed to fully characterize geographic variation in pre-and postzygotic isolation, and more modelling work is needed to quantify spatial and temporal variation in gene flow between N. lecontei and N. pinetum.</p><p>Whatever the cause, there is evidence of geographic variation in the strength of reproductive isolation <ref type="bibr">(Glover et al. 2023</ref>) and the amount of admixture (Figure <ref type="figure">4</ref>) between N. lecontei and N. pinetum that correlates to some degree with observed patterns of population structure (e.g., highest reproductive isolation and lowest admixture proportions correspond to the North N. lecontei cluster). Our finding that population structure likely correlates with admixture is consistent with several other studies (e.g., <ref type="bibr">Hamlin and Arnold 2014;</ref><ref type="bibr">Mandeville et al. 2015;</ref><ref type="bibr">Lewanski et al. 2022)</ref>, suggesting this may be a widespread phenomenon. Because population structure should also correlate with environmental variables, including our metrics of human disturbance, failure to account for this structure can give rise to spurious associations between human disturbance and admixture that have nothing to do with recent human activity.</p><p>To investigate the confounding eRect of population structure, we compared the results of three models that diRered in the approach used to control for population structure. Although our models consistently revealed a significant and positive relationship between indirect human disturbance (change in average daily maximum temperature) and admixture in both species (Figures <ref type="figure">5a</ref>, <ref type="figure">c</ref>, <ref type="figure">e</ref>; Table <ref type="table">1</ref>), our data hint at a spurious association between direct human disturbance (land cover PC1) and admixture in N. pinetum. When we included genetic PC scores as covariates in the model (M1), we did not recover a significant correlation between land cover PC1 and admixture proportion (Figure <ref type="figure">5b</ref>; Table <ref type="table">1</ref>). However, when we included grouped site number as a random eRect (M2) or did not include a control for population structure in the model (M3), we recovered a significant and positive correlation (Figures <ref type="figure">5d</ref>, <ref type="figure">f</ref>; Table <ref type="table">1</ref>). Further inspection of the data revealed that the significant relationships between land cover PC1 and admixture proportion in these models are probably driven by N. pinetum sampled from Virginia. These Virginia samples represent the most extreme PC2 values in the within-species PCA (Figure <ref type="figure">S6a</ref>).</p><p>Additionally, these individuals have the highest admixture proportions, and all Virginia sampling sites load on the highly positive land cover PC1 axis (Figures <ref type="figure">S6b-d</ref>). Therefore, it is likely that the significant relationships between land cover PC1 and admixture proportion for models M2 and M3 in N. pinetum are confounded with population structure. These results highlight that methods for accounting for population structure can aRect conclusions about relationships between hybridization and human disturbance.</p><p>Interestingly, it appears that including grouped site number as a random eRect (M2) is insuRicient to control for the confounding eRect of population structure in N. pinetum compared to including genetic PCs as model covariates (M1). Although grouped sites may capture possible relatedness among samples, they likely do not capture broader patterns of population structure such as discrete structure (e.g., via historical isolation) or isolation by distance. While the eRectiveness of diRerent modeling approaches to correct for population structure has been investigated in GWAS (e.g., <ref type="bibr">Price et al. 2010;</ref><ref type="bibr">Liu et al. 2011)</ref> and GEA studies (e.g., <ref type="bibr">Forester et al. 2018)</ref>, this topic remains unexplored in studies testing for relationships between hybridization and human disturbance (but see <ref type="bibr">Ortego et al. 2017)</ref>. Therefore, we suggest that studies investigating how correcting for population structure aRects relationships between hybridization and human disturbance under diRerent scenarios of population structure (e.g., with simulations) are a high priority.</p><p>Overall, the genetics of reproductive isolation, in addition to environmental context, can cause variation in hybridization across geographic space <ref type="bibr">(Sianta et al. 2024</ref>). Thus, accounting for variation within species is crucial to our understanding of human impacts on biodiversity.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>CONCLUSIONS</head><p>Our study documents geographic variation in recent admixture between two pine sawfly species that co-occur throughout much of eastern North America. Our study also adds to a growing body of literature implicating climate change in altering hybridization dynamics between species (e.g., <ref type="bibr">Garroway et al. 2010;</ref><ref type="bibr">Muhlfeld et al. 2014</ref><ref type="bibr">Muhlfeld et al. , 2017;;</ref><ref type="bibr">Sianta et al. 2024)</ref>. Moving forward, high priorities for future work in this tractable insect system are to (1) uncover the causal mechanisms that produce observed correlations between temperature change and admixture, and (2) evaluate the evolutionary consequences (e.g., adaptive introgression) of hybridization. Our study also highlights the importance of considering population structure within species for a more complete picture of the factors aRecting hybridization dynamics between species. Overall, accurately characterizing the patterns, causes, and consequences of human disturbance-mediated hybridization is critical for our ability to better predict how biodiversity will be aRected as our planet rapidly changes.</p><p>Table <ref type="table">1</ref>. Results from multiple linear regression testing for relationships between admixture proportion and human disturbance in Neodiprion lecontei and N. pinetum. All models used the beta family with a logit link and a zero-inflation formula. All predictor variables were normalquantile transformed prior to running each model to ensure that they were on the same scale. Each model di?ered in approach used to account for population structure within each species: genetic PC scores included as model covariates (M1), grouped site number included as a random e?ect (M2), or no control for population structure (M3). Abbreviations: lcPC1, land cover PC1; lcPC2, land cover PC2; CTmax, change in average daily maximum temperature; genPC1, genetic PC1; genPC2, genetic PC2. Significant p-values (p &lt; 0.05) are indicated in bold.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Model</head><p>Species Predictor Estimate SE z value p-value M1: Genetic PC scores included as covariates N. lecontei (Intercept) -4.60 0.22 -20.97 &lt; 2 x 10 -16 lcPC1 -0.028 0.082 -0.34 0.73 lcPC2 -0.097 0.052 -1.86 0.062 CTmax 0.19 0.063 3.06 2.21 x 10 -3 genPC1 1.24 0.34 3.59 3.26 x 10 -4 genPC2 -0.14 0.11 -1.35 0.18 N. pinetum (Intercept) -4.34 0.15 -28.75 &lt; 2 x 10 -16 lcPC1 0.011 0.10 0.11 0.91 lcPC2 -0.13 0.10 -1.24 0.22 CTmax 0.38 0.11 3.36 7.67 x 10 -4 genPC1 -0.25 0.18 -1.38 0.17 genPC2 0.52 0.089 5.81 6.26 x 10 -9 M2: Site number included as random effect N. lecontei (Intercept) -5.22 0.093 -56.31 &lt; 2 x 10 -16 lcPC1 0.079 0.078 1.02 0.31 lcPC2 -0.044 0.064 -0.69 0.49 CTmax 0.27 0.072 3.70 2.16 x 10 -4 N. pinetum (Intercept) -4.35 0.15 -28.39 &lt; 2 x 10 -16 lcPC1 0.27 0.12 2.24 0.025 lcPC2 -0.20 0.15 -1.34 0.18 CTmax 0.36 0.16 2.25 0.024 M3: No accounting for population structure N. lecontei (Intercept) -5.21 0.089 -58.73 &lt; 2 x 10 -16 lcPC1 0.060 0.071 0.84 0.40 lcPC2 -0.076 0.053 -1.43 0.15 CTmax 0.29 0.065 4.46 8.17 x 10 -6 N. pinetum (Intercept) -4.21 0.14 -30.77 &lt; 2 x 10 -16 lcPC1 0.30 0.095 3.11 1.86 x 10 -3 lcPC2 -0.21 0.12 -1.78 0.076 CTmax 0.37 0.13 2.80 5.12 x 10 -3 Figure 1. Sampling map of Neodiprion lecontei and N. pinetum. Map of sampling locations in eastern North America for N. lecontei and N. pinetum individuals used in this study. Each point represents a unique sampling site and is colored by the change in average daily maximum temperature from the 1980's to the 2010's (Change in Tmax) at that site (see Materials and Methods). Photographs of N. lecontei and N. pinetum larvae by Robin Bagley and Ryan Ridenbaugh. States, several land cover classes were grouped together (proportions added together) to be comparable with the Canadian land cover classes prior to performing the PCA: all four "Developed" land cover classes (see legend) were grouped as "Urban"; "Pasture Hay" and "Cultured Crops" were grouped as "Cropland"; and "Woody Wetlands" and "Emergent Herbaceous Wetlands" were grouped as "Wetland". PC1 separates disturbed sites from "natural" sites while PC2 separates low vegetation sites from high vegetation sites. Land cover GIS layers for four sampling sites are provided to highlight extremes on each PC axis: highly disturbed (Columbus, OH) vs. "natural" (Amelia, VA) sites on PC1 and low vegetation (Millersburg, MI) vs. high vegetation (Forge, VA) sites on PC2. Within each land cover GIS layer, the black circle represents the 5km-radius buRer around the sampling site GPS coordinates, within which land cover proportions were calculated.   When genetic PC scores were included as model covariates (a, b), only change in average daily maximum temperature (Tmax) significantly predicted admixture proportion in both species (see Table <ref type="table">1</ref>). When site number was included as a random eRect (c, d) and when no control for population structure was included (e, f), change in Tmax significantly predicted admixture proportion in both species and land cover PC1 significantly predicted admixture proportion in N. pinetum (see Table <ref type="table">1</ref>). Points represent individual sawflies.</p><p>Trendlines with shaded 95% confidence intervals show predicted admixture proportions from multiple linear regression. Note: Although model results for both species are plotted together to aid with visualization, each of the three models was fit within each species separately.</p></div></body>
		</text>
</TEI>
