
Creators/Authors contains: "Carstens, Bryan C."


  1. Abstract

    Previous work has demonstrated that there is extensive variation in the songs of White-crowned Sparrow (Zonotrichia leucophrys) throughout the species range, including between neighboring (and genetically distinct) subspecies Z. l. nuttalli and Z. l. pugetensis. Using a machine learning approach to bioacoustic analysis, we demonstrate that variation in song is correlated with year of recording (representing cultural drift), geographic distance, and climatic differences, but the response is subspecies- and season-specific. Automated machine learning methods of bird song annotation can process large datasets more efficiently, allowing us to examine 1,913 recordings across ~60 years. We utilize a recently published artificial neural network to automatically annotate White-crowned Sparrow vocalizations. By analyzing differences in syllable usage and composition, we recapitulate the known pattern where Z. l. nuttalli and Z. l. pugetensis have significantly different songs. Our results are consistent with the interpretation that these differences are caused by the changes in characteristics of syllables in the White-crowned Sparrow repertoire. This supports the hypothesis that the evolution of vocalization behavior is affected by the environment, in addition to population structure.

  2. Abstract

    Intraspecific genetic diversity is a key aspect of biodiversity. Quaternary climatic change and glaciation influenced intraspecific genetic diversity by promoting range shifts and population size change. However, the extent to which glaciation affected genetic diversity on a global scale is not well established. Here we quantify nucleotide diversity, a common metric of intraspecific genetic diversity, in more than 38,000 plant and animal species using georeferenced DNA sequences from millions of samples. Results demonstrate that tropical species contain significantly more intraspecific genetic diversity than nontropical species. To explore potential evolutionary processes that may have contributed to this pattern, we calculated summary statistics that measure population demographic change and detected significant correlations between these statistics and latitude. We find that nontropical species are more likely to deviate from neutral expectations, indicating that they have historically experienced dramatic fluctuations in population size likely associated with Pleistocene glacial cycles. By analyzing the most comprehensive data set to date, our results imply that Quaternary climate perturbations may be more important as a process driving the latitudinal gradient in species richness than previously appreciated.
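
    Nucleotide diversity, the metric named in the abstract above, is commonly defined as the average proportion of sites that differ between all pairs of aligned sequences in a sample. The following is a minimal sketch of that standard definition, not the authors' pipeline: the function name `nucleotide_diversity`, the handling of gaps and ambiguous bases, and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise proportion of differing sites among aligned sequences.

    Hypothetical helper for illustration. Sites where either sequence has a
    gap or an ambiguous base are skipped for that pair (an assumption; other
    conventions exist).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    total = 0.0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            # Only count sites where both bases are unambiguous.
            if x in "ACGT" and y in "ACGT":
                compared += 1
                if x != y:
                    diffs += 1
        if compared:
            total += diffs / compared
            n_pairs += 1

    if n_pairs == 0:
        raise ValueError("no comparable sites between any pair of sequences")
    return total / n_pairs


# Toy alignment: three sequences, one of which differs at a single site.
example = ["ACGT", "ACGA", "ACGT"]
pi = nucleotide_diversity(example)
```

    For the toy alignment, two of the three pairs differ at one of four sites and the third pair is identical, so the value is (0.25 + 0 + 0.25) / 3. A study at the scale described would also need to group sequences by species and locus before applying a calculation like this.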

  3. Staples, Anne Elizabeth (Ed.)
    Vocalizations in animals, particularly birds, are critically important behaviors that influence their reproductive fitness. While recordings of bioacoustic data have been captured and stored in collections for decades, the automated extraction of data from these recordings has only recently been facilitated by artificial intelligence methods. These have yet to be evaluated with respect to accuracy of different automation strategies and features. Here, we use a recently published machine learning framework to extract syllables from ten bird species ranging in their phylogenetic relatedness from 1 to 85 million years, to compare how phylogenetic relatedness influences accuracy. We also evaluate the utility of applying trained models to novel species. Our results indicate that model performance is best on conspecifics, with accuracy progressively decreasing as phylogenetic distance increases between taxa. However, we also find that the application of models trained on multiple distantly related species can improve the overall accuracy to levels near that of training and analyzing a model on the same species. When planning big-data bioacoustics studies, care must be taken in sample design to maximize sample size and minimize human labor without sacrificing accuracy. 
  4. Ruane, Sara (Ed.)
    Abstract Comparisons of intraspecific genetic diversity across species can reveal the roles of geography, ecology, and life history in shaping biodiversity. The wide availability of mitochondrial DNA (mtDNA) sequences in open-access databases makes this marker practical for conducting analyses across several species in a common framework, but patterns may not be representative of overall species diversity. Here, we gather new and existing mtDNA sequences and genome-wide nuclear data (genotyping-by-sequencing; GBS) for 30 North American squamate species sampled in the Southeastern and Southwestern United States. We estimated mtDNA nucleotide diversity for 2 mtDNA genes, COI (22 species alignments; average 16 sequences) and cytb (22 species; average 58 sequences), as well as nuclear heterozygosity and nucleotide diversity from GBS data for 118 individuals (30 species; 4 individuals and 6,820 to 44,309 loci per species). We showed that nuclear genomic diversity estimates were highly consistent across individuals for some species, while other species showed large differences depending on the locality sampled. Range size was positively correlated with both cytb diversity (phylogenetically independent contrasts: R² = 0.31, P = 0.007) and GBS diversity (R² = 0.21, P = 0.006), while other predictors differed across the top models for each dataset. Mitochondrial and nuclear diversity estimates were not correlated within species, although sampling differences in the data available made these datasets difficult to compare. Further study of mtDNA and nuclear diversity sampled across species’ ranges is needed to evaluate the roles of geography and life history in structuring diversity across a variety of taxonomic groups.
  5. Research in the biological sciences is hampered by the Linnean shortfall, which describes the number of hidden species that are suspected of existing without formal species description. Using machine learning and species delimitation methods, we built a predictive model that incorporates some 5.0 × 10⁵ data points for 117 species traits, 3.3 × 10⁶ occurrence records, and 9.1 × 10⁵ gene sequences from 4,310 recognized species of mammals. Delimitation results suggest that there are hundreds of undescribed species in class Mammalia. Predictive modeling indicates that most of these hidden species will be found in small-bodied taxa with large ranges characterized by high variability in temperature and precipitation. As demonstrated by a quantitative analysis of the literature, such taxa have long been the focus of taxonomic research. This analysis supports taxonomic hypotheses regarding where undescribed diversity is likely to be found and highlights the need for investment in taxonomic research to overcome the Linnean shortfall.
  6. Hugall et al. (2002) is one of the seminal publications from the single-locus era of phylogeographic research. These authors were among the first to argue that genetic data are ideally suited to test hypotheses that are ultimately derived from other sources of information. While the testing of predictions from the fossil record has long been important to molecular systematics (e.g., Donoghue et al., 1989), phylogeographic investigations into the more recent evolutionary past lack a fossil record in most focal taxa. In lieu of fossils, which were not available for the small snails that served as the focal taxon, Hugall et al. (2002) applied the (then) new technique of environmental modelling to identify regions within the species range with habitat that was predicted to be stable throughout the Holocene. They then presented data suggesting that these regions correspond to the areas with high genetic diversity. Apart from the inferences about snail evolutionary history, the core argument of Hugall et al. (2002) is that consilience (i.e., agreement between inferences drawn from different sources of data) is an important goal for phylogeographic investigation. Consilience in the inferences drawn from independent types of data has a multiplicative effect; when it is present, the researcher is likely to have more confidence in their inference than would be possible for an inference drawn from any one source of data. The manuscript by Jaynes et al. (2022) is a splendid illustration of this principle.

  7. Phylogenetic estimation under the multispecies coalescent model (MSCM) assumes all incongruence among loci is caused by incomplete lineage sorting. Therefore, applying the MSCM to datasets that contain incongruence caused by other processes, such as gene flow, can lead to biased phylogeny estimates. To identify possible bias when using the MSCM, we present P2C2M.SNAPP, an R package that identifies model violations using posterior predictive simulation. P2C2M.SNAPP uses the posterior distribution of species trees output by the software package SNAPP to simulate posterior predictive datasets under the MSCM, and then uses summary statistics to compare either the empirical data or the posterior distribution to the posterior predictive distribution to identify model violations. In simulation testing, P2C2M.SNAPP correctly classified up to 83% of datasets (depending on the summary statistic used) as to whether or not they violated the MSCM. P2C2M.SNAPP represents a user-friendly way for researchers to perform posterior predictive model checks when using the popular SNAPP phylogenetic estimation program. It is freely available as an R package, along with additional program details and tutorials.
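
    The general logic of a posterior predictive check, as described in the abstract above, can be sketched as comparing a summary statistic from the empirical data against the same statistic computed on datasets simulated under the model. The code below is a generic illustration in Python, not the P2C2M.SNAPP implementation or its R interface: the function names, the two-tailed tail-probability rule, and the 0.05 threshold are all assumptions made for this sketch.

```python
def posterior_predictive_pvalue(observed, simulated):
    """Two-tailed tail probability of an observed summary statistic.

    `observed` is the statistic from the empirical data; `simulated` holds
    the same statistic from posterior predictive datasets. Hypothetical
    helper, not part of any published package.
    """
    n = len(simulated)
    if n == 0:
        raise ValueError("need at least one simulated statistic")
    # Fraction of simulated values at least as extreme in each direction.
    upper = sum(s >= observed for s in simulated) / n
    lower = sum(s <= observed for s in simulated) / n
    return min(1.0, 2 * min(upper, lower))


def violates_model(observed, simulated, alpha=0.05):
    """Flag a poor model fit when the observed statistic is in the tails."""
    return posterior_predictive_pvalue(observed, simulated) < alpha


# Toy numbers standing in for a summary statistic from 100 simulations.
simulated_stats = list(range(100))
typical = violates_model(50, simulated_stats)    # statistic near the middle
extreme = violates_model(1000, simulated_stats)  # statistic far outside
```

    In this toy case the statistic of 50 sits in the middle of the simulated values and is not flagged, while 1000 lies outside every simulated value and is flagged. Which summary statistic is used matters, as the abstract notes that classification accuracy depended on that choice.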
  8. Abstract

    Aim

    Species adapt differently to contrasting environments, such as open habitats with sparse vegetation and forested habitats with dense forest cover. We investigated colonization patterns in the open and forested environments in the diagonal of open formations and surrounding rain forests (i.e. Amazonia and Atlantic Forest) in Brazil, tested whether the diversification rates were affected by the environmental conditions and identified traits that enabled species to persist in those environments.

    Location

    South America, Brazil.

    Taxon

    Squamata, Lizards.

    Methods

    We used phylogenetic information and the current distribution of species in open and forested habitats to estimate ancestral ranges and identify range shifts relative to the current habitats. To evaluate whether these environments influenced species diversification, we tested 12 models using a Hidden Geographic State Speciation and Extinction analysis. Finally, we combined phylogenetic relatedness and species traits in a machine learning framework to identify the traits permitting adaptation in those contrasting environments.

    Results

    We identified 41 total transitions between open and forested habitats, of which 80% were from the forested habitats to the open habitats. Widely distributed species had higher speciation, turnover, extinction, and extinction fraction rates than species in forested or open habitats, but also had a lower net diversification rate. Mean body temperature, microhabitat, female snout–vent length and diet were identified as putative traits that enabled adaptation to different environments, and phylogenetic relatedness was an important predictor of species occurrence.

    Main conclusions

    Transitions from forested to open habitats are most common, highlighting the importance of habitat shift in current patterns of biodiversity. The combination of phylogenetic reconstruction of ancestral distributions and the machine learning framework enables us to integrate organismal trait data, environmental data and evolutionary history in a manner that could be applied on a global scale.

  9. Abstract

    Pleistocene glacial cycles drastically changed the distributions of taxa endemic to temperate rainforests in the Pacific Northwest, with many experiencing reduced habitat suitability during glacial periods. In this study, we investigate whether glacial cycles promoted intraspecific divergence and whether subsequent range changes led to secondary contact and gene flow. For seven invertebrate species endemic to the PNW, we estimated species distribution models (SDMs) and projected them onto current and historical climate conditions to assess how habitat suitability changed during glacial cycles. Using single nucleotide polymorphism (SNP) data from these species, we assessed population genetic structure and used a machine‐learning approach to compare models with and without gene flow between populations upon secondary contact after the last glacial maximum (LGM). Finally, we estimated divergence times and rates of gene flow between populations. SDMs suggest that there was less suitable habitat in the North Cascades and Northern Rocky Mountains during glacial compared to interglacial periods, resulting in reduced habitat suitability and increased habitat fragmentation during the LGM. Our genomic data identify population structure in all taxa, and support gene flow upon secondary contact in five of the seven taxa. Parameter estimates suggest that population divergences date to the later Pleistocene for most populations. Our results support a role of refugial dynamics in driving intraspecific divergence in the Cascades Range. In these invertebrates, population structure often does not correspond to current biogeographic or environmental barriers. Rather, population structure may reflect refugial lineages that have since expanded their ranges, often leading to secondary contact between once isolated lineages.
