Abstract Natural history collections (NHC) provide a wealth of information that can be used to understand the impacts of global change on biodiversity. As such, there is growing interest in using NHC data to estimate changes in species' distributions and abundance trends over historic time horizons when contemporary survey data are limited or unavailable. However, museum specimens were not collected with the purpose of estimating population trends and thus can exhibit spatiotemporal and collector‐specific biases that can impose severe limitations to using NHC data for evaluating population trajectories. Here we review the challenges associated with using museum records to track long‐term insect population trends, including spatiotemporal biases in sampling effort and sparse temporal coverage within and across years. We highlight recent methodological advancements that aim to overcome these challenges and discuss emerging research opportunities. Specifically, we examine the potential of integrating museum records and other contemporary data sources (e.g. collected via structured, designed surveys and opportunistic citizen science programs) in a unified analytical framework that accounts for the sampling biases associated with each data source. The emerging field of integrated modelling provides a promising framework for leveraging the wealth of collections data to accurately estimate long‐term trends of insect populations and identify cases where that is not possible using existing data sources.
A Double machine learning trend model for citizen science data
Abstract Citizen and community science datasets are typically collected using flexible protocols. These protocols enable large volumes of data to be collected globally every year; however, the consequence is that these protocols typically lack the structure necessary to maintain consistent sampling across years. This can result in complex and pronounced interannual changes in the observation process, which can complicate the estimation of population trends because population changes over time are confounded with changes in the observation process. Here we describe a novel modelling approach designed to estimate spatially explicit species population trends while controlling for the interannual confounding common in citizen science data. The approach is based on Double machine learning, a statistical framework that uses machine learning (ML) methods to estimate population change and the propensity scores used to adjust for confounding discovered in the data. ML makes it possible to use large sets of features to control for confounding and to model spatial heterogeneity in trends. Additionally, we present a simulation method to identify and adjust for residual confounding missed by the propensity scores. To illustrate the approach, we estimated species trends using data from the citizen science project eBird. We used a simulation study to assess the ability of the method to estimate spatially varying trends when faced with realistic confounding and temporal correlation. Results demonstrated the ability to distinguish between spatially constant and spatially varying trends. There were low error rates on the estimated direction of population change (increasing/decreasing) at each location and high correlations on the estimated magnitude of population change. The ability to estimate spatially explicit trends while accounting for confounding inherent in citizen science data has the potential to fill important information gaps, helping to estimate population trends for species and/or regions lacking rigorous monitoring data.
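The confounding problem described in the abstract can be illustrated with a minimal sketch of the general cross-fitted "partialling-out" form of double machine learning. This is not the authors' implementation: the ridge regression on quadratic features is a simple stand-in for the ML learners, the simulated variables (`year`, `effort`, `log_count`) and the function names are hypothetical, and the sketch estimates a single spatially constant trend rather than spatially varying ones.

```python
import numpy as np

def fit_predict(X_train, y_train, X_test, ridge=1e-3):
    """Stand-in nuisance learner (assumption): ridge regression on quadratic features."""
    def feats(X):
        return np.column_stack([np.ones(len(X)), X, X ** 2])
    A, B = feats(X_train), feats(X_test)
    coef = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ y_train)
    return B @ coef

def dml_trend(X, t, y, n_folds=2, seed=0):
    """Cross-fitted partialling-out estimate of the effect of t (year) on y."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(y)) % n_folds  # random, balanced fold labels
    res_t = np.empty(len(y))
    res_y = np.empty(len(y))
    for k in range(n_folds):
        test, train = folds == k, folds != k
        # Residualise year and outcome on the confounders, out of fold.
        res_t[test] = t[test] - fit_predict(X[train], t[train], X[test])
        res_y[test] = y[test] - fit_predict(X[train], y[train], X[test])
    # Regress outcome residuals on year residuals.
    return float(res_t @ res_y / (res_t @ res_t))

# Hypothetical simulation: observer effort drifts upward over the years,
# masking a true decline of 0.05 log-units per year.
rng = np.random.default_rng(1)
n = 4000
year = rng.uniform(0, 10, n)
effort = 0.3 * year + rng.normal(0, 1, n)
log_count = -0.05 * year + 0.8 * effort + rng.normal(0, 0.5, n)

naive = float(np.polyfit(year, log_count, 1)[0])          # ignores effort
adjusted = dml_trend(effort.reshape(-1, 1), year, log_count)
print(f"naive slope: {naive:.3f}, adjusted slope: {adjusted:.3f}")
```

In this toy setting the naive slope is positive because rising effort inflates counts, while the adjusted estimate should land close to the simulated decline. Cross-fitting (predicting each fold's nuisance values from models fitted on the other fold) is what keeps flexible learners from overfitting their own residuals.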
- PAR ID: 10434078
- Publisher / Repository: Wiley-Blackwell
- Date Published:
- Journal Name: Methods in Ecology and Evolution
- Volume: 14
- Issue: 9
- ISSN: 2041-210X
- Format(s): Medium: X Size: p. 2435-2448
- Size(s): p. 2435-2448
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Environmental and anthropogenic factors affect the population dynamics of migratory species throughout their annual cycles. However, identifying the spatiotemporal drivers of migratory species' abundances is difficult because of extensive gaps in monitoring data. The collection of unstructured opportunistic data by volunteer (citizen science) networks provides a solution to address data gaps for locations and time periods during which structured, design‐based data are difficult or impossible to collect. To estimate population abundance and distribution at broad spatiotemporal extents, we developed an integrated model that incorporates unstructured data during time periods and spatial locations when structured data are unavailable. We validated our approach through simulations and then applied the framework to the eastern North American migratory population of monarch butterflies during their spring breeding period in eastern Texas. Spring climate conditions have been identified as a key driver of monarch population sizes during subsequent summer and winter periods. However, low monarch densities during the spring combined with very few design‐based surveys in the region have limited the ability to isolate effects of spring weather variables on monarchs. Simulation results confirmed the ability of our integrated model to accurately and precisely estimate abundance indices and the effects of covariates during locations and time periods in which structured sampling is lacking. In our case study, we combined opportunistic monarch observations during the spring migration and breeding period with structured data from the summer Midwestern breeding grounds. Our model revealed a nonstationary relationship between weather conditions and local monarch abundance during the spring, driven by spatially varying vegetation and temperature conditions. Data for widespread and migratory species are often fragmented across multiple monitoring programs, potentially requiring the use of both structured and unstructured data sources to obtain complete geographic coverage. Our integrated model can estimate population abundance at broad spatiotemporal extents despite structured data gaps during the annual cycle by leveraging opportunistic data.
-
Abstract Data deficiencies among rare or cryptic species preclude assessment of community‐level processes using many existing approaches, limiting our understanding of the trends and stressors for large numbers of species. Yet evaluating the dynamics of whole communities, not just common or charismatic species, is critical to understanding the responses of biodiversity to ongoing environmental pressures. A recent surge in both public science and government‐funded data collection efforts has led to a wealth of biodiversity data. However, these data collection programmes use a wide range of sampling protocols (from unstructured, opportunistic observations of wildlife to well‐structured, design‐based programmes) and record information at a variety of spatiotemporal scales. As a result, available biodiversity data vary substantially in quantity and information content, which must be carefully reconciled for meaningful ecological analysis. Hierarchical modelling, including single‐species integrated models and hierarchical community models, has improved our ability to assess and predict biodiversity trends and processes. Here, we highlight the emerging ‘integrated community modelling’ framework that combines both data integration and community modelling to improve inferences on species‐ and community‐level dynamics. We illustrate the framework with a series of worked examples. Our three case studies demonstrate how integrated community models can be used to extend the geographic scope when evaluating species distributions and community‐level richness patterns; discern population and community trends over time; and estimate demographic rates and population growth for communities of sympatric species. We implemented these worked examples using multiple software methods through the R platform via packages with formula‐based interfaces and through development of custom code in JAGS, NIMBLE and Stan. Integrated community models provide an exciting approach to model biological and observational processes for multiple species using multiple data types and sources simultaneously, thus accounting for uncertainty and sampling error within a unified framework. By leveraging the combined benefits of both data integration and community modelling, integrated community models can produce valuable information about both common and rare species as well as community‐level dynamics, allowing for holistic evaluation of the effects of global change on biodiversity.
-
Abstract Population size is a key metric for management and policy decisions, yet wildlife monitoring programmes are often limited by the spatial and temporal scope of surveys. In these cases, citizen science data may provide complementary information at higher resolution and greater extent. We present a case study demonstrating how data from the eBird citizen science programme can be combined with regional monitoring efforts by the US Fish and Wildlife Service to produce high‐resolution estimates of golden eagle abundance. We developed a model that uses aerial survey data from the western United States to calibrate high‐resolution annual estimates of relative abundance from eBird. Using this model, we compared regional population size estimates based on the calibrated eBird information with those based on aerial survey data alone. Population size estimates based on the calibrated eBird information had strong correspondence to estimates from aerial survey data in two out of four regions, and population trajectories based on the two approaches showed high correlations. We demonstrate how the combination of citizen science data and targeted surveys can be used to (a) increase the spatial resolution of population size estimates, (b) extend the spatial extent of inference and (c) predict population size beyond the temporal period of surveys. Findings based on this case study can be used to refine policy metrics used by the US Fish and Wildlife Service and inform permitting regulations (e.g. mortality/harm associated with wind energy development). Policy implications: Our results demonstrate the ability of citizen science data to complement targeted monitoring programmes and improve the efficacy of decision frameworks that require information on population size or trajectory. After validating citizen science data against survey‐based benchmarks, agencies can harness strengths of citizen science data to supplement information needs and increase the resolution and extent of population size predictions.
-
Abstract The timing of avian migration has evolved to exploit critical seasonal resources, yet plasticity within phenological responses may allow adjustments to interannual resource phenology. The diversity of migratory species and changes in underlying resources in response to climate change make it challenging to generalize these relationships. We use bird banding records during spring and fall migration from across North America to examine macroscale phenological responses to interannual fluctuations in temperature and long‐term annual trends in phenology. In total, we examine 19 species of North American wood warblers (family Parulidae), summarizing migration timing from 2,826,588 banded birds from 1961 to 2018 across 46 sites during spring and 124 sites during fall. During spring, warmer spring temperatures at banding locations translated to earlier median passage dates for 16 of 19 species, with an average 0.65‐day advancement in median passage for every 1°C increase in temperature, ranging from 0.25 to 1.26 days per °C. During the fall, relationships were considerably weaker, with only 3 of 19 species showing a relationship with temperature. In those three cases, later departure dates were associated with warmer fall periods. Projecting these trends forward under climate scenarios of temperature change, we forecast continued spring advancements under shared socioeconomic pathways from 2041 to 2060 and 2081 to 2100 and more muted and variable shifts for fall. These results demonstrate the capacity of long‐distance migrants to respond to interannual fluctuations in temperatures, at least during the spring, and showcase the potential of North American bird banding data for understanding phenological trends across a wide diversity of avian species.
