Abstract Integrated community models—an emerging framework in which multiple data sources for multiple species are analyzed simultaneously—offer opportunities to expand inferences beyond the single‐species and single‐data‐source approaches common in ecology. We developed a novel integrated community model that combines distance sampling and single‐visit count data; within the model, information is shared among data sources (via a joint likelihood) and species (via a random‐effects structure) to estimate abundance patterns across a community. Parameters relating to abundance are shared between data sources, and the model can specify either shared or separate observation processes for each data source. Simulations demonstrated that the model provided unbiased estimates of abundance and detection parameters even when detection probabilities varied between the data types. The integrated community model also provided more accurate and more precise parameter estimates than alternative single‐species and single‐data‐source models in many instances. We applied the model to a community of 11 herbivore species in the Masai Mara National Reserve, Kenya, and found considerable interspecific variation in response to local wildlife management practices: Five species showed higher abundances in a region with passive conservation enforcement (median across species: 4.5× higher), three species showed higher abundances in a region with active conservation enforcement (median: 3.9× higher), and the remaining three species showed no abundance differences between the two regions. Furthermore, the community average of abundance was slightly higher in the region with active conservation enforcement but not definitively so (posterior mean: higher by 0.20 animals; 95% credible interval: 1.43 fewer animals, 1.86 more animals). 
Our integrated community modeling framework has the potential to expand the scope of inference over space, time, and levels of biological organization, but practitioners should carefully evaluate whether model assumptions are met in their systems and whether data integration is valuable for their applications.
Integrating distance sampling and presence‐only data to estimate species abundance
Integrated models combine multiple data types within a unified analysis to estimate species abundance and covariate effects. By sharing biological parameters, integrated models improve the accuracy and precision of estimates compared to separate analyses of individual data sets. We developed an integrated point process model to combine presence-only and distance sampling data for estimation of spatially explicit abundance patterns. Simulations across a range of parameter values demonstrate that our model can recover estimates of biological covariates, but parameter accuracy and precision varied with the quantity of each data type. We applied our model to a case study of black-backed jackals in the Masai Mara National Reserve, Kenya, to examine effects of spatially varying covariates on jackal abundance patterns. The model revealed that jackals were positively affected by anthropogenic disturbance on the landscape, with highest abundance estimated along the Reserve border near human activity. We found minimal effects of landscape cover, lion density, and distance to water source, suggesting that human use of the Reserve may be the biggest driver of jackal abundance patterns. Our integrated model expands the scope of ecological inference by taking advantage of widely available presence-only data, while simultaneously leveraging richer, but typically limited, distance sampling data.
- PAR ID: 10206413
- Date Published:
- Journal Name: Ecology
- ISSN: 0012-9658
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
-
Abstract Monitoring wildlife abundance across space and time is an essential task to study their population dynamics and inform effective management. Acoustic recording units are a promising technology for efficiently monitoring bird populations and communities. While current acoustic data models provide information on the presence/absence of individual species, new approaches are needed to monitor population abundance, ideally across large spatio‐temporal regions. We present an integrated modelling framework that combines high‐quality but temporally sparse bird point count survey data with acoustic recordings. Our models account for imperfect detection in both data types and false positive errors in the acoustic data. Using simulations, we compare the accuracy and precision of abundance estimates using differing amounts of acoustic vocalizations obtained from a clustering algorithm, point count data, and a subset of manually validated acoustic vocalizations. We also use our modelling framework in a case study to estimate abundance of the Eastern Wood‐Pewee (Contopus virens) in Vermont, USA. The simulation study reveals that combining acoustic and point count data via an integrated model improves accuracy and precision of abundance estimates compared with models informed by either acoustic or point count data alone. Improved estimates are obtained across a wide range of scenarios, with the largest gains occurring when detection probability for the point count data is low. Combining acoustic data with only a small number of point count surveys yields estimates of abundance without the need for validating any of the identified vocalizations from the acoustic data. Within our case study, the integrated models provided moderate support for a decline of the Eastern Wood‐Pewee in this region. Our integrated modelling approach combines dense acoustic data with few point count surveys to deliver reliable estimates of species abundance without the need for manual identification of acoustic vocalizations or a prohibitively expensive large number of repeated point count surveys. Our proposed approach offers an efficient monitoring alternative for large spatio‐temporal regions when point count data are difficult to obtain or when monitoring is focused on rare species with low detection probability.
-
Abstract Environmental and anthropogenic factors affect the population dynamics of migratory species throughout their annual cycles. However, identifying the spatiotemporal drivers of migratory species' abundances is difficult because of extensive gaps in monitoring data. The collection of unstructured opportunistic data by volunteer (citizen science) networks provides a solution to address data gaps for locations and time periods during which structured, design‐based data are difficult or impossible to collect. To estimate population abundance and distribution at broad spatiotemporal extents, we developed an integrated model that incorporates unstructured data during time periods and spatial locations when structured data are unavailable. We validated our approach through simulations and then applied the framework to the eastern North American migratory population of monarch butterflies during their spring breeding period in eastern Texas. Spring climate conditions have been identified as a key driver of monarch population sizes during subsequent summer and winter periods. However, low monarch densities during the spring combined with very few design‐based surveys in the region have limited the ability to isolate effects of spring weather variables on monarchs. Simulation results confirmed the ability of our integrated model to accurately and precisely estimate abundance indices and the effects of covariates during locations and time periods in which structured sampling is lacking. In our case study, we combined opportunistic monarch observations during the spring migration and breeding period with structured data from the summer Midwestern breeding grounds. Our model revealed a nonstationary relationship between weather conditions and local monarch abundance during the spring, driven by spatially varying vegetation and temperature conditions. Data for widespread and migratory species are often fragmented across multiple monitoring programs, potentially requiring the use of both structured and unstructured data sources to obtain complete geographic coverage. Our integrated model can estimate population abundance at broad spatiotemporal extents despite structured data gaps during the annual cycle by leveraging opportunistic data.
-
Rangel, Thiago F (Ed.) Species distribution models (SDMs) are frequently data-limited. In aquatic habitats, emerging environmental DNA (eDNA) sampling methods can be quicker and more cost-efficient than traditional count and capture surveys, but their utility for fitting SDMs is complicated by dilution, transport, and loss processes that modulate DNA concentrations and mix eDNA from different locations. Past models for estimating organism densities from measured species-specific eDNA concentrations have accounted for how these processes affect expected concentrations. We built off this previous work to construct a linear hierarchical model that also accounts for how they give rise to spatially correlated concentration errors. We applied our model to 60 simulated stream networks and three types of species niches in order to answer two questions: 1) what is the D-optimal sampling design, i.e. where should eDNA samples be positioned to most precisely estimate species–environment relationships? and 2) how does parameter estimation accuracy depend on the stream network’s topological and hydrologic properties? We found that correcting for eDNA dynamics was necessary to obtain consistent parameter estimates, and that relative to a heuristic benchmark design, optimizing sampling locations improved design efficiency by an average of 41.5%. Samples in the D-optimal design tended to be positioned near downstream ends of stream reaches high in the watershed, where eDNA concentration was high and mostly from homogeneous source areas, and they collectively spanned the full ranges of covariates. When measurement error was large, it was often optimal to collect replicate samples from high-information reaches. eDNA-based estimates of species–environment regression parameters were most precise in stream networks that had many reaches, large geographic size, slow flows, and/or high eDNA loss rates. Our study demonstrates the importance and viability of accounting for eDNA dilution, transport, and loss in order to optimize sampling designs and improve the accuracy of eDNA-based species distribution models.
-
Abstract Spatially resolved transcriptomics technologies have opened new avenues for understanding gene expression heterogeneity in spatial contexts. However, existing methods for identifying spatially variable genes often focus solely on statistical significance, limiting their ability to capture continuous expression patterns and integrate spot-level covariates. To address these challenges, we introduce spVC, a statistical method based on a generalized Poisson model. spVC seamlessly integrates constant and spatially varying effects of covariates, facilitating comprehensive exploration of gene expression variability and enhancing interpretability. Simulation and real data applications confirm spVC’s accuracy in these tasks, highlighting its versatility in spatial transcriptomics analysis.