Abstract

Hierarchical probability models are being used more often than non-hierarchical deterministic process models in environmental prediction and forecasting, and Bayesian approaches to fitting such models are becoming increasingly popular. In particular, models describing ecosystem dynamics with multiple states that are autoregressive at each step in time can be treated as statistical state space models (SSMs). In this paper, we examine this subset of ecosystem models, embed a process-based ecosystem model into an SSM, and give closed-form Gibbs sampling updates for latent states and process precision parameters when process and observation errors are normally distributed. Using simulated data from an example model (DALECev), we study the effects of changing the temporal resolution of observations on the states (observation data gaps), the temporal resolution of the state process (model time step), and the level of aggregation of observations on fluxes (measurements of transfer rates on the state process). We show that parameter estimates become unreliable as temporal gaps between observed state data increase. To improve parameter estimates, we introduce a method of tuning the time resolution of the latent states while still using higher-frequency driver information, and show that this helps to improve estimates. Further, we show that data cloning is a suitable method for assessing parameter identifiability in this class of models. Overall, our study helps inform the application of state space models to ecological forecasting applications where (1) data are not available for all states and transfers at the operational time step of the ecosystem model and (2) process uncertainty estimation is desired.
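For a linear-Gaussian special case, the closed-form Gibbs updates described above can be sketched as follows. This is a toy random-walk SSM, a hypothetical stand-in rather than DALECev itself; the dimensions, priors, and observation precision are all illustrative assumptions. Each latent state is drawn from its normal full conditional, and the process precision from its conjugate Gamma full conditional.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy random-walk SSM (hypothetical stand-in, NOT DALECev):
#   x_t = x_{t-1} + N(0, 1/tau_proc),  y_t = x_t + N(0, 1/tau_obs)
T, tau_proc_true, tau_obs = 100, 4.0, 2.0
x_true = np.cumsum(rng.normal(0.0, tau_proc_true ** -0.5, T))
y = x_true + rng.normal(0.0, tau_obs ** -0.5, T)

a0, b0 = 0.1, 0.1       # Gamma(a0, b0) prior on the process precision
x, tau = y.copy(), 1.0  # initialize latent states at the observations

for it in range(500):
    # Closed-form normal full conditional for each latent state
    for t in range(T):
        prec = tau_obs + tau * ((t > 0) + (t < T - 1))
        mean = tau_obs * y[t]
        if t > 0:
            mean += tau * x[t - 1]
        if t < T - 1:
            mean += tau * x[t + 1]
        x[t] = rng.normal(mean / prec, prec ** -0.5)
    # Conjugate Gamma full conditional for the process precision
    resid = np.diff(x)
    tau = rng.gamma(a0 + (T - 1) / 2, 1.0 / (b0 + 0.5 * resid @ resid))

print(round(tau, 2))  # one posterior draw of the process precision
```

Thinning the latent-state time step relative to the driver resolution, as proposed in the paper, would change only the indexing of the full conditionals, not their conjugate form.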
Learning from monitoring networks: Few-large vs. many-small plots and multi-scale analysis
In order to learn about broad-scale ecological patterns, data from large-scale surveys must allow us either to estimate correlations between the environment and an outcome, to accurately predict ecological patterns, or both. An important part of data collection is the sampling effort used to collect observations, which we decompose into two quantities: the number of observations or plots (n) and the per-observation/plot effort (E; e.g., area per plot). If we want to understand the relationships between predictors and a response variable, then lower model parameter uncertainty is desirable. If the goal is to predict a response variable, then lower prediction error is preferable. We aim to learn if and when aggregating data can help attain these goals. We find that a small sample size coupled with large observation effort (few large) can yield better predictions than a large number of observations with low observation effort (many small). We also show that the combination of the two values (n and E), rather than either one alone, affects parameter uncertainty. In an application to Forest Inventory and Analysis (FIA) data, we model the tree density of selected species at various amounts of aggregation using linear regression in order to compare the findings from simulated data to real data. The application supports the theoretical finding that increasing observational effort through aggregation can lead to improved predictions, conditional on thoughtful aggregation of the observational plots. In particular, aggregation over an extremely large and variable covariate space may lead to poor prediction and high parameter uncertainty. Analyses of large-range data can improve with aggregation, with implications for both model evaluation and sampling design: testing model prediction accuracy without underlying knowledge of the datasets and the scale at which predictor variables operate can obscure meaningful results.
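The n-versus-E trade-off can be illustrated with a minimal simulation (all numbers hypothetical): each plot's response averages E noisy sub-observations, so per-plot noise shrinks like σ/√E, and the OLS slope standard error depends on n and E jointly. Under this simple iid-noise assumption the product n·E governs the standard error, echoing the point above that neither quantity alone determines parameter uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)

def slope_and_se(n, E, beta=2.0, sigma=1.0):
    """OLS slope and its standard error when each of n plots averages
    E noisy sub-observations (so per-plot noise is sigma / sqrt(E))."""
    x = rng.uniform(0.0, 1.0, n)
    y = beta * x + rng.normal(0.0, sigma / np.sqrt(E), n)
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    s2 = resid @ resid / (n - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1], se

many_small = slope_and_se(n=400, E=1)  # many plots, little effort each
few_large = slope_and_se(n=40, E=10)   # few plots, much effort each
print(many_small, few_large)
```

With plot-level (non-averaging) error added, the symmetry between the two designs breaks, which is where the few-large versus many-small distinction in the abstract becomes consequential.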
- PAR ID: 10433082
- Date Published:
- Journal Name: Frontiers in Ecology and Evolution
- Volume: 11
- ISSN: 2296-701X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Reliable statistical inference is central to forest ecology and management, much of which seeks to estimate population parameters for forest attributes and ecological indicators of biodiversity, functions, and services in forest ecosystems. Many populations in nature, such as plants or animals, tend to aggregate spatially, which poses a major challenge to sampling; biased or imprecise inference misleads analysis, and hence conclusions and policymaking. Systematic adaptive cluster sampling (SACS) is design-unbiased and particularly efficient for inventorying spatially clustered populations. However, (1) oversampling is common for nonrare variables, making SACS a difficult choice for inventorying common forest attributes or ecological indicators; (2) a SACS sample is not completely specified until the field campaign is completed, making advance budgeting and logistics difficult; (3) even for rare variables, uncertainty regarding the final sample persists; and (4) a SACS sample may be variable-specific, as its formation can be adapted to a particular attribute or indicator, risking imbalance or non-representativeness for other jointly observed variables. To address these challenges, we develop a generalized SACS (GSACS) with respect to both design and estimators, and illustrate its connections with systematic sampling (SS), which is widely employed by national forest inventories and ecological observation networks around the world. In addition to theoretical derivations, empirical sampling distributions were validated and compared for GSACS and SS using sampling simulations on a comprehensive set of forest populations exhibiting different spatial patterns.
Five conclusions are relevant: (1) in contrast to SACS, GSACS explicitly supports inventorying forest attributes and ecological indicators that are nonrare, and it solves the SACS problems of oversampling, uncertain sample form, and sample imbalance for alternative attributes or indicators; (2) we demonstrate that SS is a special case of GSACS; (3) even with fewer sample plots, GSACS gives estimates identical to SS; (4) GSACS outperforms SS for inventorying clustered populations and for making domain-specific estimates; and (5) the precision of design-based inference is negatively correlated with the prevalence of a spatial pattern, the range of spatial autocorrelation, and the sample plot size, in descending order.
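As a small illustration of the systematic-sampling special case mentioned in conclusion (2), the sketch below compares the design-based mean estimator from a systematic grid sample against simple random sampling on a spatially clustered population. The population, grid size, and patch parameters are entirely hypothetical; this is not the GSACS estimator itself, only the familiar SS design it generalizes.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical clustered population on a 100 x 100 grid of cells
rr, cc = np.meshgrid(np.arange(100), np.arange(100), indexing="ij")
pop = np.zeros((100, 100))
for _ in range(12):
    r, c = rng.integers(10, 90, 2)
    pop += 20.0 * np.exp(-((rr - r) ** 2 + (cc - c) ** 2) / 18.0)
true_mean = pop.mean()

def systematic_mean(step):
    """Systematic sample: random start, then every step-th cell."""
    r0, c0 = rng.integers(0, step, 2)
    return pop[r0::step, c0::step].mean()

def srs_mean(n):
    """Simple random sample of n cells without replacement."""
    idx = rng.choice(pop.size, n, replace=False)
    return pop.ravel()[idx].mean()

step = 10
n = (100 // step) ** 2  # match the systematic sample size (100 cells)
sys_est = np.array([systematic_mean(step) for _ in range(500)])
srs_est = np.array([srs_mean(n) for _ in range(500)])
print(sys_est.std(), srs_est.std())  # spread of each design's estimator
```

Both designs are unbiased for the population mean; their relative precision depends on how the grid spacing interacts with the spatial pattern, which is the dependence conclusion (5) describes.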
Probabilistic predictions support public health planning and decision making, especially in infectious disease emergencies. Aggregating outputs from multiple models yields more robust predictions of outcomes and associated uncertainty. While the selection of an aggregation method can be guided by retrospective performance evaluations, this is not always possible. For example, if predictions are conditional on assumptions about how the future will unfold (e.g. possible interventions), these assumptions may never materialize, precluding any direct comparison between predictions and observations. Here, we summarize literature on aggregating probabilistic predictions, illustrate various methods for infectious disease predictions via simulation, and present a strategy for choosing an aggregation method when empirical validation cannot be used. We focus on the linear opinion pool (LOP) and Vincent average, common methods that make different assumptions about between-prediction uncertainty. We contend that assumptions of the aggregation method should align with a hypothesis about how uncertainty is expressed within and between predictions from different sources. The LOP assumes that between-prediction uncertainty is meaningful and should be retained, while the Vincent average assumes that between-prediction uncertainty is akin to sampling error and should not be preserved. We provide an R package for implementation. Given the rising importance of multi-model infectious disease hubs, our work provides useful guidance on aggregation and a deeper understanding of the benefits and risks of different approaches.
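The contrast between the two pooling methods can be made concrete with a toy two-model example (all numbers hypothetical, not from the paper's simulations): the LOP is an equal-weight mixture of the component distributions, while the Vincent average pools quantile functions, which for normal components yields a normal with the averaged mean and averaged standard deviation.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two hypothetical model forecasts, each a Normal(mu_i, sigma_i)
mus = np.array([100.0, 140.0])
sigmas = np.array([10.0, 10.0])
n = 100_000

# Linear opinion pool: equal-weight mixture (pick a model, then sample it),
# so between-model spread in the means is retained.
comp = rng.integers(0, 2, n)
lop = rng.normal(mus[comp], sigmas[comp])

# Vincent average: average the quantile functions; for normal components
# this is Normal(mean of mus, mean of sigmas), discarding between-model spread.
vincent = mus.mean() + sigmas.mean() * rng.standard_normal(n)

print(lop.std(), vincent.std())  # LOP spread exceeds the Vincent spread
```

The LOP sample is bimodal with variance inflated by the disagreement between the two means, while the Vincent average collapses that disagreement, exactly the difference in assumptions about between-prediction uncertainty described above.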
Spatial patterns in ecology contain useful information about underlying mechanisms and processes. Although there are many summary statistics used to quantify these spatial patterns, there are far fewer models that directly link explicit ecological mechanisms to observed patterns easily derived from available data. We present a model of intraspecific spatial aggregation that quantitatively relates static spatial patterning to negative density dependence. Individuals are placed according to the colonization rule consistent with the Maximum Entropy Theory of Ecology (METE) and die with probability proportional to their abundance raised to a power α, a parameter indicating the degree of density dependence. This model can therefore be interpreted as a hybridization of MaxEnt and mechanism. Our model shows quantitatively and generally that increasing density dependence randomizes spatial patterning: α = 1 recovers the strongly aggregated METE distribution that is empirically consistent with many ecosystems, and as α → 2 our prediction approaches the binomial distribution consistent with random placement. For 1 < α < 2, our model predicts more aggregation than random placement but less than METE. We additionally relate our mechanistic parameter α to the statistical aggregation parameter k in the negative binomial distribution, giving it an ecological interpretation in the context of density dependence. We use our model to analyze two contrasting datasets: a 50-ha tropical forest and a 64-m² serpentine grassland plot. For each dataset, we infer α for individual species as well as a community-level α. We find that α is generally larger in the tightly packed forest than in the sparse grassland, and that the degree of density dependence increases at smaller scales. These results are consistent with current understanding in both ecosystems, and we infer this underlying density dependence using only empirical spatial patterns.
Our model can easily be applied to other datasets where spatially explicit data are available.
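The direction of the α effect can be reproduced with a toy birth-death simulation. This is a crude stand-in for the model above, not the authors' exact construction: the preferential-colonization birth rule, cell count, abundance, and step count are all illustrative assumptions. Deaths strike a cell with probability proportional to n^α, so larger α penalizes crowded cells more strongly and erodes aggregation, measured here by the variance-to-mean ratio of cell counts.

```python
import numpy as np

rng = np.random.default_rng(5)

def steady_counts(alpha, cells=25, N=250, steps=30_000):
    """Toy birth-death process: a birth lands in a cell with probability
    proportional to (n + 1) (preferential colonization), and a death
    removes an individual from a cell with probability proportional to
    n ** alpha, so larger alpha hits crowded cells harder."""
    n = rng.multinomial(N, np.ones(cells) / cells)
    for _ in range(steps):
        birth = rng.choice(cells, p=(n + 1) / (n + 1).sum())
        w = n.astype(float) ** alpha
        death = rng.choice(cells, p=w / w.sum())  # empty cells have weight 0
        n[birth] += 1
        n[death] -= 1
    return n

vmr = {}
for alpha in (1.0, 2.0):
    counts = steady_counts(alpha)
    vmr[alpha] = counts.var() / counts.mean()
print(vmr)  # variance-to-mean ratio drops as alpha increases
```

A variance-to-mean ratio near 1 indicates random (Poisson-like) placement, while values well above 1 indicate aggregation, mirroring the α = 1 versus α → 2 limits described in the abstract.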
Macrosystems EDDIE Module 5 version 2: Introduction to Ecological Forecasting (Instructor Materials)

Ecological forecasting is a tool that can be used for understanding and predicting changes in populations, communities, and ecosystems. Ecological forecasting is an emerging approach which provides an estimate of the future state of an ecological system with uncertainty, allowing society to prepare for changes in important ecosystem services. Ecological forecasters develop and update forecasts using the iterative forecasting cycle, in which they make a hypothesis of how an ecological system works; embed their hypothesis in a model; and use the model to make a forecast of future conditions. When observations become available, they can assess the accuracy of their forecast, which indicates if their hypothesis is supported or needs to be updated before the next forecast is generated. In this Macrosystems EDDIE (Environmental Data-Driven Inquiry & Exploration) module, students will apply the iterative forecasting cycle to develop an ecological forecast for a National Ecological Observation Network (NEON) site. Students will use NEON data to build an ecological model that predicts primary productivity. Using their calibrated model, they will learn about the different components of a forecast with uncertainty and compare productivity forecasts among NEON sites. The overarching goal of this module is for students to learn fundamental concepts about ecological forecasting and build a forecast for a NEON site. Students will work with an R Shiny interface to visualize data, build a model, generate a forecast with uncertainty, and then compare the forecast with observations. The A-B-C structure of this module makes it flexible and adaptable to a range of student levels and course structures. This EDI data package contains instructional materials necessary to teach the module.
Instructional materials (an instructor manual, an introductory presentation for the module, and a presentation introducing students and instructors to R Shiny) are provided in both PDF and editable formats within a compressed file. The module R Shiny application is available at https://macrosystemseddie.shinyapps.io/module5/. Readers are referred to the module landing page for additional information (https://serc.carleton.edu/eddie/teaching_materials/modules/module5.html) and to the GitHub repo (https://github.com/MacrosystemsEDDIE/module5) and/or Zenodo data package (Moore et al. 2024; DOI: 10.5281/zenodo.10733117) for the R Shiny application code.
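The forecast workflow the module teaches (calibrate a simple productivity model, then propagate driver and process uncertainty into an ensemble forecast) can be sketched generically as follows. This is an illustration in Python, not the module's actual R Shiny code, and every number in it is made up.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical calibration data: productivity (gpp) vs. a temperature driver
temp = rng.uniform(5.0, 25.0, 60)
gpp = 0.3 * temp + rng.normal(0.0, 0.5, 60)

# Embed the hypothesis in a model and calibrate it (simple linear regression)
X = np.column_stack([np.ones_like(temp), temp])
beta, *_ = np.linalg.lstsq(X, gpp, rcond=None)
sigma = np.std(gpp - X @ beta, ddof=2)  # residual (process) uncertainty

# Forecast tomorrow with driver + process uncertainty via an ensemble
n_ens = 1_000
temp_tomorrow = rng.normal(18.0, 2.0, n_ens)  # uncertain weather driver
forecast = beta[0] + beta[1] * temp_tomorrow + rng.normal(0.0, sigma, n_ens)

lo, med, hi = np.percentile(forecast, [2.5, 50, 97.5])
print(med, (lo, hi))  # forecast median and a 95% predictive interval
```

Comparing the interval against the eventual observation, then refitting, closes the iterative forecasting cycle described above.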