Abstract: Society increasingly demands accurate predictions of complex ecosystem processes under novel conditions to address environmental challenges. However, obtaining the process‐level knowledge required to do so does not necessarily align with the burgeoning use in ecology of correlative model selection criteria, such as Akaike information criterion. These criteria select models based on their ability to reproduce outcomes, not on their ability to accurately represent causal effects. Causal understanding does not require matching outcomes, but rather involves identifying model forms and parameter values that accurately describe processes. We contend that researchers can arrive at incorrect conclusions about cause‐and‐effect relationships by relying on information criteria. We illustrate via a specific example that inference extending beyond prediction into causality can be seriously misled by information‐theoretic evidence. Finally, we identify a solution space to bridge the gap between the correlative inference provided by model selection criteria and a process‐based understanding of ecological systems.
Spherical Minimum Description Length
We consider the problem of model selection using the Minimum Description Length (MDL) criterion for distributions with parameters on the hypersphere. Model selection algorithms aim to find a compromise between goodness of fit and model complexity. Variables often considered for complexity penalties involve number of parameters, sample size and shape of the parameter space, with the penalty term often referred to as stochastic complexity. Current model selection criteria either ignore the shape of the parameter space or incorrectly penalize the complexity of the model, largely because typical Laplace approximation techniques yield inaccurate results for curved spaces. We demonstrate how the use of a constrained Laplace approximation on the hypersphere yields a novel complexity measure that more accurately reflects the geometry of these spherical parameters spaces. We refer to this modified model selection criterion as spherical MDL. As proof of concept, spherical MDL is used for bin selection in histogram density estimation, performing favorably against other model selection criteria.
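To make the bin-selection setting concrete, here is a minimal sketch of penalized-likelihood bin selection for an equal-width histogram. It does not implement spherical MDL, whose complexity term is not given in this abstract. Instead it assumes a classical two-part MDL (BIC-style) penalty of (k - 1)/2 · log n, the kind of baseline criterion the abstract compares against. The function names `histogram_log_likelihood`, `two_part_mdl_score`, and `select_bins` are hypothetical. One likely reason histograms suit the hypersphere setting is that the square roots of the bin probabilities form a unit-norm vector, since the probabilities sum to one.

```python
import numpy as np


def histogram_log_likelihood(data, k, lo, hi):
    """Maximized log-likelihood of an equal-width k-bin histogram density on [lo, hi]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    counts, _ = np.histogram(data, bins=k, range=(lo, hi))
    n = counts.sum()
    width = (hi - lo) / k
    nonzero = counts[counts > 0]
    # The fitted density in bin j is counts[j] / (n * width).
    # Empty bins contain no data points, so they add nothing to the sum.
    return float(np.sum(nonzero * np.log(nonzero / (n * width))))


def two_part_mdl_score(data, k, lo, hi):
    """Assumed baseline criterion (not spherical MDL): code length in nats.

    Negative log-likelihood plus (k - 1) / 2 * log(n), where k - 1 is the
    number of free bin probabilities.
    """
    n = len(data)
    penalty = 0.5 * (k - 1) * np.log(n)
    return -histogram_log_likelihood(data, k, lo, hi) + penalty


def select_bins(data, max_bins=50):
    """Return the bin count with the smallest score, plus all scores."""
    data = np.asarray(data, dtype=float)
    lo, hi = data.min(), data.max()
    scores = {k: two_part_mdl_score(data, k, lo, hi) for k in range(1, max_bins + 1)}
    best_k = min(scores, key=scores.get)
    return best_k, scores
```

Calling `select_bins(samples)` on a one-dimensional array returns the selected bin count and a dictionary of scores per candidate count. Swapping in a different complexity term only requires changing the `penalty` line, which is where a geometry-aware criterion would differ from this baseline.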
- Award ID(s): 1743050
- PAR ID: 10088689
- Date Published:
- Journal Name: Entropy
- Volume: 20
- Issue: 8
- ISSN: 1099-4300
- Page Range / eLocation ID: 575
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Simulation models of critical systems often have parameters that need to be calibrated using observed data. For expensive simulation models, calibration is done using an emulator of the simulation model built on simulation output at different parameter settings. Using intelligent and adaptive selection of parameters to build the emulator can drastically improve the efficiency of the calibration process. The article proposes a sequential framework with a novel criterion for parameter selection that targets learning the posterior density of the parameters. The emergent behavior from this criterion is that exploration happens by selecting parameters in uncertain posterior regions while, simultaneously, exploitation happens by selecting parameters in regions of high posterior density. The advantages of the proposed method are illustrated using several simulation experiments and a nuclear physics reaction model.
-
Mahoney, Michael (Ed.) In this paper, we study the Radial Basis Function (RBF) approximation to differential operators on smooth tensor fields defined on closed Riemannian submanifolds of Euclidean space, identified by randomly sampled point cloud data. The formulation in this paper leverages a fundamental fact that the covariant derivative on a submanifold is the projection of the directional derivative in the ambient Euclidean space onto the tangent space of the submanifold. To differentiate a test function (or vector field) on the submanifold with respect to the Euclidean metric, the RBF interpolation is applied to extend the function (or vector field) in the ambient Euclidean space. When the manifolds are unknown, we develop an improved second-order local SVD technique for estimating local tangent spaces on the manifold. When the classical pointwise non-symmetric RBF formulation is used to solve Laplacian eigenvalue problems, we found that while accurate estimation of the leading spectra can be obtained with large enough data, such an approximation often produces irrelevant complex-valued spectra (or pollution) as the true spectra are real-valued and positive. To avoid such an issue, we introduce a symmetric RBF discrete approximation of the Laplacians induced by a weak formulation on appropriate Hilbert spaces. Unlike the non-symmetric approximation, this formulation guarantees non-negative real-valued spectra and the orthogonality of the eigenvectors. Theoretically, we establish the convergence of the eigenpairs of both the Laplace-Beltrami operator and Bochner Laplacian for the symmetric formulation in the limit of large data with convergence rates. Numerically, we provide supporting examples for approximations of the Laplace-Beltrami operator and various vector Laplacians, including the Bochner, Hodge, and Lichnerowicz Laplacians.
-
Both the computational costs and the accuracy of the invariant-imbedding T-matrix method escalate with increasing the truncation number N at which the expansions of the electromagnetic fields in terms of vector spherical harmonics are truncated. Thus, it becomes important in calculation of the single-scattering optical properties to choose N just large enough to satisfy an appropriate convergence criterion; this N we call the optimal truncation number. We present a new convergence criterion that is based on the scattering phase function rather than on the scattering cross section. For a selection of homogeneous particles that have been used in previous single-scattering studies, we consider how the optimal N may be related to the size parameter, the index of refraction, and particle shape. We investigate a functional form for this relation that generalizes previous formulae involving only size parameter, a form that shows some success in summarizing our computational results. Our results indicate clearly the sensitivity of optimal truncation number to the index of refraction, as well as the difficulty of cleanly separating this dependence from the dependence on particle shape.
-
Nonlinear state-space models are ubiquitous in modeling real-world dynamical systems. Sequential Monte Carlo (SMC) techniques, also known as particle methods, are a well-known class of parameter estimation methods for this general class of state-space models. Existing SMC-based techniques rely on excessive sampling of the parameter space, which makes their computation intractable for large systems or tall data sets. Bayesian optimization techniques have been used for fast inference in state-space models with intractable likelihoods. These techniques aim to find the maximum of the likelihood function by sequential sampling of the parameter space through a single SMC approximator. Various SMC approximators with different fidelities and computational costs are often available for sample-based likelihood approximation. In this paper, we propose a multi-fidelity Bayesian optimization algorithm for the inference of general nonlinear state-space models (MFBO-SSM), which enables simultaneous sequential selection of parameters and approximators. The accuracy and speed of the algorithm are demonstrated by numerical experiments using synthetic gene expression data from a gene regulatory network model and real data from the VIX stock price index.

