skip to main content

Title: Matrix methods for stochastic dynamic programming in ecology and evolutionary biology

Organisms are constantly making tradeoffs. These tradeoffs may be behavioural (e.g. whether to focus on foraging or predator avoidance) or physiological (e.g. whether to allocate energy to reproduction or growth). Similarly, wildlife and fishery managers must make tradeoffs while striving for conservation or economic goals (e.g. costs vs. rewards). Stochastic dynamic programming (SDP) provides a powerful and flexible framework within which to explore these tradeoffs. A rich body of mathematical results on SDP exist but have received little attention in ecology and evolution.

Using directed graphs – an intuitive visual model representation – we reformulated SDP models into matrix form. We synthesized relevant existing theoretical results which we then applied to two canonical SDP models in ecology and evolution. We applied these matrix methods to a simple illustrative patch choice example and an existing SDP model of parasitoid wasp behaviour.

The proposed analytical matrix methods provide the same results as standard numerical methods as well as additional insights into the nature and quantity of other, nearly optimal, strategies, which we may also expect to observe in nature. The mathematical results highlighted in this work also explain qualitative aspects of model convergence. An added benefit of the proposed matrix notation is the resulting ease of implementation of Markov chain analysis (an exact solution for the realized states of an individual) rather than Monte Carlo simulations (the standard, approximate method). It also provides an independent validation method for other numerical methods, even in applications focused on short‐term, non‐stationary dynamics.

These methods are useful for obtaining, interpreting, and further analysing model convergence to the optimal time‐independent (i.e. stationary) decisions predicted by an SDP model. SDP is a powerful tool both for theoretical and applied ecology, and an understanding of the mathematical structure underlying SDP models can increase our ability to apply and interpret these models.

more » « less
Author(s) / Creator(s):
 ;  ;  ;  ;
Publisher / Repository:
Date Published:
Journal Name:
Methods in Ecology and Evolution
Page Range / eLocation ID:
p. 1952-1961
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Stage‐based demographic methods, such as matrix population models (MPMs), are powerful tools used to address a broad range of fundamental questions in ecology, evolutionary biology and conservation science. Accordingly, MPMs now exist for over 3000 species worldwide. These data are being digitised as an ongoing process and periodically released into two large open‐access online repositories: the COMPADRE Plant Matrix Database and the COMADRE Animal Matrix Database. During the last decade, data archiving and curation of COMPADRE and COMADRE, and subsequent comparative research, have revealed pronounced variation in how MPMs are parameterized and reported.

    Here, we summarise current issues related to the parameterisation and reporting of MPMs that arise most frequently and outline how they affect MPM construction, analysis, and interpretation. To quantify variation in how MPMs are reported, we present results from a survey identifying key aspects of MPMs that are frequently unreported in manuscripts. We then screen COMPADRE and COMADRE to quantify how often key pieces of information are omitted from manuscripts using MPMs.

    Over 80% of surveyed researchers (n = 60) state a clear benefit to adopting more standardised methodologies for reporting MPMs. Furthermore, over 85% of the 300 MPMs assessed from COMPADRE and COMADRE omitted one or more elements that are key to their accurate interpretation. Based on these insights, we identify fundamental issues that can arise from MPM construction and communication and provide suggestions to improve clarity, reproducibility and future research utilising MPMs and their required metadata. To fortify reproducibility and empower researchers to take full advantage of their demographic data, we introduce a standardised protocol to present MPMs in publications. This standard is linked towww.compadre‐, so that authors wishing to archive their MPMs can do so prior to submission of publications, following examples from other open‐access repositories such as DRYAD, Figshare and Zenodo.

    Combining and standardising MPMs parameterized from populations around the globe and across the tree of life opens up powerful research opportunities in evolutionary biology, ecology and conservation research. However, this potential can only be fully realised by adopting standardised methods to ensure reproducibility.

    more » « less
  2. Biologists routinely fit novel and complex statistical models to push the limits of our understanding. Examples include, but are not limited to, flexible Bayesian approaches (e.g. BUGS, stan), frequentist and likelihood‐based approaches (e.g. packageslme4) and machine learning methods.

    These software and programs afford the user greater control and flexibility in tailoring complex hierarchical models. However, this level of control and flexibility places a higher degree of responsibility on the user to evaluate the robustness of their statistical inference. To determine how often biologists are running model diagnostics on hierarchical models, we reviewed 50 recently published papers in 2021 in the journalNature Ecology & Evolution, and we found that the majority of published papers didnotreport any validation of their hierarchical models, making it difficult for the reader to assess the robustness of their inference. This lack of reporting likely stems from a lack of standardized guidance for best practices and standard methods.

    Here, we provide a guide to understanding and validating complex models using data simulations. To determine how often biologists use data simulation techniques, we also reviewed 50 recently published papers in 2021 in the journalMethods Ecology & Evolution. We found that 78% of the papers that proposed a new estimation technique, package or model used simulations or generated data in some capacity (18 of 23 papers); but very few of those papers (5 of 23 papers) included either a demonstration that the code could recover realistic estimates for a dataset with known parameters or a demonstration of the statistical properties of the approach. To distil the variety of simulations techniques and their uses, we provide a taxonomy of simulation studies based on the intended inference. We also encourage authors to include a basic validation study whenever novel statistical models are used, which in general, is easy to implement.

    Simulating data helps a researcher gain a deeper understanding of the models and their assumptions and establish the reliability of their estimation approaches. Wider adoption of data simulations by biologists can improve statistical inference, reliability and open science practices.

    more » « less
  3. Abstract

    This paper studies several solution paths of sparse quadratic minimization problems as a function of the weighing parameter of the bi-objective of estimation loss versus solution sparsity. Three such paths are considered: the “$$\ell _0$$0-path” where the discontinuous$$\ell _0$$0-function provides the exact sparsity count; the “$$\ell _1$$1-path” where the$$\ell _1$$1-function provides a convex surrogate of sparsity count; and the “capped$$\ell _1$$1-path” where the nonconvex nondifferentiable capped$$\ell _1$$1-function aims to enhance the$$\ell _1$$1-approximation. Serving different purposes, each of these three formulations is different from each other, both analytically and computationally. Our results deepen the understanding of (old and new) properties of the associated paths, highlight the pros, cons, and tradeoffs of these sparse optimization models, and provide numerical evidence to support the practical superiority of the capped$$\ell _1$$1-path. Our study of the capped$$\ell _1$$1-path is interesting in its own right as the path pertains to computable directionally stationary (= strongly locally minimizing in this context, as opposed to globally optimal) solutions of a parametric nonconvex nondifferentiable optimization problem. Motivated by classical parametric quadratic programming theory and reinforced by modern statistical learning studies, both casting an exponential perspective in fully describing such solution paths, we also aim to address the question of whether some of them can be fully traced in strongly polynomial time in the problem dimensions. A major conclusion of this paper is that a path of directional stationary solutions of the capped$$\ell _1$$1-regularized problem offers interesting theoretical properties and practical compromise between the$$\ell _0$$0-path and the$$\ell _1$$1-path. Indeed, while the$$\ell _0$$0-path is computationally prohibitive and greatly handicapped by the repeated solution of mixed-integer nonlinear programs, the quality of$$\ell _1$$1-path, in terms of the two criteria—loss and sparsity—in the estimation objective, is inferior to the capped$$\ell _1$$1-path; the latter can be obtained efficiently by a combination of a parametric pivoting-like scheme supplemented by an algorithm that takes advantage of the Z-matrix structure of the loss function.

    more » « less
  4. Abstract

    The two‐electron reduced density matrix (2RDM) carries enough information to evaluate the electronic energy of a many‐electron system. The variational 2RDM (v2RDM) approach seeks to determine the 2RDM directly, without knowledge of the wave function, by minimizing this energy with respect to variations in the elements of the 2RDM, while also enforcing knownN‐representability conditions. In this tutorial review, we provide an overview of the theoretical underpinnings of the v2RDM approach and theN‐representability constraints that are typically applied to the 2RDM. We also discuss the semidefinite programming (SDP) techniques used in v2RDM computations and provide enough Python code to develop a working v2RDM code that interfaces to thelibSDPlibrary of SDP solvers.

    This article is categorized under:

    Electronic Structure Theory > Ab Initio Electronic Structure Methods

    Software > Quantum Chemistry

    more » « less
  5. We present a unified framework for analyzing the convergence of distributed optimization algorithms by formulating a semidefinite program (SDP) which can be efficiently solved to bound the linear rate of convergence. Two different SDP formulations are considered. First, we formulate an SDP that depends explicitly on the gossip matrix of the network graph. This result provides bounds that depend explicitly on the graph topology, but the SDP dimension scales with the size of the graph. Second, we formulate an SDP that depends implicitly on the gossip matrix via its spectral gap. This result provides coarser bounds, but yields a small SDP that is independent of graph size. Our approach improves upon existing bounds for the algorithms we analyzed, and numerical simulations reveal that our bounds are likely tight. The efficient and automated nature of our analysis makes it a powerful tool for algorithm selection and tuning, and for the discovery of new algorithms as well. 
    more » « less