Title: Defining model complexity: An ecological perspective
Abstract: Models have become a key component of scientific hypothesis testing and of climate and sustainability planning, enabled by increased data availability and computing power. As a result, understanding how the perceived ‘complexity’ of a model corresponds to its accuracy and predictive power has become a prevalent research topic. However, a wide variety of definitions of model complexity have been proposed and used, leading to an imprecise understanding of what model complexity is and what its consequences are across research studies, study systems, and disciplines. Here, we propose a more explicit definition of model complexity that incorporates four facets (model class, model inputs, model parameters, and computational complexity), each modulated by the complexity of the real-world process being modelled. We illustrate these facets with several examples drawn from the ecological literature. Overall, we argue that precise terminology and metrics of model complexity (e.g., number of parameters, number of inputs) may be necessary to characterize the emergent outcomes of complexity, including model comparison, model performance, model transferability, and decision support.
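To make the four facets concrete, here is a minimal Python sketch (our illustration, with entirely hypothetical models and numbers, not the authors' code) that records each facet as a simple countable metric for two models of the same real-world process:

```python
from dataclasses import dataclass

@dataclass
class ModelComplexity:
    """Four facets of model complexity (hypothetical bookkeeping, for illustration)."""
    model_class: str        # e.g., 'logistic regression', 'individual-based simulation'
    n_inputs: int           # number of distinct input variables / forcings
    n_parameters: int       # number of free (fitted or tuned) parameters
    runtime_seconds: float  # proxy for computational complexity

    def summary(self) -> str:
        return (f"{self.model_class}: {self.n_inputs} inputs, "
                f"{self.n_parameters} parameters, ~{self.runtime_seconds:.1f}s per run")

# Two hypothetical models of the same ecological process
occupancy_glm = ModelComplexity("logistic regression", n_inputs=3,
                                n_parameters=4, runtime_seconds=0.1)
forest_ibm = ModelComplexity("individual-based forest simulator", n_inputs=12,
                             n_parameters=85, runtime_seconds=600.0)

for m in (occupancy_glm, forest_ibm):
    print(m.summary())
```

Reporting complexity per facet in this way, rather than as a single label, is one way to make cross-study comparisons precise.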
Award ID(s):
1926388
PAR ID:
10512428
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Meteorological Applications
Volume:
31
Issue:
3
ISSN:
1350-4827
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract: As modeling tools and approaches become more advanced, ecological models are becoming more complex. Traditional sensitivity analyses can struggle to identify the nonlinearities and interactions that emerge from such complexity, especially across broad swaths of parameter space. This limits understanding of the ecological mechanisms underlying model behavior. Machine learning approaches are a potential answer to this issue, given their predictive ability when applied to large, complex datasets. While the perception that machine learning is a "black box" lingers, we seek to illuminate its interpretive potential in ecological modeling. To do so, we detail our process of applying random forests to complex model dynamics, both achieving high predictive accuracy and elucidating the ecological mechanisms driving our predictions. Specifically, we employ an empirically rooted, ontogenetically stage-structured consumer-resource simulation model. Using simulation parameters as feature inputs and simulation output as dependent variables in our random forests, we extended feature analyses into a simple graphical analysis, from which we reduced model behavior to three core ecological mechanisms. These mechanisms reveal the complex interactions between internal plant demography and trophic allocation that drive community dynamics, while preserving the predictive accuracy achieved by our random forests.
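As an illustration of this random-forest approach, the following sketch (our own, using scikit-learn and a toy stand-in for the stage-structured simulation, not the authors' model) maps simulation parameters to a simulated output and reads feature importances as a global sensitivity screen:

```python
# Minimal sketch: random forest as an interpretable emulator of a simulation.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def simulate(growth, attack, handling):
    # Toy nonlinear stand-in for a consumer-resource simulation output
    # (e.g., consumer biomass); not the real model.
    return growth / (1 + attack * handling) + rng.normal(scale=0.01)

# Simulation parameters as features, simulation output as the target
params = rng.uniform(0.1, 2.0, size=(5000, 3))
output = np.array([simulate(*row) for row in params])

rf = RandomForestRegressor(n_estimators=300, random_state=0)
rf.fit(params, output)

for name, imp in zip(["growth", "attack", "handling"], rf.feature_importances_):
    print(f"{name}: importance {imp:.2f}")
```

Feature importances like these are the starting point; the graphical analyses described in the abstract would then trace how each influential parameter shapes model behavior.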
  2. Abstract: In the primate visual system, visual object recognition involves a series of cortical areas arranged hierarchically along the ventral visual pathway. As information flows through this hierarchy, neurons become progressively tuned to more complex image features. The circuit mechanisms and computations underlying the increasing complexity of these receptive fields (RFs) remain unidentified. To understand how this complexity emerges in the secondary visual area (V2), we investigated the functional organization of inputs from the primary visual cortex (V1) to V2 by combining retrograde anatomical tracing of these inputs with functional imaging of feature maps in macaque monkey V1 and V2. We found that V1 neurons sending inputs to a single V2 orientation column have a broad range of preferred orientations but are strongly biased towards the orientation represented at the injected V2 site. For each V2 site, we then constructed a feedforward model based on the linear combination of its anatomically identified large-scale V1 inputs and studied the response properties of the generated V2 RFs. We found that V2 RFs derived from the linear feedforward model were either elongated versions of V1 filters or had spatially complex structures. These modeled RFs predicted V2 neuron responses to oriented grating stimuli with high accuracy. Remarkably, this simple model also explained the greater selectivity of V2 cells to naturalistic textures compared with their V1 input cells. Our results demonstrate that simple linear combinations of feedforward inputs can account for the orientation selectivity and texture sensitivity of V2 RFs.
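The following sketch (our construction, not the authors' fitted model) illustrates the linear feedforward idea: a hypothetical V2 receptive field is built as an orientation-biased linear combination of Gabor filters standing in for V1 inputs, all centered at one location for simplicity (real inputs would also differ in spatial position):

```python
# Minimal sketch: a "V2-like" RF as a weighted sum of oriented V1-like filters.
import numpy as np

def gabor(size, theta, sigma=4.0, wavelength=8.0):
    """Oriented Gabor filter as a stand-in for a V1 receptive field."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)

size = 31
orientations = np.linspace(0, np.pi, 8, endpoint=False)
preferred = np.pi / 4  # orientation of the hypothetical injected V2 site

# Input weights biased toward the preferred orientation, echoing the broad but
# biased distribution of V1 inputs reported in the abstract
weights = np.exp(2.0 * np.cos(2 * (orientations - preferred)))
weights /= weights.sum()

v2_rf = sum(w * gabor(size, th) for w, th in zip(weights, orientations))
print("V2 RF shape:", v2_rf.shape, "| peak magnitude:", round(np.abs(v2_rf).max(), 3))
```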
  3. Diffusion models have become the most popular approach to deep generative modeling of images, largely due to their empirical performance and reliability. From a theoretical standpoint, a number of recent works [chen2022, chen2022improved, benton2023linear] have studied the iteration complexity of sampling, assuming access to an accurate diffusion model. In this work, we focus on understanding the sample complexity of training such a model: how many samples are needed to learn an accurate diffusion model using a sufficiently expressive neural network? Prior work [BMR20] showed bounds polynomial in the dimension, the desired total variation error, and the Wasserstein error. We show an exponential improvement in the dependence on the Wasserstein error and depth, along with improved dependencies on other relevant parameters.
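For readers who want the training setup made concrete, here is a minimal PyTorch sketch (illustration only; the paper's contribution is theoretical bounds, not this code) of learning a diffusion model from n samples with the standard noise-prediction objective:

```python
# Minimal sketch: denoising objective for training a tiny diffusion model
# on n samples from an (here, synthetic) data distribution.
import torch
import torch.nn as nn

n, dim = 1024, 2                        # n training samples in R^dim
data = torch.randn(n, dim) @ torch.tensor([[2.0, 0.0], [0.0, 0.5]])

net = nn.Sequential(nn.Linear(dim + 1, 64), nn.ReLU(), nn.Linear(64, dim))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    x0 = data[torch.randint(0, n, (128,))]
    t = torch.rand(128, 1)               # continuous time in (0, 1)
    alpha = torch.cos(t * torch.pi / 2)  # simple cosine noise schedule
    sigma = torch.sin(t * torch.pi / 2)
    eps = torch.randn_like(x0)
    xt = alpha * x0 + sigma * eps        # forward (noising) process
    pred = net(torch.cat([xt, t], dim=1))
    loss = ((pred - eps) ** 2).mean()    # predict the injected noise
    opt.zero_grad(); loss.backward(); opt.step()

print("final denoising loss:", float(loss))
```

The sample-complexity question is how the accuracy of the distribution learned by such a procedure scales with n.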
  4. Abstract: Calibration of agent-based models (ABMs) is a major challenge due to the complex nature of the systems being modeled, the heterogeneous nature of geographical regions, the varying effects of model inputs on the outputs, and computational intensity. Nevertheless, ABMs need to be carefully tuned to achieve the desirable goal of simulating spatiotemporal phenomena of interest, and a well-calibrated model is expected to yield an improved understanding of the phenomena. To address some of these challenges, this article proposes an integrated framework of global sensitivity analysis (GSA) and calibration, called GSA-CAL. Specifically, variance-based GSA is applied to identify input parameters with less influence on the differences between simulated outputs and observations. By dropping these less influential input parameters from the calibration process, this research reduces the computational intensity of calibration. Since GSA requires many simulation runs due to ABMs' stochasticity, we leverage the high-performance computing power provided by advanced cyberinfrastructure. A spatially explicit ABM of influenza transmission is used as the case study to demonstrate the utility of the framework. Leveraging GSA, we were able to exclude less influential parameters from the model calibration process and demonstrate the importance of revising local settings to capture the epidemic pattern of an outbreak.
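The GSA screening step can be sketched as follows (our illustration, assuming the SALib library and a toy stand-in for the influenza ABM, not the authors' GSA-CAL code): parameters with small total-order Sobol indices become candidates to fix before calibration:

```python
# Minimal sketch: variance-based GSA to screen parameters before calibration.
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

problem = {
    "num_vars": 3,
    "names": ["transmission_rate", "recovery_rate", "contact_scaling"],
    "bounds": [[0.1, 0.9], [0.05, 0.5], [0.5, 2.0]],
}

def toy_epidemic(x):
    beta, gamma, contact = x
    return beta * contact / gamma  # crude stand-in for an ABM summary output

X = saltelli.sample(problem, 1024)              # Saltelli sampling design
Y = np.apply_along_axis(toy_epidemic, 1, X)
Si = sobol.analyze(problem, Y)                  # first- and total-order indices

for name, st in zip(problem["names"], Si["ST"]):
    flag = "calibrate" if st > 0.05 else "fix (low influence)"
    print(f"{name}: total-order index {st:.2f} -> {flag}")
```

For a stochastic ABM, each of these evaluations would itself be an average over replicate runs, which is what motivates the high-performance computing support described above.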
  5. Multiscale systems biology is having an increasingly powerful impact on our understanding of the interconnected molecular, cellular, and microenvironmental drivers of tumor growth and the effects of novel drugs and drug combinations for cancer therapy. Agent-based models (ABMs) that treat cells as autonomous decision-makers, each with their own intrinsic characteristics, are a natural platform for capturing intratumoral heterogeneity. Agent-based models are also useful for integrating the multiple time and spatial scales associated with vascular tumor growth and response to treatment. Despite all their benefits, the computational costs of solving agent-based models escalate and become prohibitive when simulating millions of cells, making parameter exploration and model parameterization from experimental data very challenging. Moreover, such data are typically limited, coarse-grained and may lack any spatial resolution, compounding these challenges. We address these issues by developing a first-of-its-kind method that leverages explicitly formulated surrogate models (SMs) to bridge the current computational divide between agent-based models and experimental data. In our approach, Surrogate Modeling for Reconstructing Parameter Surfaces (SMoRe ParS), we quantify the uncertainty in the relationship between agent-based model inputs and surrogate model parameters, and between surrogate model parameters and experimental data. In this way, surrogate model parameters serve as intermediaries between agent-based model input and data, making it possible to use them for calibration and uncertainty quantification of agent-based model parameters that map directly onto an experimental data set. We illustrate the functionality and novelty of Surrogate Modeling for Reconstructing Parameter Surfaces by applying it to an agent-based model of 3D vascular tumor growth, and experimental data in the form of tumor volume time-courses. Our method is broadly applicable to situations where preserving underlying mechanistic information is of interest, and where computational complexity and sparse, noisy calibration data hinder model parameterization. 
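As a simplified illustration of the surrogate step (our sketch with synthetic data, not the published SMoRe ParS implementation), one can fit an explicit surrogate such as logistic growth to a tumor-volume time course and carry the fitted parameters, with their uncertainties, as the low-dimensional bridge between ABM inputs and data:

```python
# Minimal sketch: an explicit surrogate model fitted to tumor-volume data.
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, V0):
    """Explicit surrogate: logistic tumor growth."""
    return K / (1 + (K / V0 - 1) * np.exp(-r * t))

# Hypothetical tumor-volume time course (days, mm^3) with noise
t_obs = np.linspace(0, 20, 11)
v_obs = (logistic(t_obs, K=900.0, r=0.4, V0=50.0)
         + np.random.default_rng(1).normal(0, 15, t_obs.size))

popt, pcov = curve_fit(logistic, t_obs, v_obs, p0=[800.0, 0.3, 40.0])
perr = np.sqrt(np.diag(pcov))  # 1-sigma uncertainty in surrogate parameters

for name, val, err in zip(["K", "r", "V0"], popt, perr):
    print(f"{name} = {val:.1f} +/- {err:.1f}")
```

In the full method, the uncertainty in these surrogate parameters would then be propagated back to constrain the computationally expensive ABM inputs.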