Title: ISRES+: an improved evolutionary strategy for function minimization to estimate the free parameters of systems biology models
Abstract Motivation: Mathematical models in systems biology help generate hypotheses, guide experimental design, and infer the dynamics of gene regulatory networks. These models are characterized by phenomenological or mechanistic parameters, which are typically hard to measure. Therefore, efficient parameter estimation is central to model development. Global optimization techniques, such as evolutionary algorithms (EAs), are applied to estimate model parameters by inverse modeling, i.e. calibrating models by minimizing a function that evaluates a measure of the error between model predictions and experimental data. EAs estimate model parameters (the “fittest individuals”) by generating a large population of individuals using strategies like recombination and mutation over multiple “generations.” Typically, only a few individuals from each generation are used to create new individuals in the next generation. Improved Evolutionary Strategy by Stochastic Ranking (ISRES), proposed by Runarsson and Yao, is one such EA that is widely used in systems biology to estimate parameters. ISRES uses information from at most a pair of individuals in any generation to create a new population to minimize the error. In this article, we propose an efficient evolutionary strategy, ISRES+, which builds on ISRES by combining information from all individuals across the population and across all generations to develop a better understanding of the fitness landscape. Results: ISRES+ uses the additional information generated by the algorithm during evolution to approximate the local neighborhood around the best-fit individual using linear least squares fits in one and two dimensions, enabling efficient parameter estimation. ISRES+ outperforms ISRES and results in fitter individuals with a tighter distribution over multiple runs, such that a typical run of ISRES+ estimates parameters with a higher goodness-of-fit compared with ISRES.
Availability and implementation: The algorithm and implementation are available on GitHub: https://github.com/gtreeves/isres-plus-bandodkar-2022.
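
The two ideas in the abstract, inverse modeling with an evolutionary algorithm and a linear least-squares fit around the best-fit individual, can be illustrated with a minimal Python sketch. This is not the ISRES+ implementation from the repository above: it uses a plain (mu, lambda) evolution strategy without stochastic ranking or constraints, and a single full-dimensional linear fit in place of the paper's one- and two-dimensional fits. The toy exponential-decay model, the function names (`model`, `sse`, `evolve`, `linear_fit_step`), the bounds, and all numeric settings are assumptions chosen for illustration.

```python
import numpy as np


def model(params, t):
    """Hypothetical toy model (not from the paper): a * exp(-k * t)."""
    a, k = params
    return a * np.exp(-k * t)


def sse(params, t, data):
    """Error function: sum of squared differences between model and data."""
    return float(np.sum((model(params, t) - data) ** 2))


def evolve(t, data, lower, upper, mu=10, lam=70, generations=60, seed=0):
    """Plain (mu, lambda) evolution strategy that archives every evaluation."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(lower, upper, size=(lam, len(lower)))
    sigma = 0.2 * (upper - lower)  # per-parameter mutation step size (assumption)
    xs, fs = [], []
    for _ in range(generations):
        fit = np.array([sse(x, t, data) for x in pop])
        xs.append(pop)
        fs.append(fit)
        parents = pop[np.argsort(fit)[:mu]]      # keep the mu fittest individuals
        picks = rng.integers(0, mu, size=lam)    # one random parent per child
        noise = sigma * rng.standard_normal(pop.shape)
        pop = np.clip(parents[picks] + noise, lower, upper)
        sigma = sigma * 0.95                     # fixed decay schedule (assumption)
    return np.vstack(xs), np.concatenate(fs)


def linear_fit_step(X, F, t, data, lower, upper, k=30):
    """Fit F ~ c0 + g.(x - best) on the k archived points nearest the best
    individual, then try one step against the fitted slope g."""
    best = X[np.argmin(F)]
    dist = np.linalg.norm(X - best, axis=1)
    near = np.argsort(dist)[:k]
    A = np.hstack([np.ones((k, 1)), X[near] - best])
    coef, *_ = np.linalg.lstsq(A, F[near], rcond=None)
    g = coef[1:]
    g_norm = np.linalg.norm(g)
    if g_norm == 0.0:
        return best
    candidate = np.clip(best - dist[near].max() * g / g_norm, lower, upper)
    # Accept the step only if it actually lowers the error.
    return candidate if sse(candidate, t, data) < sse(best, t, data) else best


# Usage: recover the parameters of noise-free synthetic data.
t = np.linspace(0.0, 5.0, 20)
true_params = np.array([2.0, 0.5])
data = model(true_params, t)
lower, upper = np.array([0.0, 0.0]), np.array([5.0, 2.0])
X, F = evolve(t, data, lower, upper)
best_es = X[np.argmin(F)]
refined = linear_fit_step(X, F, t, data, lower, upper)
print("ES best:", best_es, "after linear-fit step:", refined)
```

The archive `X, F` holds every individual evaluated in every generation, which is the kind of additional information the abstract says ISRES+ reuses. Here the refinement step is accepted only when it lowers the error, so it can never make the estimate worse.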
Award ID(s):
2105619
PAR ID:
10479661
Author(s) / Creator(s):
; ;
Editor(s):
Wren, Jonathan
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Bioinformatics
Volume:
39
Issue:
7
ISSN:
1367-4811
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Motivation: The scale and scope of comparative trait data are expanding at unprecedented rates, and recent advances in evolutionary modeling and simulation sometimes struggle to match this pace. Well-organized and flexible applications for conducting large-scale simulations of evolution hold promise in this context for understanding models and, more importantly, our ability to confidently estimate them with real trait data sampled from nature. Results: We introduce TraitTrainR, an R package designed to facilitate efficient, large-scale simulations under complex models of continuous trait evolution. TraitTrainR employs several output formats, supports popular trait data transformations, accommodates multi-trait evolution, and exhibits flexibility in defining input parameter space and model stacking. Moreover, TraitTrainR permits measurement error, allowing for investigation of its potential impacts on evolutionary inference. We envision a wealth of applications of TraitTrainR, and we demonstrate one such example by examining the problem of evolutionary model selection in three empirical phylogenetic case studies. Collectively, these demonstrations of applying TraitTrainR to explore problems in model selection underscore its utility and broader promise for addressing key questions, including those related to experimental design and statistical power, in comparative biology. Availability and implementation: TraitTrainR is developed in R 4.4.0 and is freely available at https://github.com/radamsRHA/TraitTrainR/, which includes detailed documentation, quick-start guides, and a step-by-step tutorial.
  2. Abstract Summary: Molecular mechanisms of biological functions and disease processes are exceptionally complex, and our ability to interrogate and understand relationships is becoming increasingly dependent on the use of computational modeling. We have developed “BioModME,” a standalone R-based web application package, providing an intuitive and comprehensive graphical user interface to help investigators build, solve, visualize, and analyze computational models of complex biological systems. Some important features of the application package include multi-region system modeling, custom reaction rate laws and equations, unit conversion, model parameter estimation utilizing experimental data, and import and export of model information in the Systems Biology Markup Language format. Users can also export models to MATLAB, R, and Python and the equations to LaTeX and Mathematical Markup Language formats. Other important features include an online model development platform, a multi-modality visualization tool, and efficient numerical solvers for differential-algebraic equations and optimization. Availability and implementation: All relevant software information including documentation and tutorials can be found at https://mcw.marquette.edu/biomedical-engineering/computational-systems-biology-lab/biomodme.php. Deployed software can be accessed at https://biomodme.ctsi.mcw.edu/. Source code is freely available for download at https://github.com/MCWComputationalBiologyLab/BioModME.
  3. Abstract Ecological differences among species, particularly dispersal capacity and life history strategies, influence population response to environmental changes. Genetic simulations now allow us to directly incorporate this variation into models of past demographic changes. However, the impact of life history strategies in demographic inference has been far less explored relative to that of dispersal capacity. Here, we utilise individual‐based simulations of a non‐Wright‐Fisher population to ask whether differences in life history traits (the average age of first reproduction of individuals, the average adult mortality and the average number of mates per reproductive season) lead to consistent and predictable differences in the summary statistics of genetic diversity commonly used for simulation‐based parameter estimation and demographic inference. Using a Random Forest model, we also estimate three population parameters (variance in reproductive success, generation time and effective population size) from genome‐wide SNP variation for two bird species known to have distinct life history strategies. The results demonstrate that life history variation leads to predictable differences in patterns of genetic diversity: higher values of life history traits, representing extreme polygamy, long adult longevity and later onset of reproduction, are associated with higher variance in reproductive success, longer generation times, smaller effective population sizes and overall lower genetic diversity. Parameter estimates from empirical datasets also agree with the general expectation that polygamic species with later onset of reproduction and long adult longevity exhibit higher variance in reproductive success, longer generation times and smaller effective population sizes. Since the signal of life history differences is observed in the genetic summary statistics, we argue that simulation‐ and model‐based multi‐species demographic inference will gain from the incorporation of life history information.
  4. Abstract The estimation of demographic parameters is a key component of evolutionary demography and conservation biology. Capture–mark–recapture methods have served as a fundamental tool for estimating demographic parameters. The accurate estimation of demographic parameters in capture–mark–recapture studies depends on accurate modeling of the observation process. Classic capture–mark–recapture models typically model the observation process as a Bernoulli or categorical trial with detection probability conditional on a marked individual's availability for detection (e.g., alive, or alive and present in a study area). Alternatives to this approach are underused, but may have great utility in capture–recapture studies. In this paper, we explore a simple concept: in the same way that counts contain more information about abundance than simple detection/non‐detection data, the number of encounters of individuals during observation occasions contains more information about the observation process than detection/non‐detection data for individuals during the same occasion. Rather than using Bernoulli or categorical distributions to estimate detection probability, we demonstrate the application of zero‐inflated Poisson and gamma‐Poisson distributions. The use of count distributions allows for inference on availability for encounter, as well as a wide variety of parameterizations for heterogeneity in the observation process. We demonstrate that this approach can accurately recover demographic and observation parameters in the presence of individual heterogeneity in detection probability and discuss some potential future extensions of this method.
  5. Abstract Motivation: Heritability, the proportion of variation in a trait that can be explained by genetic variation, is an important parameter in efforts to understand the genetic architecture of complex phenotypes as well as in the design and interpretation of genome-wide association studies. Attempts to understand the heritability of complex phenotypes attributable to genome-wide single nucleotide polymorphism (SNP) variation data have motivated the analysis of large datasets as well as the development of sophisticated tools to estimate heritability in these datasets. Linear mixed models (LMMs) have emerged as a key tool for heritability estimation where the parameters of the LMMs, i.e. the variance components, are related to the heritability attributable to the SNPs analyzed. Likelihood-based inference in LMMs, however, poses serious computational burdens. Results: We propose a scalable randomized algorithm for estimating variance components in LMMs. Our method is based on a method-of-moment estimator that has a runtime complexity O(NMB) for N individuals and M SNPs (where B is a parameter that controls the number of random matrix-vector multiplications). Further, by leveraging the structure of the genotype matrix, we can reduce the time complexity to O(NMB / max(log_3 N, log_3 M)). We demonstrate the scalability and accuracy of our method on simulated as well as on empirical data. On standard hardware, our method computes heritability on a dataset of 500 000 individuals and 100 000 SNPs in 38 min. Availability and implementation: The RHE-reg software is made freely available to the research community at: https://github.com/sriramlab/RHE-reg.