
Title: Optimization framework for patient‐specific modeling under uncertainty

Estimating the parameters of a patient‐specific computational model relies on data that are often unreliable and ill‐suited to a deterministic approach. We develop an optimization‐based uncertainty quantification framework for probabilistic model tuning that discovers the model input distributions that generate target output distributions. Probabilistic sampling is performed using a surrogate model for computational efficiency, and a general distribution parameterization is used to describe each input. The approach is tested on seven patient‐specific modeling examples using CircAdapt, a cardiovascular circulatory model. Six examples are synthetic, aiming to match the output distributions generated using known reference input distributions, while the seventh example uses real‐world patient data for the output distributions. Our results demonstrate accurate reproduction of the target output distributions, with correct recovery of the reference inputs for the six synthetic examples. Our proposed approach is suitable for determining the parameter distributions of patient‐specific models with uncertain data and can be used to gain insight into the sensitivity of the model parameters to the measured data.
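
The idea of tuning input distributions so that sampled outputs match a target distribution can be illustrated with a toy problem. The following is a minimal sketch, not the authors' implementation: the linear `surrogate`, the normal (mean, standard deviation) input parameterization, the moment-matching objective, and the simple pattern search are all assumptions made for illustration, and none of the function names come from the paper or from CircAdapt.

```python
import random
import statistics


def surrogate(x):
    """Hypothetical cheap stand-in for an expensive simulator (one input, one output)."""
    return 2.0 * x + 1.0


def output_moments(mu, sigma, z):
    """Sample inputs x = mu + sigma * z, push them through the surrogate,
    and return the mean and standard deviation of the outputs."""
    ys = [surrogate(mu + sigma * zi) for zi in z]
    return statistics.fmean(ys), statistics.pstdev(ys)


def mismatch(params, target, z):
    """Squared distance between the sampled output moments and the target moments."""
    mu, sigma = params
    mean, std = output_moments(mu, sigma, z)
    return (mean - target[0]) ** 2 + (std - target[1]) ** 2


def fit_input_distribution(target, z, start=(0.0, 1.0), step=1.0, tol=1e-6):
    """Derivative-free pattern search over (mu, sigma); assumed here for simplicity."""
    best = list(start)
    best_val = mismatch(best, target, z)
    while step > tol:
        improved = False
        for i in range(2):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                if trial[1] <= 0.0:  # keep the standard deviation positive
                    continue
                val = mismatch(trial, target, z)
                if val < best_val:
                    best, best_val, improved = trial, val, True
        if not improved:
            step /= 2.0  # refine the search once no neighbor improves
    return tuple(best), best_val


# Fixed standard-normal draws (common random numbers) keep the objective deterministic.
rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(500)]

# Synthetic test: the target comes from a known reference input distribution.
target = output_moments(3.0, 0.5, z)
fitted, loss = fit_input_distribution(target, z)
print(f"fitted mu={fitted[0]:.4f}, sigma={fitted[1]:.4f}, loss={loss:.2e}")
```

As in the synthetic examples described above, the target here is generated from a known reference input distribution, so the fitted parameters can be checked against the reference values.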

Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
International Journal for Numerical Methods in Biomedical Engineering
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Nonlinear response history analysis (NLRHA) is generally considered to be a reliable and robust method to assess the seismic performance of buildings under strong ground motions. While NLRHA is fairly straightforward to evaluate individual structures for a select set of ground motions at a specific building site, it becomes less practical for performing large numbers of analyses to evaluate either (1) multiple models of alternative design realizations with a site‐specific set of ground motions, or (2) individual archetype building models at multiple sites with multiple sets of ground motions. In this regard, surrogate models offer an alternative to running repeated NLRHAs for variable design realizations or ground motions. In this paper, a recently developed surrogate modeling technique, called probabilistic learning on manifolds (PLoM), is presented to estimate structural seismic response. Essentially, the PLoM method provides an efficient stochastic model to develop mappings between random variables, which can then be used to efficiently estimate the structural responses for systems with variations in design/modeling parameters or ground motion characteristics. The PLoM algorithm is introduced and then used in two case studies of 12‐story buildings for estimating probability distributions of structural responses. The first example focuses on the mapping between variable design parameters of a multidegree‐of‐freedom analysis model and its peak story drift and acceleration responses. The second example applies the PLoM technique to estimate structural responses for variations in site‐specific ground motion characteristics. In both examples, training data sets are generated for orthogonal input parameter grids, and test data sets are developed for input parameters with prescribed statistical distributions. Validation studies are performed to examine the accuracy and efficiency of the PLoM models. Overall, both examples show good agreement between the PLoM model estimates and verification data sets. Moreover, in contrast to other common surrogate modeling techniques, the PLoM model is able to preserve correlation structure between peak responses. Parametric studies are conducted to understand the influence of different PLoM tuning parameters on its prediction accuracy.

  2.
    With the increasing adoption of predictive models trained using machine learning across a wide range of high-stakes applications, e.g., health care, security, criminal justice, finance, and education, there is a growing need for effective techniques for explaining such models and their predictions. We aim to address this problem in settings where the predictive model is a black box; that is, we can only observe the response of the model to various inputs, but have no knowledge of the internal structure of the predictive model, its parameters, the objective function, or the algorithm used to optimize the model. We reduce the problem of interpreting a black box predictive model to that of estimating the causal effect of each model input on the model output, from observations of the model inputs and the corresponding outputs. We estimate these causal effects using variants of the Rubin-Neyman potential outcomes framework for estimating causal effects from observational data. We show how the resulting causal attribution of responsibility for model output to the different model inputs can be used to interpret the predictive model and to explain its predictions. We present results of experiments that demonstrate the effectiveness of our approach to the interpretation of black box predictive models via causal attribution in the case of deep neural network models trained on one synthetic data set (where the input variables that impact the output variable are known by design) and two real-world data sets: handwritten digit classification and Parkinson's disease severity prediction. Because our approach does not require knowledge of the predictive model algorithm and makes no assumptions about the black box predictive model except that its input-output responses be observable, it can be applied, in principle, to any black box predictive model.
  3.
    For over 40 yr, the global centroid-moment tensor (GCMT) project has determined location and source parameters for globally recorded earthquakes larger than magnitude 5.0. The GCMT database remains a trusted staple for the geophysical community. Its point-source moment-tensor solutions are the result of inversions that model long-period observed seismic waveforms via normal-mode summation for a 1-D reference earth model, augmented by path corrections to capture 3-D variations in surface wave phase speeds, and to account for crustal structure. While this methodology remains essentially unchanged for the ongoing GCMT catalogue, source inversions based on waveform modelling in low-resolution 3-D earth models have revealed small but persistent biases in the standard modelling approach. Keeping pace with the increased capacity and demands of global tomography requires a revised catalogue of centroid-moment tensors (CMT), automatically and reproducibly computed using Green's functions from a state-of-the-art 3-D earth model. In this paper, we modify the current procedure for the full-waveform inversion of seismic traces for the six moment-tensor parameters, centroid latitude, longitude, depth and centroid time of global earthquakes. We take the GCMT solutions as a point of departure but update them to account for the effects of a heterogeneous earth, using the global 3-D wave speed model GLAD-M25. We generate synthetic seismograms from Green's functions computed by the spectral-element method in the 3-D model, select observed seismic data and remove their instrument response, process synthetic and observed data, select segments of observed and synthetic data based on similarity, and invert for new model parameters of the earthquake’s centroid location, time and moment tensor. The events in our new, preliminary database containing 9382 global event solutions, called CMT3D for ‘3-D centroid-moment tensors’, are on average 4 km shallower, about 1 s earlier, about 5 per cent larger in scalar moment, and more double-couple in nature than in the GCMT catalogue. We discuss in detail the geographical and statistical distributions of the updated solutions, and place them in the context of earlier work. We plan to disseminate our CMT3D solutions via the online ShakeMovie platform.

  4. Key points

    Right heart catheterization data from clinical records of heart transplant patients are used to identify patient‐specific models of the cardiovascular system.

    These patient‐specific cardiovascular models represent a snapshot of cardiovascular function at a given post‐transplant recovery time point.

    This approach is used to describe cardiac function in 10 heart transplant patients, five of whom had multiple right heart catheterizations, allowing an assessment of cardiac function over time.

    These patient‐specific models are used to predict cardiovascular function in the form of right and left ventricular pressure‐volume loops and ventricular power, an important metric in the clinical assessment of cardiac function.

    Outcomes for the longitudinally tracked patients show that our approach was able to identify the one patient from the group of five that exhibited post‐transplant cardiovascular complications.


    Heart transplant patients are followed with periodic right heart catheterizations (RHCs) to identify post‐transplant complications and guide treatment. Post‐transplant positive outcomes are associated with a steady reduction of right ventricular and pulmonary arterial pressures, toward normal levels of right‐side pressure (about 20 mmHg) measured by RHC. This study shows that more information about patient progression is obtained by combining standard RHC measures with mechanistic computational cardiovascular system models. The purpose of this study is twofold: to understand how cardiovascular system models can be used to represent a patient's cardiovascular state, and to use these models to track post‐transplant recovery and outcome. To obtain reliable parameter estimates comparable within and across datasets, we use sensitivity analysis, parameter subset selection, and optimization to determine patient‐specific mechanistic parameters that can be reliably extracted from the RHC data. Patient‐specific models are identified for 10 patients from their first post‐transplant RHC, and longitudinal analysis is carried out for five patients. Results of the sensitivity analysis and subset selection show that we can reliably estimate seven non‐measurable quantities; namely, ventricular diastolic relaxation, systemic resistance, pulmonary venous elastance, pulmonary resistance, pulmonary arterial elastance, pulmonary valve resistance and systemic arterial elastance. Changes in parameters and predicted cardiovascular function post‐transplant are used to evaluate the cardiovascular state during recovery of five patients. Of these five patients, only one showed inconsistent trends during recovery in ventricular pressure–volume relationships and power output. At the four‐year post‐transplant time point this patient exhibited biventricular failure along with graft dysfunction while the remaining four exhibited no cardiovascular complications.

  5. Background

    Metamodels can address some of the limitations of complex simulation models by formulating a mathematical relationship between input parameters and simulation model outcomes. Our objective was to develop and compare the performance of a machine learning (ML)–based metamodel against a conventional metamodeling approach in replicating the findings of a complex simulation model.


    We constructed 3 ML-based metamodels using random forest, support vector regression, and artificial neural networks and a linear regression-based metamodel from a previously validated microsimulation model of the natural history of hepatitis C virus (HCV) consisting of 40 input parameters. Outcomes of interest included societal costs and quality-adjusted life-years (QALYs), the incremental cost-effectiveness ratio (ICER) of HCV treatment versus no treatment, cost-effectiveness acceptability curve (CEAC), and expected value of perfect information (EVPI). We evaluated metamodel performance using root mean squared error (RMSE) and Pearson’s R² on the normalized data.


    The R² values for the linear regression metamodel for QALYs without treatment, QALYs with treatment, societal cost without treatment, societal cost with treatment, and ICER were 0.92, 0.98, 0.85, 0.92, and 0.60, respectively. The corresponding R² values for our ML-based metamodels were 0.96, 0.97, 0.90, 0.95, and 0.49 for support vector regression; 0.99, 0.83, 0.99, 0.99, and 0.82 for artificial neural network; and 0.99, 0.99, 0.99, 0.99, and 0.98 for random forest. Similar trends were observed for RMSE. The CEAC and EVPI curves produced by the random forest metamodel matched the results of the simulation output more closely than the linear regression metamodel.


    ML-based metamodels generally outperformed traditional linear regression metamodels at replicating results from complex simulation models, with random forest metamodels performing best.


    Decision-analytic models are frequently used by policy makers and other stakeholders to assess the impact of new medical technologies and interventions. However, complex models can impose limitations on conducting probabilistic sensitivity analysis and value-of-information analysis, and may not be suitable for developing online decision-support tools. Metamodels, which accurately formulate a mathematical relationship between input parameters and model outcomes, can replicate complex simulation models and address these limitations. The machine learning–based random forest model can outperform linear regression in replicating the findings of a complex simulation model. Such a metamodel can be used for conducting cost-effectiveness and value-of-information analyses or developing online decision-support tools.
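
    The metamodel comparison in item 5 is scored with RMSE and Pearson's R² on normalized data. The following is a minimal sketch of those two metrics, not the study's code: the min-max normalization is an assumption (the abstract does not say which normalization was used), and the four data points are made-up illustrative values rather than study results.

```python
import math


def min_max_normalize(values, lo, hi):
    """Scale values to [0, 1] using reference bounds (assumed normalization)."""
    return [(v - lo) / (hi - lo) for v in values]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def pearson_r2(actual, predicted):
    """Square of the Pearson correlation coefficient."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_a * var_p)


# Made-up illustrative outputs: simulation model versus metamodel prediction.
simulation = [10.0, 20.0, 30.0, 40.0]
metamodel = [12.0, 18.0, 33.0, 39.0]

# Normalize both series with the simulation's range so errors are comparable across outcomes.
lo, hi = min(simulation), max(simulation)
sim_n = min_max_normalize(simulation, lo, hi)
meta_n = min_max_normalize(metamodel, lo, hi)

error = rmse(sim_n, meta_n)
r2 = pearson_r2(sim_n, meta_n)
print(f"RMSE={error:.4f}, R2={r2:.4f}")
```

    Because Pearson's R² is unchanged by affine rescaling, only the RMSE depends on the normalization chosen here.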
