Title: Dynamic predictions of postoperative complications from explainable, uncertainty-aware, and multi-task deep neural networks
Abstract: Accurate prediction of postoperative complications can inform shared decisions regarding prognosis, preoperative risk-reduction, and postoperative resource use. We hypothesized that multi-task deep learning models would outperform conventional machine learning models in predicting postoperative complications, and that integrating high-resolution intraoperative physiological time series would result in more granular and personalized health representations that would improve prognostication compared to preoperative predictions. In a longitudinal cohort study of 56,242 patients undergoing 67,481 inpatient surgical procedures at a university medical center, we compared deep learning models with random forests and XGBoost for predicting nine common postoperative complications using preoperative, intraoperative, and perioperative patient data. Our study indicated several significant results across experimental settings that suggest the utility of deep learning for capturing more precise representations of patient health for augmented surgical decision support. Multi-task learning improved efficiency by reducing computational resources without compromising predictive performance. Integrated gradients interpretability mechanisms identified potentially modifiable risk factors for each complication. Monte Carlo dropout methods provided a quantitative measure of prediction uncertainty that has the potential to enhance clinical trust. Multi-task learning, interpretability mechanisms, and uncertainty metrics demonstrated potential to facilitate effective clinical implementation.
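The abstract mentions Monte Carlo dropout as a source of prediction uncertainty. As a rough illustration only, the sketch below keeps dropout active at prediction time, repeats the forward pass many times, and reports the mean and spread of the sampled risks. The tiny network, its weights (`W_HIDDEN`, `W_OUT`), the input vector, and every function name are hypothetical; none of this is the authors' model or code.

```python
import math
import random
import statistics


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out, drop_rate, rng):
    """One stochastic forward pass with dropout left ON for the hidden layer."""
    hidden = []
    for weights in w_hidden:
        # ReLU hidden unit
        h = max(0.0, sum(w * xi for w, xi in zip(weights, x)))
        keep = rng.random() >= drop_rate
        # Inverted dropout: kept units are rescaled, dropped units are zeroed
        hidden.append(h / (1.0 - drop_rate) if keep else 0.0)
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))


def mc_dropout_predict(x, w_hidden, w_out, drop_rate=0.5, n_samples=200, seed=0):
    """Return (mean risk, standard deviation) over repeated stochastic passes."""
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    rng = random.Random(seed)
    samples = [forward(x, w_hidden, w_out, drop_rate, rng) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.pstdev(samples)


# Hypothetical weights and input, chosen only to make the sketch runnable.
W_HIDDEN = [[0.8, -0.4, 0.3], [-0.5, 0.9, 0.1], [0.2, 0.2, -0.7], [0.6, -0.1, 0.5]]
W_OUT = [1.2, -0.8, 0.5, 0.9]
x = [1.0, 0.5, 2.0]

mean_risk, risk_std = mc_dropout_predict(x, W_HIDDEN, W_OUT)
print(f"predicted risk: {mean_risk:.3f}, uncertainty (std): {risk_std:.3f}")
```

A wider spread across the sampled passes marks a prediction the model is less sure about; with `drop_rate=0.0` every pass is identical and the spread collapses to zero.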
Award ID(s):
1750192
PAR ID:
10401749
Journal Name:
Scientific Reports
Volume:
13
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This study tested the hypothesis that accuracy, discrimination, and precision in predicting postoperative complications improve when using both preoperative and intraoperative data input features versus preoperative data alone. Models that predict postoperative complications often ignore important intraoperative physiological changes. Incorporation of intraoperative physiological data may improve model performance. This retrospective cohort analysis included 52,529 inpatient surgeries at a single institution during a 5-year period. Random forest machine learning models in the validated MySurgeryRisk platform made patient-level predictions for three postoperative complications and mortality during hospital admission using electronic health record data and patient neighborhood characteristics. For each outcome, one model was trained with preoperative data alone and one model was trained with both preoperative and intraoperative data. Models were compared by accuracy, discrimination (expressed as AUROC), precision (expressed as AUPRC), and reclassification indices (NRI). Machine learning models incorporating both preoperative and intraoperative data had greater accuracy, discrimination, and precision than models using preoperative data alone for predicting all three postoperative complications (intensive care unit length of stay >48 hours, mechanical ventilation >48 hours, and neurological complications including delirium) and in-hospital mortality (accuracy: 88% vs. 77%, AUROC: 0.93 vs. 0.87, AUPRC: 0.21 vs. 0.15). Overall reclassification improvement was 2.9-10.0% for complications and 11.2% for in-hospital mortality. Incorporating both preoperative and intraoperative data significantly increased accuracy, discrimination, and precision for machine learning models predicting postoperative complications.
  2. Background: Although conventional prediction models for surgical patients often ignore intraoperative time-series data, deep learning approaches are well-suited to incorporate time-varying and non-linear data with complex interactions. Blood lactate concentration is one important clinical marker that can reflect the adequacy of systemic perfusion during cardiac surgery. During cardiac surgery and cardiopulmonary bypass, minute-level data is available on key parameters that affect perfusion. The goal of this study was to use machine learning and deep learning approaches to predict maximum blood lactate concentrations after cardiac surgery. We hypothesized that models using minute-level intraoperative data as inputs would have the best predictive performance. Methods: Adults who underwent cardiac surgery with cardiopulmonary bypass were eligible. The primary outcome was maximum lactate concentration within 24 h postoperatively. We considered three classes of predictive models, using the performance metric of mean absolute error across testing folds: (1) static models using baseline preoperative variables, (2) augmentation of the static models with intraoperative statistics, and (3) a dynamic approach that integrates preoperative variables with intraoperative time series data. Results: 2,187 patients were included. For three models that only used baseline characteristics (linear regression, random forest, artificial neural network) to predict maximum postoperative lactate concentration, the prediction error ranged from a median of 2.52 mmol/L (IQR 2.46, 2.56) to 2.58 mmol/L (IQR 2.54, 2.60). The inclusion of intraoperative summary statistics (including intraoperative lactate concentration) improved model performance, with the prediction error ranging from a median of 2.09 mmol/L (IQR 2.04, 2.14) to 2.12 mmol/L (IQR 2.06, 2.16). For two modelling approaches (recurrent neural network, transformer) that can utilize intraoperative time-series data, the lowest prediction error was obtained with a range of median 1.96 mmol/L (IQR 1.87, 2.05) to 1.97 mmol/L (IQR 1.92, 2.05). Intraoperative lactate concentration was the most important predictive feature based on Shapley additive values. Anemia and weight were also important predictors, but there was heterogeneity in the importance of other features. Conclusion: Postoperative lactate concentrations can be predicted using baseline and intraoperative data with moderate accuracy. These results reflect the value of intraoperative data in the prediction of clinically relevant outcomes to guide perioperative management.
  3. Pediatric epilepsy due to drug-resistant Focal Cortical Dysplasia (FCD) presents significant healthcare challenges. Precise preoperative identification of FCD lesions is imperative for surgical planning and patient outcomes. This paper presents a proof-of-concept for an integrated methodology that combines Electroencephalogram (EEG)-based functional connectivity analysis with Magnetic Resonance Imaging (MRI)-derived cortical thickness measurements to identify FCD lesions in pediatric epileptic patients. We examined a single-case clinical scenario from Oregon Health & Science University, consistently identifying the Caudal Middle Frontal (cMFG) region across both EEG and MRI modalities, a finding that was confirmed in the postoperative MRI scan. This cross-validation underscores the potential precision of our approach in pinpointing the surgical target region. Despite being constrained by its preliminary nature, our research offers a valuable foundation for a personalized, rigorous method of detecting the location of FCD lesions. It holds significant clinical implications for managing FCD-related epilepsy. It also portends broader applications in neurology and precision medicine. Nonetheless, further large-scale studies are needed to validate and fine-tune our methodology. Clinical Relevance - This study offers clinicians an advanced, integrated approach to preoperative assessment of FCD lesions, potentially improving the precision of surgical planning in pediatric epilepsy.
  4. BACKGROUND: Classification of perioperative risk is important for patient care, resource allocation, and guiding shared decision-making. Using discriminative features from the electronic health record (EHR), machine-learning algorithms can create digital phenotypes among heterogeneous populations, representing distinct patient subpopulations grouped by shared characteristics, from which we can personalize care, anticipate clinical care trajectories, and explore therapies. We hypothesized that digital phenotypes in preoperative settings are associated with postoperative adverse events including in-hospital and 30-day mortality, 30-day surgical redo, intensive care unit (ICU) admission, and hospital length of stay (LOS). METHODS: We identified all laminectomies, colectomies, and thoracic surgeries performed over a 9-year period from a large hospital system. Seventy-seven readily extractable preoperative features were first selected from clinical consensus, including demographics, medical history, and lab results. Three surgery-specific datasets were built and split into derivation and validation cohorts using chronological occurrence. Consensus k-means clustering was performed independently on each derivation cohort, from which phenotypes’ characteristics were explored. Cluster assignments were used to train a random forest model to assign patient phenotypes in validation cohorts. We reconducted descriptive analyses on validation cohorts to confirm the similarity of patient characteristics with derivation cohorts, and quantified the association of each phenotype with postoperative adverse events by using the area under receiver operating characteristic curve (AUROC). We compared our approach to the American Society of Anesthesiologists (ASA) score alone and investigated a combination of our phenotypes with the ASA score. RESULTS: A total of 7251 patients met inclusion criteria, of which 2770 were held out in a validation dataset based on chronological occurrence. Using segmentation metrics and clinical consensus, 3 distinct phenotypes were created for each surgery. The main features used for segmentation included urgency of the procedure, preoperative LOS, age, and comorbidities. The most relevant characteristics varied for each of the 3 surgeries. Low-risk phenotype alpha was the most common (2039 of 2770, 74%), while high-risk phenotype gamma was the rarest (302 of 2770, 11%). Adverse outcomes progressively increased from phenotypes alpha to gamma, including 30-day mortality (0.3%, 2.1%, and 6.0%, respectively), in-hospital mortality (0.2%, 2.3%, and 7.3%), and prolonged hospital LOS (3.4%, 22.1%, and 25.8%). When combined with the ASA score, digital phenotypes achieved higher AUROC than the ASA score alone (hospital mortality: 0.91 vs 0.84; prolonged hospitalization: 0.80 vs 0.71). CONCLUSIONS: For 3 frequently performed surgeries, we identified 3 digital phenotypes. The typical profiles of each phenotype were described and could be used to anticipate adverse postoperative events.
  5. Advancements in computing and data from the near universal acceptance and implementation of electronic health records have been formative for the growth of personalized, automated, and immediate patient care models that were not previously possible. Artificial intelligence (AI) and its subfields of machine learning, reinforcement learning, and deep learning are well-suited to deal with such data. The authors in this paper review current applications of AI in clinical medicine and discuss the most likely future contributions that AI will provide to the healthcare industry. For instance, in response to the need to risk stratify patients, appropriately cultivated and curated data can assist decision-makers in stratifying preoperative patients into risk categories, as well as categorizing the severity of ailments and health for non-operative patients admitted to hospitals. Previous overt, traditional vital signs and laboratory values that are used to signal alarms for an acutely decompensating patient may be replaced by continuously monitoring and updating AI tools that can pick up early imperceptible patterns predicting subtle health deterioration. Furthermore, AI may help overcome challenges with multiple outcome optimization limitations or sequential decision-making protocols that limit individualized patient care. Despite these tremendously helpful advancements, the data sets that AI models train on and develop have the potential for misapplication and thereby create concerns for application bias. Subsequently, the mechanisms governing this disruptive innovation must be understood by clinical decision-makers to prevent unnecessary harm. This need will force physicians to change their educational infrastructure to facilitate understanding AI platforms, modeling, and limitations to best acclimate practice in the age of AI. By performing a thorough narrative review, this paper examines these specific AI applications, limitations, and requisites while reviewing a few examples of major data sets that are being cultivated and curated in the US.
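Several of the abstracts listed above report discrimination as AUROC. For readers unfamiliar with the metric, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen positive case is scored higher than a randomly chosen negative case, with ties counted as one half. The function name `auroc` and the toy labels and scores are illustrative assumptions and are not taken from any of the studies.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = complication occurred, 0 = no complication.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(auroc(labels, scores))
```

The double loop is quadratic in the number of cases, which is fine for illustration; a rank-based formulation gives the same value more efficiently on large cohorts.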