Title: A Mobile App That Addresses Interpretability Challenges in Machine Learning–Based Diabetes Predictions: Survey-Based User Study
Background: Machine learning approaches, including deep learning, have demonstrated remarkable effectiveness in the diagnosis and prediction of diabetes. However, these approaches often operate as opaque black boxes, leaving health care providers in the dark about the reasoning behind predictions. This opacity poses a barrier to the widespread adoption of machine learning in diabetes and health care, leading to confusion and eroding trust.
Objective: This study aimed to address this critical issue by developing and evaluating an explainable artificial intelligence (AI) platform, XAI4Diabetes, designed to empower health care professionals with a clear understanding of AI-generated predictions and recommendations for diabetes care. XAI4Diabetes not only delivers diabetes risk predictions but also furnishes easily interpretable explanations for complex machine learning models and their outcomes.
Methods: XAI4Diabetes features a versatile multimodule explanation framework that leverages machine learning, knowledge graphs, and ontologies. The platform comprises the following four essential modules: (1) knowledge base, (2) knowledge matching, (3) prediction, and (4) interpretation. By harnessing AI techniques, XAI4Diabetes forecasts diabetes risk and provides valuable insights into the prediction process and outcomes. A structured, survey-based user study assessed the app’s usability and influence on participants’ comprehension of machine learning predictions in real-world patient scenarios.
Results: A prototype mobile app was developed and subjected to usability studies and satisfaction surveys. The evaluation findings underscore the substantial improvement in medical professionals’ comprehension of key aspects, including the (1) diabetes prediction process, (2) data sets used for model training, (3) data features used, and (4) relative significance of different features in prediction outcomes. Most participants reported heightened understanding of and trust in AI predictions following their use of XAI4Diabetes. The satisfaction survey results further revealed a high level of overall user satisfaction with the tool.
Conclusions: This study introduces XAI4Diabetes, a versatile multi-model explainable prediction platform tailored to diabetes care. By enabling transparent diabetes risk predictions and delivering interpretable insights, XAI4Diabetes empowers health care professionals to comprehend the AI-driven decision-making process, thereby fostering transparency and trust. These advancements hold the potential to mitigate biases and facilitate the broader integration of AI in diabetes care.
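As a purely illustrative aid, the four-module design described in the abstract (knowledge base, knowledge matching, prediction, interpretation) could be pictured with a minimal Python sketch. Every name, weight, feature, and knowledge entry below is an assumption made for exposition, not the published XAI4Diabetes implementation.

```python
# Hypothetical sketch of a four-module explainable prediction pipeline:
# (1) knowledge base, (2) knowledge matching, (3) prediction, (4) interpretation.
from dataclasses import dataclass

KNOWLEDGE_BASE = {  # module 1: toy ontology-style facts about risk factors (illustrative only)
    "glucose": "Fasting plasma glucose >= 126 mg/dL is a diagnostic criterion for diabetes.",
    "bmi": "BMI >= 30 kg/m^2 is associated with elevated type 2 diabetes risk.",
    "age": "Risk of type 2 diabetes increases with age, especially after 45 years.",
}

@dataclass
class Patient:
    glucose: float  # mg/dL
    bmi: float      # kg/m^2
    age: float      # years

def match_knowledge(patient: Patient) -> dict:
    """Module 2: map each patient feature to the knowledge-base entry that explains it."""
    return {name: KNOWLEDGE_BASE[name] for name in vars(patient) if name in KNOWLEDGE_BASE}

def predict_risk(patient: Patient) -> tuple[float, dict]:
    """Module 3: toy linear risk score standing in for the trained ML model."""
    weights = {"glucose": 0.003, "bmi": 0.01, "age": 0.002}   # assumed, not learned
    contributions = {k: weights[k] * getattr(patient, k) for k in weights}
    score = min(1.0, sum(contributions.values()))
    return score, contributions

def interpret(patient: Patient) -> str:
    """Module 4: combine the prediction, per-feature contributions, and matched knowledge."""
    score, contributions = predict_risk(patient)
    facts = match_knowledge(patient)
    lines = [f"Predicted diabetes risk: {score:.2f}"]
    for name, contrib in sorted(contributions.items(), key=lambda kv: -kv[1]):
        lines.append(f"- {name}: contribution {contrib:.2f}; {facts[name]}")
    return "\n".join(lines)

print(interpret(Patient(glucose=145, bmi=31.5, age=52)))
```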
Award ID(s):
2218046 1722913
PAR ID:
10524600
Author(s) / Creator(s):
; ;
Publisher / Repository:
JMIR Formative Research
Date Published:
Journal Name:
JMIR Formative Research
Volume:
7
ISSN:
2561-326X
Page Range / eLocation ID:
e50328
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. We propose and test a novel graph learning-based explainable artificial intelligence (XAI) approach to address the challenge of developing explainable predictions of patient length of stay (LoS) in intensive care units (ICUs). Specifically, we address a notable gap in the literature on XAI methods that identify interactions between model input features to predict patient health outcomes. Our model intrinsically constructs a patient-level graph, which identifies the importance of feature interactions for prediction of health outcomes. It demonstrates state-of-the-art explanation capabilities based on identification of salient feature interactions compared with traditional XAI methods for prediction of LoS. We supplement our XAI approach with a small-scale user study, which demonstrates that our model can lead to greater user acceptance of artificial intelligence (AI) model-based decisions by contributing to greater interpretability of model predictions. Our model lays the foundation to develop interpretable, predictive tools that healthcare professionals can utilize to improve ICU resource allocation decisions and enhance the clinical relevance of AI systems in providing effective patient care. Although our primary research setting is the ICU, our graph learning model can be generalized to other healthcare contexts to accurately identify key feature interactions for prediction of other health outcomes, such as mortality, readmission risk, and hospitalizations. 
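A hedged sketch of the general idea in this abstract (not the authors' model): treat each clinical feature as a node in a patient-level graph and learn pairwise edge weights that can be read out as feature-interaction importance. The architecture, feature count, and length-of-stay target below are placeholders.

```python
# Illustrative "feature graph" for interaction-aware prediction of a continuous outcome
# such as ICU length of stay. A learned, row-normalized attention matrix plays the role
# of edge weights between feature nodes and doubles as an interaction-importance readout.
import torch
import torch.nn as nn

class FeatureInteractionGraph(nn.Module):
    def __init__(self, n_features: int, hidden: int = 16):
        super().__init__()
        self.embed = nn.Linear(1, hidden)          # embed each scalar feature as a node vector
        self.att = nn.Parameter(torch.zeros(n_features, n_features))  # learnable edge logits
        self.readout = nn.Linear(hidden, 1)        # predict the outcome from pooled node states

    def forward(self, x):                          # x: (batch, n_features)
        nodes = self.embed(x.unsqueeze(-1))        # (batch, n_features, hidden)
        edges = torch.softmax(self.att, dim=-1)    # row-normalized edge weights
        nodes = edges @ nodes                      # one round of message passing over the graph
        return self.readout(nodes.mean(dim=1)).squeeze(-1), edges

model = FeatureInteractionGraph(n_features=5)
x = torch.randn(8, 5)                              # toy batch: 8 patients, 5 features
y_hat, edge_importance = model(x)
print(y_hat.shape, edge_importance.shape)          # torch.Size([8]) torch.Size([5, 5])
```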
  2. Abstract. Objective: The use of electronic health records (EHRs) for clinical risk prediction is on the rise. However, in many practical settings, the limited availability of task-specific EHR data can restrict the application of standard machine learning pipelines. In this study, we investigate the potential of leveraging language models (LMs) as a means to incorporate supplementary domain knowledge for improving the performance of various EHR-based risk prediction tasks. Methods: We propose two novel LM-based methods, namely “LLaMA2-EHR” and “Sent-e-Med.” Our focus is on utilizing the textual descriptions within structured EHRs to make risk predictions about future diagnoses. We conduct a comprehensive comparison with previous approaches across various data types and sizes. Results: Experiments across 6 different methods and 3 separate risk prediction tasks reveal that employing LMs to represent structured EHRs, such as diagnostic histories, results in significant performance improvements when evaluated using standard metrics such as area under the receiver operating characteristic (ROC) curve and precision-recall (PR) curve. Additionally, they offer benefits such as few-shot learning, the ability to handle previously unseen medical concepts, and adaptability to various medical vocabularies. However, results may be sensitive to the specific prompt used. Conclusion: LMs encompass extensive embedded knowledge, making them valuable for the analysis of EHRs in the context of risk prediction. Nevertheless, it is important to exercise caution in their application, as ongoing safety concerns related to LMs persist and require continuous consideration.
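The core idea, sketched in hedged form below, is to represent a patient's structured history through the text of its code descriptions rather than one-hot codes. This is not the paper's LLaMA2-EHR or Sent-e-Med pipeline; the encoder name, diagnosis strings, and downstream label are placeholders.

```python
# Embed textual diagnosis descriptions with a pretrained transformer, then fit a simple
# classifier on the pooled embeddings to predict a future outcome.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")   # assumed generic encoder
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed_history(diagnosis_descriptions: list[str]) -> torch.Tensor:
    """Mean-pool token embeddings of the concatenated diagnosis descriptions."""
    text = "; ".join(diagnosis_descriptions)
    inputs = tokenizer(text, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state             # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)                         # (dim,)

# Toy training data: each patient is a list of diagnosis descriptions plus a binary label.
patients = [["essential hypertension", "type 2 diabetes mellitus"],
            ["acute upper respiratory infection"]]
labels = [1, 0]                                                  # e.g., a future diagnosis of interest

X = torch.stack([embed_history(p) for p in patients]).numpy()
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict_proba(X)[:, 1])                                # risk scores for the toy patients
```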
  3. Abstract. Recent advances in explainable artificial intelligence (XAI) methods show promise for understanding predictions made by machine learning (ML) models. XAI explains how the input features are relevant or important for the model predictions. We train linear regression (LR) and convolutional neural network (CNN) models to make 1-day predictions of sea ice velocity in the Arctic from inputs of present-day wind velocity and previous-day ice velocity and concentration. We apply XAI methods to the CNN and compare explanations to variance explained by LR. We confirm the feasibility of using a novel XAI method [i.e., global layerwise relevance propagation (LRP)] to understand ML model predictions of sea ice motion by comparing it to established techniques. We investigate a suite of linear, perturbation-based, and propagation-based XAI methods in both local and global forms. Outputs from different explainability methods are generally consistent in showing that wind speed is the input feature with the highest contribution to ML predictions of ice motion, and we discuss inconsistencies in the spatial variability of the explanations. Additionally, we show that the CNN relies on both linear and nonlinear relationships between the inputs and uses nonlocal information to make predictions. LRP shows that wind speed over land is highly relevant for predicting ice motion offshore. This provides a framework to show how knowledge of environmental variables (i.e., wind) on land could be useful for predicting other properties (i.e., sea ice velocity) elsewhere. Significance Statement: Explainable artificial intelligence (XAI) is useful for understanding predictions made by machine learning models. Our research establishes trustability in a novel implementation of an explainable AI method known as layerwise relevance propagation for Earth science applications. To do this, we provide a comparative evaluation of a suite of explainable AI methods applied to machine learning models that make 1-day predictions of Arctic sea ice velocity. We use explainable AI outputs to understand how the input features are used by the machine learning to predict ice motion. Additionally, we show that a convolutional neural network uses nonlinear and nonlocal information in making its predictions. We take advantage of the nonlocality to investigate the extent to which knowledge of wind on land is useful for predicting sea ice velocity elsewhere.
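For readers unfamiliar with layerwise relevance propagation, the textbook LRP-epsilon rule for a single dense layer is sketched below in numpy. This is a generic illustration of the propagation step, not the authors' global-LRP implementation for sea ice, and the toy input names are assumptions.

```python
# LRP-epsilon for one dense layer: redistribute the relevance assigned to the layer's
# outputs back onto its inputs, proportionally to each input's contribution.
import numpy as np

def lrp_epsilon(a, w, b, relevance_out, eps=1e-6):
    """a: (n_in,) activations entering the layer
    w: (n_in, n_out) weights, b: (n_out,) biases
    relevance_out: (n_out,) relevance assigned to the layer outputs
    returns: (n_in,) relevance redistributed to the inputs"""
    z = a @ w + b                                      # pre-activations, shape (n_out,)
    z = z + eps * np.where(z >= 0, 1.0, -1.0)          # stabilizer avoids division by zero
    s = relevance_out / z                              # (n_out,)
    return a * (w @ s)                                 # (n_in,)

# Toy example: 3 inputs (e.g., wind speed, prior ice speed, ice concentration), 2 outputs.
rng = np.random.default_rng(0)
a = np.array([1.2, 0.4, 0.8])
w = rng.normal(size=(3, 2))
b = np.zeros(2)
relevance_in = lrp_epsilon(a, w, b, relevance_out=np.array([1.0, 0.0]))
print(relevance_in, relevance_in.sum())                # input relevances approximately conserve the total
```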
  4. Introduction: Predictive models have been used to aid early diagnosis of polycystic ovary syndrome (PCOS), though existing models are based on small sample sizes and limited to fertility clinic populations. We built a predictive model using machine learning algorithms based on an outpatient population at risk for PCOS to predict risk and facilitate earlier diagnosis, particularly among those who meet diagnostic criteria but have not received a diagnosis. Methods: This is a retrospective cohort study from a safety-net hospital’s electronic health records (EHR) from 2003-2016. The study population included 30,601 women aged 18-45 years without concurrent endocrinopathy who had any visit to Boston Medical Center for primary care, obstetrics and gynecology, endocrinology, family medicine, or general internal medicine. Four prediction outcomes were assessed for PCOS. The first outcome was PCOS ICD-9 diagnosis, with additional model outcomes of algorithm-defined PCOS. The latter was based on the Rotterdam criteria and merging laboratory values, radiographic imaging, and ICD data from the EHR to define irregular menstruation, hyperandrogenism, and polycystic ovarian morphology on ultrasound. Results: We developed predictive models using four machine learning methods: logistic regression, support vector machine, gradient boosted trees, and random forests. Hormone values (follicle-stimulating hormone, luteinizing hormone, estradiol, and sex hormone binding globulin) were combined to create a multilayer perceptron score using a neural network classifier. Prediction of PCOS prior to clinical diagnosis in an out-of-sample test set of patients achieved an average AUC of 85%, 81%, 80%, and 82% in Models I, II, III, and IV, respectively. Significant positive predictors of PCOS diagnosis across models included hormone levels and obesity; negative predictors included gravidity and positive bHCG. Conclusion: Machine learning algorithms were used to predict PCOS based on a large at-risk population. This approach may guide early detection of PCOS within EHR-interfaced populations to facilitate counseling and interventions that may reduce long-term health consequences. Our model illustrates the potential benefits of an artificial intelligence-enabled provider assistance tool that can be integrated into the EHR to reduce delays in diagnosis. However, model validation in other hospital-based populations is necessary.
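The modeling setup named in this abstract can be sketched with scikit-learn as follows. The data are synthetic, the feature columns and labels are invented for the example, and the pipeline is only a rough analogue of the study's models, including the MLP-based hormone score.

```python
# Four classifier families from the abstract plus an MLP score built from hormone features,
# evaluated by AUC on a held-out split of synthetic data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(42)
n = 1000
X = rng.normal(size=(n, 6))        # columns: FSH, LH, estradiol, SHBG, BMI, age (toy values)
y = (X[:, 1] - X[:, 0] + 0.5 * X[:, 4] + rng.normal(scale=0.5, size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Neural-network score computed from the four hormone columns, appended as an extra feature.
hormone_cols = slice(0, 4)
mlp = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0))
mlp.fit(X_tr[:, hormone_cols], y_tr)
score_tr = mlp.predict_proba(X_tr[:, hormone_cols])[:, 1:]
score_te = mlp.predict_proba(X_te[:, hormone_cols])[:, 1:]

models = {
    "logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "support vector machine": make_pipeline(StandardScaler(), SVC(probability=True)),
    "gradient boosted trees": GradientBoostingClassifier(),
    "random forest": RandomForestClassifier(random_state=0),
}
for name, model in models.items():
    model.fit(np.hstack([X_tr, score_tr]), y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(np.hstack([X_te, score_te]))[:, 1])
    print(f"{name}: AUC = {auc:.2f}")
```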
  5. Abstract In recent years, predictive machine learning models have gained prominence across various scientific domains. However, their black-box nature necessitates establishing trust in them before accepting their predictions as accurate. One promising strategy involves employing explanation techniques that elucidate the rationale behind a model’s predictions in a way that humans can understand. However, assessing the degree of human interpretability of these explanations is a nontrivial challenge. In this work, we introduce interpretation entropy as a universal solution for evaluating the human interpretability of any linear model. Using this concept and drawing inspiration from classical thermodynamics, we present Thermodynamics-inspired Explainable Representations of AI and other black-box Paradigms, a method for generating optimally human-interpretable explanations in a model-agnostic manner. We demonstrate the wide-ranging applicability of this method by explaining predictions from various black-box model architectures across diverse domains, including molecular simulations, text, and image classification. 
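One way to read "interpretation entropy" for a linear explanation is as the Shannon entropy of the normalized absolute feature contributions, so that explanations dominated by a few features score as more interpretable. The sketch below follows that reading; the paper's exact definition may differ, and the example weights are invented.

```python
# Normalized Shannon entropy of a linear explanation's absolute weights:
# 0 means one dominant feature, 1 means all features contribute equally.
import numpy as np

def interpretation_entropy(weights: np.ndarray) -> float:
    p = np.abs(weights) / np.sum(np.abs(weights))
    p = p[p > 0]                                    # ignore zero-weight features
    return float(-(p * np.log(p)).sum() / np.log(len(weights)))

sparse_explanation = np.array([0.95, 0.03, 0.01, 0.01])   # one feature dominates
diffuse_explanation = np.array([0.25, 0.25, 0.25, 0.25])  # every feature matters equally
print(interpretation_entropy(sparse_explanation))          # low entropy: easier to interpret
print(interpretation_entropy(diffuse_explanation))         # 1.0: hardest to interpret
```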