Title: Definitions, methods, and applications in interpretable machine learning
Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related and what common concepts can be used to evaluate them. We aim to address these concerns by defining interpretability in the context of machine learning and introducing the predictive, descriptive, relevant (PDR) framework for discussing interpretations. The PDR framework provides 3 overarching desiderata for evaluation: predictive accuracy, descriptive accuracy, and relevancy, with relevancy judged relative to a human audience. Moreover, to help manage the deluge of interpretation methods, we introduce a categorization of existing techniques into model-based and post hoc categories, with subgroups including sparsity, modularity, and simulatability. To demonstrate how practitioners can use the PDR framework to evaluate and understand interpretations, we provide numerous real-world examples. These examples highlight the often underappreciated role played by human audiences in discussions of interpretability. Finally, based on our framework, we discuss limitations of existing methods and directions for future work. We hope that this work will provide a common vocabulary that will make it easier for both practitioners and researchers to discuss and choose from the full range of interpretation methods.
Journal Name: Proceedings of the National Academy of Sciences
Page Range / eLocation ID: 22071 to 22080
Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract Background

    Advanced machine learning models have received wide attention for assisting medical decision making because of the greater accuracy they can achieve. However, their limited interpretability is a barrier to adoption by practitioners. Recent advances in interpretable machine learning tools allow us to look inside the black box of advanced prediction methods and extract interpretable models while maintaining similar prediction accuracy, but few studies have investigated the specific problem of hospital readmission prediction in this spirit.


    Our goal is to develop a machine-learning (ML) algorithm that can predict 30- and 90-day hospital readmissions as accurately as black box algorithms while providing medically interpretable insights into readmission risk factors. Leveraging a state-of-the-art interpretable ML model, we use a two-step Extracted Regression Tree approach to achieve this goal. In the first step, we train a black box prediction algorithm. In the second step, we extract a regression tree from the output of the black box algorithm that allows direct interpretation of medically relevant risk factors. We use data from a large teaching hospital in Asia to train the ML model and verify our two-step approach.


    The two-step method achieves prediction performance similar to that of the best black box model, such as neural networks, as measured by three metrics: accuracy, the Area Under the Curve (AUC), and the Area Under the Precision-Recall Curve (AUPRC), while maintaining interpretability. Further, to examine whether the prediction results match known medical insights (i.e., whether the model is truly interpretable and produces reasonable results), we show that key readmission risk factors extracted by the two-step approach are consistent with those found in the medical literature.


    The proposed two-step approach yields meaningful prediction results that are both accurate and interpretable. This study suggests that the two-step approach is a viable means of improving trust in machine-learning-based models for predicting readmissions in clinical practice.

  2. This paper describes a process that integrates behavioral and decision science methods to design and evaluate interventions to disrupt illicit behaviors. We developed this process by extending a framework used to study systems with uncertain outcomes, where only partial information is observable, and wherein there are multiple participating parties with competing goals. The extended framework that we propose builds from artefactual data collection, thematic analysis, and descriptive analysis, toward predictive modeling and agent-based modeling. We use agent-based modeling to characterize and predict interactions between system participants for the purpose of improving our understanding of interventional targets in a virtual environment before piloting them in the field. We apply our extended framework to an exploratory case study that examines the potential of worker centers as a venue for deploying interventions to address labor exploitation and human trafficking. This case study focuses on reducing wage theft, the most prevalent form of exploitation experienced by day laborers and applies the first three steps of the extended framework. Specifically, the case study makes a preliminary assessment of two types of social interventions designed to disrupt exploitative processes and improve the experiences of day laborers, namely: (1) advocates training day laborers about their workers’ rights and options that they have for addressing wage theft and (2) media campaigns designed to disseminate similar educational messages about workers’ rights and options to address wage theft through broadcast channels. Applying the extended framework to this case study of day laborers at a worker center demonstrates how digital technology could be used to monitor, evaluate, and support collaborations between worker center staff and day laborers. 
Ideally, these collaborations could be improved to mitigate the risks and costs of wage theft, build trust between worker center stakeholders, and address communication challenges between day laborers and employers, in the context of temporary work. Based on the application of the extended framework to this case study of worker center day laborers, we discuss how next steps in the research framework should prioritize understanding how and why employers make decisions to participate in wage theft and the potential for restorative justice and equity matching as a relationship model for employers and laborers in a well-being economy. 
  3. Deep learning models have been studied to forecast human events using vast volumes of data, yet they still cannot be trusted in certain applications such as healthcare and disaster assistance due to the lack of interpretability. Providing explanations for event predictions not only helps practitioners understand the underlying mechanism of prediction behavior but also enhances the robustness of event analysis. Improving the transparency of event prediction models is challenging given the following factors: (i) multilevel features exist in event data which creates a challenge to cross-utilize different levels of data; (ii) features across different levels and time steps are heterogeneous and dependent; and (iii) static model-level interpretations cannot be easily adapted to event forecasting given the dynamic and temporal characteristics of the data. Recent interpretation methods have proven their capabilities in tasks that deal with graph-structured or relational data. In this paper, we present a Contextualized Multilevel Feature learning framework, CMF, for interpretable temporal event prediction. It consists of a predictor for forecasting events of interest and an explanation module for interpreting model predictions. We design a new context-based feature fusion method to integrate multiple levels of heterogeneous features. We also introduce a temporal explanation module to determine sequences of text and subgraphs that have crucial roles in a prediction. We conduct extensive experiments on several real-world datasets of political and epidemic events. We demonstrate that the proposed method is competitive compared with the state-of-the-art models while possessing favorable interpretation capabilities. 
  4.
    The prevalence of mobile phones and wearable devices enables the passive capturing and modeling of human behavior at an unprecedented resolution and scale. Past research has demonstrated the capability of mobile sensing to model aspects of physical health, mental health, education, and work performance, etc. However, most of the algorithms and models proposed in previous work follow a one-size-fits-all (i.e., population modeling) approach that looks for common behaviors amongst all users, disregarding the fact that individuals can behave very differently, resulting in reduced model performance. Further, black-box models are often used that do not allow for interpretability and human behavior understanding. We present a new method to address the problems of personalized behavior classification and interpretability, and apply it to depression detection among college students. Inspired by the idea of collaborative-filtering, our method is a type of memory-based learning algorithm. It leverages the relevance of mobile-sensed behavior features among individuals to calculate personalized relevance weights, which are used to impute missing data and select features according to a specific modeling goal (e.g., whether the student has depressive symptoms) in different time epochs, i.e., times of the day and days of the week. It then compiles features from epochs using majority voting to obtain the final prediction. We apply our algorithm on a depression detection dataset collected from first-year college students with low data-missing rates and show that our method outperforms the state-of-the-art machine learning model by 5.1% in accuracy and 5.5% in F1 score. We further verify the pipeline-level generalizability of our approach by achieving similar results on a second dataset, with an average improvement of 3.4% across performance metrics. 
Beyond achieving better classification performance, our approach can also generate personalized interpretations of the models for each individual. These interpretations are supported by existing depression-related literature and can potentially inspire automated and personalized depression intervention design in the future.
  5. Abstract Why the new findings matter

    The process of teaching and learning is complex, multifaceted and dynamic. This paper contributes a seminal resource to highlight the digitisation of the educational sciences by demonstrating how new machine learning methods can be effectively and reliably used in research, education and practical application.

    Implications for educational researchers and policy makers

    The progressing digitisation of societies around the globe and the impact of the SARS‐COV‐2 pandemic have highlighted the vulnerabilities and shortcomings of educational systems. These developments have shown the necessity to provide effective educational processes that can support sometimes overwhelmed teachers to digitally impart knowledge on the plan of many governments and policy makers. Educational scientists, corporate partners and stakeholders can make use of machine learning techniques to develop advanced, scalable educational processes that account for individual needs of learners and that can complement and support existing learning infrastructure. The proper use of machine learning methods can contribute essential applications to the educational sciences, such as (semi‐)automated assessments, algorithmic‐grading, personalised feedback and adaptive learning approaches. However, these promises are strongly tied to an at least basic understanding of the concepts of machine learning and a degree of data literacy, which has to become the standard in education and the educational sciences.

    Demonstrating both the promises and the challenges that are inherent to the collection and the analysis of large educational data with machine learning, this paper covers the essential topics that their application requires and provides easy‐to‐follow resources and code to facilitate the process of adoption.
