

Title: Interpretable Machine Learning: Moving from mythos to diagnostics
The emergence of machine learning as a society-changing technology in the past decade has triggered concerns about people's inability to understand the reasoning of increasingly complex models. The field of IML (interpretable machine learning) grew out of these concerns, with the goal of empowering various stakeholders to tackle use cases, such as building trust in models, performing model debugging, and generally informing real human decision-making.
Award ID(s): 2112471
NSF-PAR ID: 10326123
Author(s) / Creator(s):
Date Published:
Journal Name: Queue
Volume: 19
Issue: 6
ISSN: 1542-7730
Page Range / eLocation ID: 28 to 56
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Base metal electrode (BME) multilayer ceramic capacitors (MLCCs) are widely used in aerospace, medical, military, and communication applications, emphasizing the need for high reliability. Ongoing advances in BaTiO3-based MLCC technology have enabled further miniaturization and higher capacitive volumetric density for both low- and high-voltage devices. However, concerns persist regarding infant-mortality failures and long-term reliability under higher fields and temperatures. Addressing these concerns requires a thorough understanding of the mechanisms underlying insulation-resistance degradation, along with effective screening procedures during MLCC production and more accurate mean-time-to-failure (MTTF) predictions. This article reviews our findings on how the burn-in test, a common quality-control process, affects the dynamics of oxygen vacancies within BME MLCCs; these findings reveal that the burn-in test has a negative impact on the lifetime and reliability of BME MLCCs. The limitations of existing lifetime-prediction models for BME MLCCs are also discussed, motivating improved MTTF predictions through a newly developed physics-based machine learning model that overcomes those limitations (a schematic sketch of the general approach follows this entry). While data limitations remain a challenge, the physics-based machine learning approach offers promising results for MTTF prediction in MLCCs, contributing to improved lifetime predictions. Finally, the article acknowledges the limitations of relying on MTTF alone to characterize MLCC lifetime and emphasizes the importance of developing comprehensive models that predict the entire distribution of failures.
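The abstract above does not give implementation details of the physics-based machine learning model; purely as a rough, hypothetical illustration of the general idea, the sketch below pairs an Eyring/Prokopowicz-Vaskas-style acceleration feature with an off-the-shelf regressor on synthetic data. The voltage exponent (n = 3), activation energy (1.2 eV), stress ranges, and all data are assumptions, not the authors' model.

```python
# Rough, hypothetical sketch of a "physics-informed feature + data-driven regressor"
# approach to MTTF prediction. The acceleration form, exponent (n = 3), activation
# energy (1.2 eV), and all data are invented; this is not the model from the article.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
k_B = 8.617e-5  # Boltzmann constant in eV/K

# Synthetic stress conditions for a set of highly accelerated life tests.
n = 500
voltage = rng.uniform(50, 200, n)        # applied voltage (V)
temperature = rng.uniform(358, 438, n)   # absolute temperature (K)
thickness = rng.uniform(5, 15, n)        # dielectric thickness (um)

# Prokopowicz-Vaskas / Eyring-style acceleration: MTTF ~ V^-n * exp(Ea / (k_B * T)).
accel = (voltage / 100.0) ** -3 * np.exp(1.2 / (k_B * temperature))

# Synthetic "observed" log-MTTF standing in for measured failure data.
log_mttf = np.log(accel) + 0.1 * thickness + rng.normal(0.0, 0.3, n)

# Feed both the raw conditions and the physics-derived feature to the regressor.
X = np.column_stack([voltage, temperature, thickness, np.log(accel)])
X_train, X_test, y_train, y_test = train_test_split(X, log_mttf, random_state=0)

model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
print("held-out R^2:", round(model.score(X_test, y_test), 3))
```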
  2. Historically, two primary criticisms statisticians have had of machine learning and deep neural models are their lack of uncertainty quantification and their inability to do inference (i.e., to explain which inputs are important). Explainable AI has developed over the last few years as a sub-discipline of computer science and machine learning to mitigate these concerns (as well as concerns about fairness and transparency in deep modeling). In this article, our focus is on explaining which inputs are important in models for predicting environmental data. In particular, we focus on three general methods for explainability that are model agnostic and thus applicable across a breadth of models without internal explainability: “feature shuffling”, “interpretable local surrogates”, and “occlusion analysis”. We describe particular implementations of each (a toy illustration of feature shuffling follows this entry) and illustrate their use with a variety of models, all applied to the problem of long-lead forecasting of monthly soil moisture in the North American corn belt given sea surface temperature anomalies in the Pacific Ocean.
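As a concrete illustration of the first of the three model-agnostic methods named above, the sketch below applies scikit-learn's permutation-importance routine ("feature shuffling") to a generic random forest. It is not the authors' implementation, and the synthetic regression data only stand in schematically for the sea surface temperature / soil moisture setting.

```python
# Toy illustration of "feature shuffling" (permutation importance), one of the three
# model-agnostic methods named above. The data are synthetic placeholders, not the
# sea surface temperature / soil moisture fields from the article.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=400, n_features=8, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Shuffle each input column in turn and record the drop in held-out score;
# larger drops indicate inputs the fitted model relies on more heavily.
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f} "
          f"+/- {result.importances_std[i]:.3f}")
```

Interpretable local surrogates and occlusion analysis follow the same model-agnostic pattern: perturb or locally approximate the inputs and observe how the fitted model's predictions respond.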
  3. Machine learning models that sense human speech, body placement, and other key features are commonplace in human-robot interaction. However, deploying such models is not without risk. Research in machine learning security examines how these models can be exploited and the risks associated with those exploits. Unfortunately, the threat models produced by machine learning security research do not incorporate the rich sociotechnical underpinnings of the defenses they propose; as a result, efforts to improve the security of machine learning models may actually widen performance gaps across demographic groups, yielding systems whose risk mitigation works better for some groups than for others. In this work, we outline why current approaches to machine learning security raise DEI concerns for the human-robot interaction community and where there are open areas for collaboration.
  4. Humans are the final decision makers in critical tasks that involve ethical and legal concerns, ranging from recidivism prediction to medical diagnosis to fighting fake news. Although machine learning models can sometimes achieve impressive performance in these tasks, the tasks are not amenable to full automation. To realize the potential of machine learning for improving human decisions, it is important to understand how assistance from machine learning models affects human performance and human agency. In this paper, we use deception detection as a testbed and investigate how we can harness explanations and predictions of machine learning models to improve human performance while retaining human agency. We propose a spectrum between full human agency and full automation and develop varying levels of machine assistance along this spectrum that gradually increase the influence of machine predictions (a schematic sketch of such assistance levels follows this entry). We find that without showing predicted labels, explanations alone slightly improve human performance on the end task. In comparison, human performance is greatly improved by showing predicted labels (>20% relative improvement) and can be further improved by explicitly suggesting strong machine performance. Interestingly, when predicted labels are shown, explanations of machine predictions induce a level of accuracy similar to an explicit statement of strong machine performance. Our results demonstrate a tradeoff between human performance and human agency and show that explanations of machine predictions can moderate this tradeoff.
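The paper's exact interface conditions are not spelled out in this abstract; purely as a hypothetical sketch of how a spectrum of assistance levels might be encoded for such an experiment, consider the following. The condition names, fields, and rendering logic are invented for illustration and are not the authors' design.

```python
# Hypothetical sketch of assistance conditions along the spectrum between full
# human agency and full automation. Names, fields, and wording are invented;
# this is not the authors' experimental interface.
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class Assistance(Enum):
    NONE = auto()                 # full human agency: the text only
    EXPLANATION_ONLY = auto()     # highlighted words, no predicted label
    PREDICTED_LABEL = auto()      # show the model's predicted label
    LABEL_PLUS_ACCURACY = auto()  # label plus a statement of model accuracy


@dataclass
class Trial:
    text: str
    predicted_label: str
    explanation: List[str]  # e.g., words most influential for the prediction
    model_accuracy: float


def render(trial: Trial, condition: Assistance) -> str:
    """Build the information shown to a participant under a given condition."""
    parts = [trial.text]
    if condition is not Assistance.NONE:
        parts.append("Influential words: " + ", ".join(trial.explanation))
    if condition in (Assistance.PREDICTED_LABEL, Assistance.LABEL_PLUS_ACCURACY):
        parts.append(f"Model prediction: {trial.predicted_label}")
    if condition is Assistance.LABEL_PLUS_ACCURACY:
        parts.append(f"This model is correct about {trial.model_accuracy:.0%} of the time.")
    return "\n".join(parts)


example = Trial("I absolutely loved this hotel, best stay ever!",
                "deceptive", ["absolutely", "ever"], 0.87)
print(render(example, Assistance.LABEL_PLUS_ACCURACY))
```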
  5. Previous studies, in both psychology and linguistics, have shown that individuals with mental illnesses deviate from typical language use, and that these differences can be used to make predictions and serve as a diagnostic tool. Recent studies have shown that machine learning can be used to predict whether people have mental illnesses based on their writing. However, little attention has been paid to the interpretability of these machine learning models. In this talk we will describe our analysis of the machine learning models, the language patterns that distinguish individuals with mental illnesses from a control group, and the associated privacy concerns. We use a dataset of Tweets collected from users who reported a diagnosis of a mental illness on Twitter. Given the self-reported nature of the dataset, it is possible that some of these individuals are actively talking about their mental illness on social media, so we investigated whether the machine learning models are detecting explicit mentions of the mental illness or more complex language patterns. We then conducted a feature analysis by creating feature vectors from word unigrams, part-of-speech tags, and word clusters, and used feature-importance measures and statistical methods to identify important features (a toy sketch of this kind of analysis follows this entry). This analysis serves two purposes: to understand the machine learning model, and to discover language patterns that help identify people with mental illnesses. Finally, we conducted a qualitative analysis of the misclassifications to understand their potential causes.
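As a minimal, hypothetical sketch of the unigram portion of the feature analysis described above, the snippet below fits a bag-of-words logistic regression and inspects its largest coefficients as crude importance scores. The four-sentence corpus and labels are invented stand-ins for the Twitter data, and the part-of-speech, word-cluster, and statistical-testing components are omitted.

```python
# Minimal sketch of the unigram part of the feature analysis described above:
# a bag-of-words logistic regression whose coefficients serve as crude importance
# scores. The tiny corpus and labels are invented; POS tags, word clusters, and
# the statistical tests mentioned in the abstract are omitted.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = [
    "could not sleep again, everything feels heavy",
    "another late night, feeling exhausted and alone",
    "great run this morning, training for the race",
    "made pasta for dinner and watched a movie",
]
labels = [1, 1, 0, 0]  # 1 = self-reported diagnosis group, 0 = control (toy labels)

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
clf = LogisticRegression().fit(X, labels)

# Terms with the largest positive weights push predictions toward class 1; on real
# data these would be the candidates for qualitative follow-up and error analysis.
vocab = np.array(vectorizer.get_feature_names_out())
order = np.argsort(clf.coef_[0])[::-1]
for word, weight in zip(vocab[order][:5], clf.coef_[0][order][:5]):
    print(f"{word}: {weight:+.3f}")
```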