Abstract Quantitative structure–activity relationship (QSAR) modeling has become a critical tool in drug design. Recently proposed Topological Regression (TR), a computationally efficient and highly interpretable QSAR model that maps distances in the chemical domain to distances in the activity domain, has shown predictive performance comparable to state-of-the-art deep learning-based models. However, TR’s dependence on simple random sampling-based anchor selection and utilization of radial basis function for response reconstruction constrain its interpretability and predictive capacity. To address these limitations, we propose Adaptive Topological Regression (AdapToR) with adaptive anchor selection and optimization-based reconstruction. We evaluated AdapToR on the NCI60 GI50 dataset, which consists of over 50,000 drug responses across 60 human cancer cell lines, and compared its performance to Transformer CNN, Graph Transformer, TR, and other baseline models. The results demonstrate that AdapToR outperforms competing QSAR models for drug response prediction with significantly lower computational cost and greater interpretability as compared to deep learning-based models.
more »
« less
Adaptive wavelet distillation from neural networks through interpretations
Recent deep-learning models have achieved impressive prediction performance, but often sacrifice interpretability and computational efficiency. Interpretability is crucial in many disciplines, such as science and medicine, where models must be carefully vetted or where interpretation is the goal itself. Moreover, interpretable models are concise and often yield computational efficiency. Here, we propose adaptive wavelet distillation (AWD), a method which aims to distill information from a trained neural network into a wavelet transform. Specifically, AWD penalizes feature attributions of a neural network in the wavelet domain to learn an effective multi-resolution wavelet transform. The resulting model is highly predictive, concise, computationally efficient, and has properties (such as a multi-scale structure) which make it easy to interpret. In close collaboration with domain experts, we showcase how AWD addresses challenges in two real-world settings: cosmological parameter inference and molecular-partner prediction. In both cases, AWD yields a scientifically interpretable and concise model which gives predictive performance better than state-of-the-art neural networks. Moreover, AWD identifies predictive features that are scientifically meaningful in the context of respective domains. All code and models are released in a full-fledged package available on Github.
more »
« less
- PAR ID:
- 10343658
- Date Published:
- Journal Name:
- Advances in neural information processing systems
- ISSN:
- 1049-5258
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Electronic health records (EHRs) have been heavily used in modern healthcare systems for recording patients' admission information to health facilities. Many data-driven approaches employ temporal features in EHR for predicting specific diseases, readmission times, and diagnoses of patients. However, most existing predictive models cannot fully utilize EHR data, due to an inherent lack of labels in supervised training for some temporal events. Moreover, it is hard for the existing methods to simultaneously provide generic and personalized interpretability. To address these challenges, we propose Sherbet, a self-supervised graph learning framework with hyperbolic embeddings for temporal health event prediction. We first propose a hyperbolic embedding method with information flow to pretrain medical code representations in a hierarchical structure. We incorporate these pretrained representations into a graph neural network (GNN) to detect disease complications and design a multilevel attention method to compute the contributions of particular diseases and admissions, thus enhancing personalized interpretability. We present a new hierarchy-enhanced historical prediction proxy task in our self-supervised learning framework to fully utilize EHR data and exploit medical domain knowledge. We conduct a comprehensive set of experiments on widely used publicly available EHR datasets to verify the effectiveness of our model. Our results demonstrate the proposed model's strengths in both predictive tasks and interpretable abilities.more » « less
-
Machine learning methods, particularly neural networks trained on large datasets, are transforming how scientists approach scientific discovery and experimental design. However, current state-of-the-art neural networks are limited by their uninterpretability: Despite their excellent accuracy, they cannot describe how they arrived at their predictions. Here, using an “interpretable-by-design” approach, we present a neural network model that provides insights into RNA splicing, a fundamental process in the transfer of genomic information into functional biochemical products. Although we designed our model to emphasize interpretability, its predictive accuracy is on par with state-of-the-art models. To demonstrate the model’s interpretability, we introduce a visualization that, for any given exon, allows us to trace and quantify the entire decision process from input sequence to output splicing prediction. Importantly, the model revealed uncharacterized components of the splicing logic, which we experimentally validated. This study highlights how interpretable machine learning can advance scientific discovery.more » « less
-
Knowledge tracing (KT) refers to the problem of predicting future learner performance given their past performance in educational applications. Recent developments in KT using flexible deep neural network-based models excel at this task. However, these models often offer limited interpretability, thus making them insufficient for personalized learning, which requires using interpretable feedback and actionable recommendations to help learners achieve better learning outcomes. In this paper, we propose attentive knowledge tracing (AKT), which couples flexible attention-based neural network models with a series of novel, interpretable model components inspired by cognitive and psychometric models. AKT uses a novel monotonic attention mechanism that relates a learner’s future responses to assessment questions to their past responses; attention weights are computed using exponential decay and a context-aware relative distance measure, in addition to the similarity between questions. Moreover, we use the Rasch model to regularize the concept and question embeddings; these embeddings are able to capture individual differences among questions on the same concept without using an excessive number of parameters. We conduct experiments on several real-world benchmark datasets and show that AKT outperforms existing KT methods (by up to 6% in AUC in some cases) on predicting future learner responses. We also conduct several case studies and show that AKT exhibits excellent interpretability and thus has potential for automated feedback and personalization in real-world educational settings.more » « less
-
Modern data acquisition routinely produce massive amounts of event sequence data in various domains, such as social media, healthcare, and financial markets. These data often ex- hibit complicated short-term and long-term temporal dependencies. However, most of the ex- isting recurrent neural network-based point process models fail to capture such dependencies, and yield unreliable prediction performance. To address this issue, we propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long- term dependencies and meanwhile enjoys computational efficiency. Numerical experiments on various datasets show that THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin. Moreover, THP is quite general and can incorpo- rate additional structural knowledge. We provide a concrete example, where THP achieves im- proved prediction performance for learning multiple point processes when incorporating their relational information.more » « less
An official website of the United States government

