

Search for: All records

Award ID contains: 2113350


  1. Abstract

    Motivation

    Expanding our knowledge of small molecules beyond what is known in nature or designed in wet laboratories promises to significantly advance cheminformatics, drug discovery, biotechnology and materials science. In silico molecular design remains challenging, primarily due to the complexity of the chemical space and the non-trivial relationship between chemical structures and biological properties. Deep generative models that learn directly from data are intriguing, but they have yet to demonstrate interpretability in the learned representation, which would let us learn more about the relationship between the chemical and biological space. In this article, we advance research on disentangled representation learning for small molecule generation. We build on recent work by us and others on deep graph generative frameworks, which capture atomic interactions via a graph-based representation of a small molecule. The methodological novelty lies in how we leverage the concept of disentanglement in the graph variational autoencoder framework both to generate biologically relevant small molecules and to enhance model interpretability.

    Results

    Extensive qualitative and quantitative experimental evaluation in comparison with state-of-the-art models demonstrates the superiority of our disentanglement framework. We believe this work is an important step toward addressing key challenges in small molecule generation with deep generative frameworks.

    Availability and implementation

    Training and generated data are made available at https://ieee-dataport.org/documents/dataset-disentangled-representation-learning-interpretable-molecule-generation. All code is made available at https://anonymous.4open.science/r/D-MolVAE-2799/.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     
  2. Local explanation provides heatmaps on images to explain how Convolutional Neural Networks (CNNs) derive their output. Due to its visual straightforwardness, the method has been one of the most popular explainable AI (XAI) methods for diagnosing CNNs. Through our formative study (S1), however, we captured ML engineers' ambivalent perspective on local explanation: it is a valuable and indispensable aid in building CNNs, yet also a process that exhausts them because detecting vulnerabilities is heuristic in nature. Moreover, steering CNNs based on the vulnerabilities learned from the diagnosis seemed highly challenging. To close this gap, we designed DeepFuse, the first interactive design that realizes a direct feedback loop between a user and CNNs for diagnosing and revising a CNN's vulnerabilities using local explanations. DeepFuse helps CNN engineers systematically search for unreasonable local explanations and annotate new boundaries for those identified as unreasonable in a labor-efficient manner. Next, it steers the model based on the given annotations so that the model does not repeat similar mistakes. We conducted a two-day study (S2) with 12 experienced CNN engineers. Using DeepFuse, participants made a more accurate and reasonable model than the current state of the art. Participants also found that the way DeepFuse guides case-based reasoning can practically improve their current practice. We provide implications for design that explain how future HCI-driven design can move our practice forward to make XAI-driven insights more actionable.

     
    Free, publicly-accessible full text available September 28, 2024
  3. Free, publicly-accessible full text available September 28, 2024
  4. Free, publicly-accessible full text available July 1, 2024
  5. Spatial prediction aims to predict the values of a target variable, such as PM2.5 concentration or temperature, at arbitrary locations based on collected geospatial data. It strongly affects key research topics in geoscience because it supplies heterogeneous spatial information (e.g., soil conditions, precipitation rates, wheat yields) for geographic modeling and decision-making at local, regional, and global scales. In-situ data, collected by ground-level sensors, and remote sensing data, collected by satellite or aircraft, are two important data sources for this task. In-situ data are relatively accurate but sparse and unevenly distributed. Remote sensing data cover large spatial areas but are coarse, with low spatiotemporal resolution, and prone to interference. First, how to synergize the complementary strengths of these two data types is still a grand challenge. Second, it is difficult to model the unknown spatial predictive mapping while handling the trade-off between spatial autocorrelation and heterogeneity. Third, representing spatial relations without substantial information loss is also a critical issue. To address these challenges, we propose a novel Heterogeneous Self-supervised Spatial Prediction (HSSP) framework that synergizes multi-source data by minimizing the inconsistency between in-situ and remote sensing observations. We propose a new deep geometric spatial interpolation model as the prediction backbone, which automatically interpolates the values of the target variable at unknown locations from existing observations by taking into account both distance and orientation information. Our proposed interpolator is proven both to be the general form of popular interpolation methods and to preserve spatial information. The spatial prediction is enhanced by a novel error-compensation framework that captures the prediction inconsistency due to spatial heterogeneity. Extensive experiments on real-world datasets demonstrate our model's superior performance over state-of-the-art models.
    Free, publicly-accessible full text available June 1, 2024
  6. Free, publicly-accessible full text available May 16, 2024
  7. Free, publicly-accessible full text available May 9, 2024
  8. Improving the performance and explanations of ML algorithms is a priority for adoption by humans in the real world. In critical domains such as healthcare, such technology has significant potential to reduce the burden on humans and considerably reduce manual assessments by providing quality assistance at scale. In today's data-driven world, artificial intelligence (AI) systems still experience issues with bias, explainability, and human-like reasoning and interpretability. Causal AI is a technique that can reason and make human-like choices, which makes it possible to go beyond narrow machine-learning-based techniques and to integrate it into human decision-making. It also offers intrinsic explainability, adaptability to new domains, and bias-free predictions, and it works with datasets of all sizes. In this lecture-style tutorial, we detail how a richer representation of causality in AI systems, using a knowledge graph (KG) based approach, is needed for intervention and counterfactual reasoning (Figure 1), how to achieve model-based and domain explainability, and how causal representations help in web and health care.
    Free, publicly-accessible full text available April 30, 2024
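
For readers unfamiliar with the interpolation setting described in record 5, the following is a minimal sketch of inverse distance weighting (IDW), one widely used classical interpolator. It is an illustration only and is not the HSSP model: the abstract does not say which methods its interpolator generalizes, and this sketch uses distance alone, without the orientation information the abstract mentions. The function name `idw_interpolate`, the `power` parameter, and the sample sensor readings are all hypothetical.

```python
import math


def idw_interpolate(known_points, query, power=2.0):
    """Estimate a value at `query` by inverse distance weighting.

    known_points: list of ((x, y), value) observations (hypothetical format).
    query: (x, y) location at which to estimate the value.
    power: exponent applied to distance; larger values favor nearby points.
    """
    if not known_points:
        raise ValueError("at least one observation is required")

    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in known_points:
        dist = math.hypot(query[0] - x, query[1] - y)
        if dist == 0.0:
            # The query coincides with an observation: return it directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two made-up sensor readings; the midpoint is equidistant from both,
# so the estimate is their plain average.
sensors = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
midpoint_estimate = idw_interpolate(sensors, (1.0, 0.0))
```

Because the weights depend only on distance, two observations at the same distance contribute equally regardless of direction, which is the kind of limitation that orientation-aware interpolation is meant to address.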