

Search for: All records

Award ID contains: 1841520


  1. Abstract

    Motivation

    Expanding our knowledge of small molecules beyond what is known in nature or designed in wet laboratories promises to significantly advance cheminformatics, drug discovery, biotechnology and material science. In silico molecular design remains challenging, primarily due to the complexity of the chemical space and the non-trivial relationship between chemical structures and biological properties. Deep generative models that learn directly from data are intriguing, but they have yet to demonstrate the interpretability in the learned representation that would let us learn more about the relationship between the chemical and biological spaces. In this article, we advance research on disentangled representation learning for small molecule generation. We build on recent work by us and others on deep graph generative frameworks, which capture atomic interactions via a graph-based representation of a small molecule. The methodological novelty lies in how we leverage the concept of disentanglement in the graph variational autoencoder framework both to generate biologically relevant small molecules and to enhance model interpretability.

    Results

    Extensive qualitative and quantitative experimental evaluation in comparison with state-of-the-art models demonstrates the superiority of our disentanglement framework. We believe this work is an important step toward addressing key challenges in small molecule generation with deep generative frameworks.

    Availability and implementation

    Training and generated data are made available at https://ieee-dataport.org/documents/dataset-disentangled-representation-learning-interpretable-molecule-generation. All code is made available at https://anonymous.4open.science/r/D-MolVAE-2799/.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     
  2. Abstract

    Previous research has noted that many factors greatly influence the spread of COVID‐19. Unlike explicit, measurable factors such as population density, number of medical staff, and the daily test rate, many factors are not directly observable (for instance, cultural differences and attitudes toward the disease) and may introduce unobserved heterogeneity. Most contemporary COVID‐19 related research has focused on modeling the relationship between explicitly measurable factors and the response variable of interest (such as the infection rate or the death rate). The infection rate is a commonly used metric for evaluating disease progression and a state's mitigation efforts. Because unobservable sources of heterogeneity cannot be measured directly, it is hard to incorporate them into the quantitative assessment and decision‐making process. In this study, we propose new metrics to study a state's performance by adjusting for measurable county‐level covariates and unobservable state‐level heterogeneity through random effects. A hierarchical linear model (HLM) is postulated, and we calculate two model‐based metrics: the standardized infection ratio (SDIR) and the adjusted infection rate (AIR). This analysis highlights certain time periods when the infection rate for a state was high while its SDIR was low, and vice versa. We show that trends in these metrics can give insight into certain aspects of a state's performance. As each state continues to develop its individualized COVID‐19 mitigation strategy and ultimately works to improve its performance, the SDIR and AIR may help supplement the crude infection rate metric to provide a more thorough understanding of a state's performance.

     
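As a reading aid for the abstract above, a hierarchical linear model with county-level covariates and a state-level random effect is often written in the generic random-intercept form below. This is a textbook sketch, not the study's specification: the symbols $y_{ij}$ (response for county $i$ in state $j$), $\mathbf{x}_{ij}$, $\boldsymbol{\beta}$, $u_j$, $\tau^2$ and $\sigma^2$ are assumptions introduced here, and the abstract does not give the exact model or the formulas for the SDIR and AIR.

```latex
% Generic random-intercept HLM (illustrative notation, not taken from the paper)
\begin{aligned}
y_{ij} &= \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + \varepsilon_{ij}, \\
u_j &\sim \mathcal{N}(0, \tau^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

In this form the fixed effects $\boldsymbol{\beta}$ adjust for the measurable county-level covariates, while the random intercept $u_j$ absorbs state-level heterogeneity that cannot be observed directly.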
  3. Predicting Rapid Intensification (RI) in Tropical Cyclone (TC) development remains one of the most difficult tasks in weather forecasting. In addition to dynamical numerical simulations, the techniques commonly used to analyze and predict RI (as well as TC intensity changes) are composite analysis and statistical models based on features derived from composite analysis. Domain scientists have accumulated a large number of such selected and pre-determined features related to TC intensity change and RI, such as those in the widely used SHIPS (Statistical Hurricane Intensity Prediction Scheme) database. Moreover, new features are still being added with new algorithms and/or newly available datasets. However, there are very few unified frameworks for systematically distilling features from a comprehensive data source. One such unified Artificial Intelligence (AI) system was developed for deriving features from TC centers, and here we expand that system to large-scale environmental conditions. In this study, we applied a deep learning algorithm, the Convolutional Neural Network (CNN), to the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim reanalysis data and identified and refined potentially new features relevant to RI, such as specific humidity to the east or northeast of the TC center, vorticity and horizontal wind to the north and south of the TC center, and ozone at high altitudes. These features, derived from the deep learning network (named TCNET in this study), could help in predicting and understanding the occurrence of RI. By combining the newly derived features with the features from the SHIPS database, the RI prediction performance can be improved by 43%, 23%, and 30% in terms of Kappa, probability of detection (POD), and false alarm rate (FAR), respectively, against the same modern classification model with the SHIPS inputs only.
    Free, publicly-accessible full text available February 1, 2024
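The three scores named in the abstract above (Kappa, POD, FAR) can be computed from a 2x2 contingency table of RI forecasts against observations. The following is a minimal sketch using common textbook definitions; it assumes FAR means the false alarm ratio, false alarms / (hits + false alarms), which may differ from the definition used in the study. The function name `ri_scores` and the example counts are hypothetical and are not data from the paper.

```python
def ri_scores(hits, misses, false_alarms, correct_negatives):
    """Return Cohen's kappa, POD and FAR for a binary RI forecast.

    Definitions are common textbook ones (an assumption, not taken
    from the paper):
      POD   = hits / (hits + misses)
      FAR   = false_alarms / (hits + false_alarms)   # false alarm ratio
      kappa = (p_obs - p_exp) / (1 - p_exp)
    """
    n = hits + misses + false_alarms + correct_negatives
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    # Observed agreement between forecast and observation.
    p_obs = (hits + correct_negatives) / n
    # Agreement expected by chance, from the table's marginal totals.
    p_exp = (
        (hits + misses) * (hits + false_alarms)
        + (correct_negatives + false_alarms) * (correct_negatives + misses)
    ) / n**2
    kappa = (p_obs - p_exp) / (1 - p_exp)
    return {"kappa": kappa, "pod": pod, "far": far}


# Hypothetical counts for illustration only.
scores = ri_scores(hits=20, misses=30, false_alarms=10, correct_negatives=940)
```

Because RI events are rare, a chance-corrected score such as Kappa is more informative than raw accuracy, which is dominated by the many correctly forecast non-RI cases.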
  4. Free, publicly-accessible full text available January 1, 2024
  5. Floods are often associated with hurricanes making landfall. When tropical cyclones/hurricanes make landfall, they are usually accompanied by heavy rainfall and storm surges that inundate coastal areas. The worst natural disaster in the United States, in terms of loss of life and property damage, was caused by hurricane storm surges and their associated coastal flooding. To monitor coastal flooding in the areas affected by hurricanes, we used data from sensors aboard the operational Polar-orbiting and Geostationary Operational Environmental Satellites. This study aims to apply a downscaling model to recent severe coastal flooding events caused by hurricanes. To demonstrate how high-resolution 3D flood mapping can be made from moderate-resolution operational satellite observations, the downscaling model was applied to the catastrophic coastal flooding in Florida due to Hurricane Ian and in New Orleans due to Hurricanes Ida and Laura. The floodwater fraction data derived from the SNPP/NOAA-20 VIIRS (Visible Infrared Imaging Radiometer Suite) observations at the original 375 m resolution were input into the downscaling model to obtain 3D flooding information at 30 m resolution, including flooding extent, water surface level and water depth. Compared to a 2D flood extent map at the VIIRS' original 375 m resolution, the downscaled 30 m floodwater depth maps, even when shown as 2D images, can provide more details about floodwater distribution, while 3D visualizations can show floodwater depth more clearly relative to the terrain and provide a more direct perception of the inundation caused by hurricanes. The use of 3D visualization can help users clearly see floodwaters occurring over various types of terrain, and thus distinguish hazardous floods from non-hazardous ones. Furthermore, 3D maps displaying floodwater depth may provide additional information for rescue efforts and damage assessments. The downscaling model can help enhance the capabilities of moderate-to-coarse resolution sensors, such as those used in operational weather satellites, for flood detection and monitoring.
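To illustrate how a coarse floodwater fraction can yield flood extent, water surface level and water depth at finer resolution, here is a minimal sketch of one plausible approach. It is an assumption for illustration and not the downscaling model described in the abstract: within one coarse pixel, the lowest-lying fine cells of an elevation grid are flooded until the flooded share matches the observed water fraction. The function name `downscale_pixel` and the sample elevations are hypothetical.

```python
def downscale_pixel(water_fraction, elevations):
    """Distribute one coarse pixel's water fraction over its fine cells.

    water_fraction: share of the coarse pixel covered by water (0..1).
    elevations: ground elevations (m) of the fine cells in that pixel.
    Returns (water_surface_level, depths); the level is None when dry.
    Illustrative assumption: water fills the lowest cells first.
    """
    n_wet = round(water_fraction * len(elevations))
    if n_wet == 0:
        return None, [0.0] * len(elevations)
    # The water surface sits at the elevation of the highest wet cell.
    level = sorted(elevations)[n_wet - 1]
    # Depth is the surface level minus ground elevation, floored at zero.
    depths = [max(level - z, 0.0) for z in elevations]
    return level, depths


# Hypothetical pixel with four fine cells, half of it covered by water.
level, depths = downscale_pixel(0.5, [1.0, 2.0, 3.0, 4.0])
```

In this sketch, cells at or below the returned level form the flood extent, so the cell that sets the level is counted as wet with zero depth.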
  6. The COVID-19 pandemic has been sweeping across the United States of America since early 2020. The whole world was waiting for vaccination to end this pandemic. Since the approval of the first vaccine by the U.S. CDC on 9 November 2020, nearly 67.5% of the US population had been fully vaccinated by 10 July 2022. While vaccination was quite successful in controlling the spread of COVID-19, there were voices against vaccines. Therefore, this research utilizes geo-tweets and a Bayesian-based method to investigate public opinions towards vaccines based on (1) the spatiotemporal changes in public engagement and public sentiment; (2) how public engagement and sentiment react to different vaccine-related topics; and (3) how different racial groups behave. We connected the phenomena observed to real-time and historical events. We found that, in general, the public is positive towards COVID-19 vaccines. Public sentiment positivity went up as more people were vaccinated. Public sentiment on specific topics varied across periods. African Americans' sentiment toward vaccines was relatively lower than that of other racial groups.