NSF PAR Search | NSF Public Access Repository


Small molecule generation via disentangled representation learning

https://doi.org/10.1093/bioinformatics/btac296

Du, Yuanqi; Guo, Xiaojie; Wang, Yinkai; Shehu, Amarda; Zhao, Liang; Xu, Jinbo (ed.) (May 2022, Bioinformatics)

Motivation: Expanding our knowledge of small molecules beyond what is known in nature or designed in wet laboratories promises to significantly advance cheminformatics, drug discovery, biotechnology and material science. In silico molecular design remains challenging, primarily due to the complexity of the chemical space and the non-trivial relationship between chemical structures and biological properties. Deep generative models that learn directly from data are intriguing, but they have yet to demonstrate interpretability in the learned representation, which would let us learn more about the relationship between the chemical and biological space. In this article, we advance research on disentangled representation learning for small molecule generation. We build on recent work by us and others on deep graph generative frameworks, which capture atomic interactions via a graph-based representation of a small molecule. The methodological novelty is how we leverage the concept of disentanglement in the graph variational autoencoder framework both to generate biologically relevant small molecules and to enhance model interpretability.
Results: Extensive qualitative and quantitative experimental evaluation in comparison with state-of-the-art models demonstrates the superiority of our disentanglement framework. We believe this work is an important step to address key challenges in small molecule generation with deep generative frameworks.
Availability and implementation: Training and generated data are made available at https://ieee-dataport.org/documents/dataset-disentangled-representation-learning-interpretable-molecule-generation. All code is made available at https://anonymous.4open.science/r/D-MolVAE-2799/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Functional Connectivity Prediction With Deep Learning for Graph Transformation

https://doi.org/10.1109/TNNLS.2022.3197337

Etemadyrad, Negar; Gao, Yuyang; Li, Qingzhe; Guo, Xiaojie; Krueger, Frank; Lin, Qixiang; Qiu, Deqiang; Zhao, Liang (October 2022, IEEE Transactions on Neural Networks and Learning Systems)

Full Text Available
Saliency-Regularized Deep Multi-Task Learning

https://doi.org/10.1145/3534678.3539442

Bai, Guangji; Zhao, Liang (August 2022, Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery & Data Mining)

Full Text Available
Source Localization of Graph Diffusion via Variational Autoencoders for Graph Inverse Problems

https://doi.org/10.1145/3534678.3539288

Ling, Chen; Jiang, Junji; Wang, Junxiang; Zhao, Liang (August 2022, Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery & Data Mining)

Full Text Available
Online and Distributed Robust Regressions with Extremely Noisy Labels

https://doi.org/10.1145/3473038

Lei, Shuo; Zhang, Xuchao; Zhao, Liang; Boedihardjo, Arnold P.; Lu, Chang-Tien (June 2022, ACM Transactions on Knowledge Discovery from Data)

In today’s era of big data, robust least-squares regression becomes more challenging given extremely corrupted labels and the explosive growth of datasets. Traditional robust methods can handle the noise but suffer from several challenges when applied to huge datasets, including (1) computational infeasibility of handling an entire dataset at once, (2) existence of heterogeneously distributed corruption, and (3) difficulty in corruption estimation when data cannot be entirely loaded. This article proposes online and distributed robust regression approaches, both of which can concurrently address all the above challenges. Specifically, the distributed algorithm optimizes the regression coefficients of each data block via heuristic hard thresholding and combines all the estimates in a distributed robust consolidation. In addition, an online version of the distributed algorithm is proposed to incrementally update the existing estimates with new incoming data. Furthermore, a novel online robust regression method is proposed to estimate under biased-batch corruption. We also prove that our algorithms benefit from strong robustness guarantees in terms of regression coefficient recovery with a constant upper bound on the error of state-of-the-art batch methods. Extensive experiments on synthetic and real datasets demonstrate that our approaches are superior to existing methods in effectiveness, with competitive efficiency.
Full Text Available
Accelerated Gradient-free Neural Network Training by Multi-convex Alternating Optimization

https://doi.org/10.1016/j.neucom.2022.02.039

Wang, Junxiang; Li, Hongyi; Zhao, Liang (May 2022, Neurocomputing)

Full Text Available
Spatio-Temporal Event Forecasting Using Incremental Multi-Source Feature Learning

https://doi.org/10.1145/3464976

Zhao, Liang; Gao, Yuyang; Ye, Jieping; Chen, Feng; Ye, Yanfang; Lu, Chang-Tien; Ramakrishnan, Naren (April 2022, ACM Transactions on Knowledge Discovery from Data)

The forecasting of significant societal events such as civil unrest and economic crisis is an interesting and challenging problem that requires timeliness, precision, and comprehensiveness. Significant societal events are influenced and indicated jointly by multiple aspects of a society, including its economics, politics, and culture. Traditional forecasting methods based on a single data source find it hard to cover all these aspects comprehensively, thus limiting model performance. Multi-source event forecasting has proven promising but still suffers from several challenges, including (1) geographical hierarchies in multi-source data features, (2) hierarchical missing values, (3) characterization of structured feature sparsity, and (4) difficulty in updating the model online with incomplete multiple sources. This article proposes a novel feature learning model that concurrently addresses all the above challenges. Specifically, given multi-source data from different geographical levels, we design a new forecasting model by characterizing the lower-level features’ dependence on higher-level features. To handle the correlations among structured feature sets and deal with missing values among the coupled features, we propose a novel feature learning model based on an Nth-order strong hierarchy and fused-overlapping group Lasso. An efficient algorithm is developed to optimize model parameters and ensure global optima. More importantly, to enable real-time model updates, an online learning algorithm is formulated and active set techniques are leveraged to resolve the crucial challenge that arises when new patterns of missing features appear in real time. Extensive experiments on 10 datasets in different domains demonstrate the effectiveness and efficiency of the proposed models.
Full Text Available
Time series clustering in linear time complexity

https://doi.org/10.1007/s10618-021-00798-w

Li, Xiaosheng; Lin, Jessica; Zhao, Liang (November 2021, Data Mining and Knowledge Discovery)

Full Text Available
Deep Generative Models for Spatial Networks

https://doi.org/10.1145/3447548.3467394

Guo, Xiaojie; Du, Yuanqi; Zhao, Liang (August 2021, Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining)

Full Text Available
Deep graph transformation for attributed, directed, and signed networks

https://doi.org/10.1007/s10115-021-01553-9

Guo, Xiaojie; Zhao, Liang; Homayoun, Houman; Dinakarrao, Sai Manoj (June 2021, Knowledge and Information Systems)
Full Text Available
