NSF PAR Search | NSF Public Access Repository

EnGRaiN: a supervised ensemble learning method for recovery of large-scale gene regulatory networks

https://doi.org/10.1093/bioinformatics/btab829

Aluru, Maneesha; Shrivastava, Harsh; Chockalingam, Sriram P.; Shivakumar, Shruti; Aluru, Srinivas; Martelli, Pier Luigi (ed.) (December 2021, Bioinformatics)

Abstract

Motivation: Reconstruction of genome-scale networks from gene expression data is an actively studied problem. A wide range of methods have been proposed that differ in the types of interactions they uncover, with varying trade-offs between sensitivity and specificity. To leverage the benefits of multiple such methods, ensemble network methods that combine predictions from the resulting networks have been developed, promising results better than or as good as the individual networks. Perhaps owing to the difficulty of obtaining accurate training examples, these ensemble methods have hitherto been unsupervised.

Results: In this article, we introduce EnGRaiN, the first supervised ensemble learning method to construct gene networks. The supervision for training is provided by small training datasets of true edge connections (positives) and edges known to be absent (negatives) among gene pairs. We demonstrate the effectiveness of EnGRaiN using simulated datasets as well as a curated collection of Arabidopsis thaliana datasets we created from microarray datasets available from public repositories. Compared with unsupervised methods for ensemble network construction, EnGRaiN not only shows better receiver operating characteristic (ROC) and precision-recall (PR) characteristics for both real and simulated datasets, but also generates networks that can be mined to elucidate complex biological interactions.

Availability and implementation: EnGRaiN software and the datasets used in the study are publicly available at the GitHub repository: https://github.com/AluruLab/EnGRaiN.

Supplementary information: Supplementary data are available at Bioinformatics online.
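The general idea of a supervised ensemble over network predictions can be illustrated with a minimal sketch. This is not the EnGRaiN implementation: it assumes, purely for illustration, that each gene pair is described by one score per base network method, and it swaps in a plain logistic regression trained by gradient descent as the supervised learner. The function names, the three hypothetical base methods, and all score values are made up.

```python
import math


def predict_edge_probability(weights, bias, scores):
    """Combine per-method edge scores into a single probability of a true edge."""
    z = bias + sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-z))


def train_edge_ensemble(features, labels, lr=0.5, epochs=2000):
    """Fit logistic regression weights by batch gradient descent.

    features: one score vector per labeled gene pair (one entry per base method).
    labels:   1 for a known true edge (positive), 0 for a known absent edge (negative).
    """
    n_methods = len(features[0])
    n_pairs = len(features)
    weights = [0.0] * n_methods
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_methods
        grad_b = 0.0
        for scores, label in zip(features, labels):
            err = predict_edge_probability(weights, bias, scores) - label
            for j in range(n_methods):
                grad_w[j] += err * scores[j]
            grad_b += err
        weights = [w - lr * g / n_pairs for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_pairs
    return weights, bias


# Hypothetical scores from three base network methods for six labeled gene pairs.
train_scores = [
    [0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [0.7, 0.6, 0.9],  # known edges (positives)
    [0.1, 0.2, 0.3], [0.2, 0.1, 0.1], [0.3, 0.2, 0.2],  # known non-edges (negatives)
]
train_labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_edge_ensemble(train_scores, train_labels)

# Score two unlabeled gene pairs with the trained ensemble.
p_high = predict_edge_probability(weights, bias, [0.85, 0.75, 0.8])
p_low = predict_edge_probability(weights, bias, [0.15, 0.1, 0.2])
print(p_high > 0.5, p_low < 0.5)
```

The small labeled set plays the role of the positives and negatives described in the abstract; an unsupervised ensemble would instead combine the base scores with a fixed rule such as averaging ranks.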
GRNUlar: A Deep Learning Framework for Recovering Single-Cell Gene Regulatory Networks

https://doi.org/10.1089/cmb.2021.0437

Shrivastava, Harsh; Zhang, Xiuwei; Song, Le; Aluru, Srinivas (January 2022, Journal of Computational Biology)

Molecule optimization by explainable evolution

Chen, Binghong; Wang, Tianzhe; Li, Chengtao; Dai, Hanjun; Song, Le (January 2021, International Conference on Learning Representations (ICLR))

Optimizing molecules for desired properties is a fundamental yet challenging task in chemistry, materials science, and drug discovery. This paper develops a novel algorithm for optimizing molecular properties via an Expectation-Maximization (EM)-like explainable evolutionary process. The algorithm is designed to mimic human experts in the process of searching for desirable molecules and alternates between two stages: the first stage is an explainable local search that identifies rationales, i.e., critical subgraph patterns accounting for desired molecular properties, and the second stage is molecule completion, which explores the larger space of molecules containing good rationales. We test our approach against various baselines on a real-world multi-property optimization task where each method is given the same number of queries to the property oracle. We show that our evolution-by-explanation algorithm is 79% better than the best baseline in terms of a generic metric combining aspects such as success rate, novelty, and diversity. Human expert evaluation of the optimized molecules shows that 60% of the top molecules obtained from our methods are deemed successful.
GLAD: Learning sparse graph recovery

Shrivastava, Harsh; Chen, Xinshi; Chen, Binghong; Lan, Guanghui; Aluru, Srinivas; Liu, Han; Song, Le (January 2020, International Conference on Learning Representations (ICLR))

Recovering sparse conditional independence graphs from data is a fundamental problem in machine learning with wide applications. A popular formulation of the problem is L1-regularized maximum likelihood estimation. Many convex optimization algorithms have been designed to solve this formulation and recover the graph structure. Recently, there has been a surge of interest in learning algorithms directly from data, in this case learning to map the empirical covariance to the sparse precision matrix. However, this is a challenging task, since the symmetric positive definiteness (SPD) and sparsity of the matrix are not easy to enforce in learned algorithms, and a direct mapping from data to the precision matrix may contain many parameters. We propose a deep learning architecture, GLAD, which uses an Alternating Minimization (AM) algorithm as the model's inductive bias and learns the model parameters via supervised learning. We show that GLAD learns a very compact and effective model for recovering sparse graphs from data.
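For reference, the L1-regularized maximum likelihood formulation mentioned in this abstract is commonly written as the graphical lasso objective below. The symbols are assumptions, since the abstract does not define them: \hat{\Sigma} denotes the empirical covariance matrix, \Theta the precision matrix being estimated, and \rho > 0 the regularization weight. Whether the penalty includes the diagonal entries of \Theta varies between presentations.

```latex
\hat{\Theta} \;=\; \arg\min_{\Theta \succ 0} \;
  -\log\det\Theta \;+\; \operatorname{tr}\!\big(\hat{\Sigma}\,\Theta\big)
  \;+\; \rho \,\lVert \Theta \rVert_{1}
```

The constraint \Theta \succ 0 is the symmetric positive definiteness requirement, and the L1 term is what drives entries of \Theta to zero, so the nonzero off-diagonal pattern of \hat{\Theta} gives the recovered graph.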
