Title: AutoMeKin2021: An open‐source program for automated reaction discovery
Abstract

AutoMeKin2021 is an updated version of tsscds2018, a program for the automated discovery of reaction mechanisms (J. Comput. Chem. 2018, 39, 1922). This release features a number of new capabilities: rare‐event molecular dynamics simulations to enhance reaction discovery, extension of the original search algorithm to study van der Waals complexes, use of chemical knowledge, a new search algorithm based on bond‐order time series analysis, statistics of the chemical reaction networks, a web application to submit jobs, and other features. The source code, manual, installation instructions and the website link are available at: https://rxnkin.usc.es/index.php/AutoMeKin

 
Award ID(s):
2018427 1763652
NSF-PAR ID:
10449867
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Journal of Computational Chemistry
Volume:
42
Issue:
28
ISSN:
0192-8651
Page Range / eLocation ID:
p. 2036-2048
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Motivation: Cancer is the process of accumulating genetic alterations that confer selective advantages to tumor cells. The order in which aberrations occur is not arbitrary, and inferring the order of events is challenging due to the lack of longitudinal samples from tumors. Moreover, a network model of oncogenesis should capture biological facts such as distinct progression trajectories of cancer subtypes and patterns of mutual exclusivity of alterations in the same pathways.

    In this paper, we present the disjunctive Bayesian network (DBN), a novel oncogenetic model with a phylogenetic interpretation. DBN is expressive enough to capture cancer subtypes' trajectories and mutually exclusive relations between alterations from unstratified data.

    Results: In cases where the number of studied alterations is small (), we provide an efficient dynamic programming implementation of an exact structure learning method that finds a best DBN in the superexponential search space of networks. In the rare cases where the number of alterations is large, we provide an efficient genetic algorithm in our software package, OncoBN. Through numerous synthetic and real data experiments, we show OncoBN's ability to infer ground truth networks and recover biologically meaningful progression networks.

    Availability: OncoBN is implemented in R and is available at https://github.com/phillipnicol/OncoBN.

     
  2. Abstract

    In pursuit of scientific discovery, vast collections of unstructured structural and functional images are acquired; however, only an infinitesimally small fraction of this data is rigorously analyzed, with an even smaller fraction ever being published. One method to accelerate scientific discovery is to extract more insight from costly scientific experiments already conducted. Unfortunately, data from scientific experiments tend only to be accessible by the originator who knows the experiments and directives. Moreover, there are no robust methods to search unstructured databases of images to deduce correlations and insight. Here, we develop a machine learning approach to create image similarity projections to search unstructured image databases. To improve these projections, we develop and train a model to include symmetry-aware features. As an exemplar, we use a set of 25,133 piezoresponse force microscopy images collected on diverse materials systems over five years. We demonstrate how this tool can be used for interactive recursive image searching and exploration, highlighting structural similarities at various length scales. This tool justifies continued investment in federated scientific databases with standardized metadata schemas where the combination of filtering and recursive interactive searching can uncover synthesis-structure-property relations. We provide a customizable open-source package (https://github.com/m3-learning/Recursive_Symmetry_Aware_Materials_Microstructure_Explorer) of this interactive tool for researchers to use with their data.

     
  3. Abstract

    One of the Grand Challenges in Science is the construction of the Tree of Life, an evolutionary tree containing several million species, spanning all life on earth. However, the construction of the Tree of Life is enormously computationally challenging, as all the current most accurate methods are either heuristics for NP-hard optimization problems or Bayesian MCMC methods that sample from tree space. One of the most promising approaches for improving scalability and accuracy for phylogeny estimation uses divide-and-conquer: a set of species is divided into overlapping subsets, trees are constructed on the subsets, and then merged together using a “supertree method”. Here, we present Exact-RFS-2, the first polynomial-time algorithm to find an optimal supertree of two trees, using the Robinson-Foulds Supertree (RFS) criterion (a major approach in supertree estimation that is related to maximum likelihood supertrees), and we prove that finding the RFS of three input trees is NP-hard. Exact-RFS-2 is available in open source form on GitHub at https://github.com/yuxilin51/GreedyRFS.

     
  4. Premise

    Morphometric analysis is a common approach for comparing and categorizing botanical samples; however, completing a suite of analyses using existing tools may require a multi‐stage, multi‐program process. To facilitate streamlined analysis within a single program, Morphological Analysis of Size and Shape (MASS) for leaves was developed. Its utility is demonstrated using exemplar leaf samples from Acer saccharum, Malus domestica, and Lithospermum.

    Methods

    Exemplar samples were obtained from across a single tree (Acer saccharum), three trees in the same species (Malus domestica), and online, digitized herbarium specimens (Lithospermum). MASS was used to complete simple geometric measurements of samples, such as length and area, as well as geometric morphological analyses including elliptical Fourier and Procrustes analyses. Principal component analysis (PCA) of data was also completed within the same program.

    Results

    MASS is capable of making desired measurements and analyzing traditional morphometric data as well as landmark and outline data.

    Discussion

    Using MASS, differences were observed among leaves of the three studied taxa, but only in Malus domestica were differences statistically significant or correlated with other morphological features. In the future, MASS could be applied for analysis of other two‐dimensional organs and structures. MASS is available for download at https://github.com/gillianlynnryan/MASS.

     
  5. Abstract

    The Membranome database provides comprehensive structural information on single‐pass (i.e., bitopic) membrane proteins from six evolutionarily distant organisms, including protein–protein interactions, complexes, mutations, experimental structures, and models of transmembrane α‐helical dimers. We present a new version of this database, Membranome 3.0, which was significantly updated by revising the set of 5,758 bitopic proteins and incorporating models generated by AlphaFold 2 in the database. The AlphaFold models were parsed into structural domains located at the different membrane sides, modified to exclude low‐confidence unstructured terminal regions and signal sequences, validated through comparison with available experimental structures, and positioned with respect to membrane boundaries. Membranome 3.0 was re‐developed to facilitate visualization and comparative analysis of multiple 3D structures of proteins that belong to a specified family, complex, biological pathway, or membrane type. New tools for advanced search and analysis of proteins, their interactions, complexes, and mutations were included. The database is freely accessible at https://membranome.org.

     