In this paper, we present the disjunctive Bayesian network (DBN), a novel oncogenetic model with a phylogenetic interpretation. DBN is expressive enough to capture cancer subtypes' trajectories and mutually exclusive relations between alterations from unstratified data.
AutoMeKin2021 is an updated version of tsscds2018, a program for the automated discovery of reaction mechanisms (
- NSF-PAR ID: 10449867
- Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
- Date Published:
- Journal Name: Journal of Computational Chemistry
- Volume: 42
- Issue: 28
- ISSN: 0192-8651
- Page Range / eLocation ID: p. 2036-2048
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Abstract Motivation: Cancer is the process of accumulating genetic alterations that confer selective advantages to tumor cells. The order in which aberrations occur is not arbitrary, and inferring the order of events is challenging due to the lack of longitudinal samples from tumors. Moreover, a network model of oncogenesis should capture biological facts such as distinct progression trajectories of cancer subtypes and patterns of mutual exclusivity of alterations in the same pathways. Results: In cases where the number of studied alterations is small (), we provide an efficient dynamic programming implementation of an exact structure learning method that finds a best DBN in the superexponential search space of networks. In the rare cases where the number of alterations is large, we provide an efficient genetic algorithm in our software package, OncoBN. Through numerous synthetic and real data experiments, we show OncoBN's ability to infer ground truth networks and recover biologically meaningful progression networks. Availability: OncoBN is implemented in R and is available at https://github.com/phillipnicol/OncoBN.
- Abstract In pursuit of scientific discovery, vast collections of unstructured structural and functional images are acquired; however, only an infinitesimally small fraction of this data is rigorously analyzed, with an even smaller fraction ever being published. One method to accelerate scientific discovery is to extract more insight from costly scientific experiments already conducted. Unfortunately, data from scientific experiments tend only to be accessible by the originator who knows the experiments and directives. Moreover, there are no robust methods to search unstructured databases of images to deduce correlations and insight. Here, we develop a machine learning approach to create image similarity projections to search unstructured image databases. To improve these projections, we develop and train a model to include symmetry-aware features. As an exemplar, we use a set of 25,133 piezoresponse force microscopy images collected on diverse materials systems over five years. We demonstrate how this tool can be used for interactive recursive image searching and exploration, highlighting structural similarities at various length scales. This tool justifies continued investment in federated scientific databases with standardized metadata schemas where the combination of filtering and recursive interactive searching can uncover synthesis-structure-property relations. We provide a customizable open-source package (https://github.com/m3-learning/Recursive_Symmetry_Aware_Materials_Microstructure_Explorer) of this interactive tool for researchers to use with their data.
- Abstract One of the Grand Challenges in Science is the construction of the Tree of Life, an evolutionary tree containing several million species, spanning all life on earth. However, the construction of the Tree of Life is enormously computationally challenging, as all the current most accurate methods are either heuristics for NP-hard optimization problems or Bayesian MCMC methods that sample from tree space. One of the most promising approaches for improving scalability and accuracy for phylogeny estimation uses divide-and-conquer: a set of species is divided into overlapping subsets, trees are constructed on the subsets, and then merged together using a “supertree method”. Here, we present Exact-RFS-2, the first polynomial-time algorithm to find an optimal supertree of two trees, using the Robinson-Foulds Supertree (RFS) criterion (a major approach in supertree estimation that is related to maximum likelihood supertrees), and we prove that finding the RFS of three input trees is NP-hard. Exact-RFS-2 is available in open source form on GitHub at https://github.com/yuxilin51/GreedyRFS.
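The Robinson-Foulds criterion named in the abstract above compares trees by the bipartitions that their edges induce on the leaf set. The following is a minimal sketch of that pairwise distance only; it is not the Exact-RFS-2 algorithm and does not build a supertree. It assumes both trees have the same leaves and are written as nested Python tuples of leaf-name strings, a representation chosen here purely for illustration; the function names are hypothetical.

```python
def leaf_set(tree):
    """Return the set of leaf names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial bipartitions (splits) of an unrooted tree.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the same split always has the same representation.
    """
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)  # arbitrary but deterministic choice
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = below if reference not in below else all_leaves - below
        # Skip trivial splits that separate zero or one leaf from the rest.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
        return below

    visit(tree)
    return splits


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: number of splits found in exactly one tree."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("this sketch assumes identical leaf sets")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(t1, t2))
```

In a supertree setting the input trees have different, overlapping leaf sets, so a candidate supertree would first have to be restricted to each input tree's leaves before a distance like this could be summed; that step is omitted here.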
- Premise: Morphometric analysis is a common approach for comparing and categorizing botanical samples; however, completing a suite of analyses using existing tools may require a multi‐stage, multi‐program process. To facilitate streamlined analysis within a single program, Morphological Analysis of Size and Shape (MASS) for leaves was developed. Its utility is demonstrated using exemplar leaf samples from Acer saccharum, Malus domestica, and Lithospermum. Methods: Exemplar samples were obtained from across a single tree (Acer saccharum), three trees in the same species (Malus domestica), and online, digitized herbarium specimens (Lithospermum). MASS was used to complete simple geometric measurements of samples, such as length and area, as well as geometric morphological analyses including elliptical Fourier and Procrustes analyses. Principal component analysis (PCA) of data was also completed within the same program. Results: MASS is capable of making desired measurements and analyzing traditional morphometric data as well as landmark and outline data. Discussion: Using MASS, differences were observed among leaves of the three studied taxa, but only in Malus domestica were differences statistically significant or correlated with other morphological features. In the future, MASS could be applied for analysis of other two‐dimensional organs and structures. MASS is available for download at https://github.com/gillianlynnryan/MASS.
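Procrustes analysis, mentioned in the abstract above, compares landmark configurations after removing differences in position, scale, and rotation. The sketch below illustrates the general technique for two 2D shapes. It is not taken from MASS, and the function names and the choice to encode each landmark as a Python complex number (x + yj) are assumptions made for illustration. Reflections are not considered.

```python
import math


def normalise(points):
    """Centre landmarks on their centroid and scale to unit centroid size."""
    centroid = sum(points) / len(points)
    centred = [p - centroid for p in points]
    size = math.sqrt(sum(abs(p) ** 2 for p in centred))
    if size == 0:
        raise ValueError("all landmarks coincide")
    return [p / size for p in centred]


def procrustes_distance(reference, target):
    """Residual distance after optimally rotating target onto reference."""
    if len(reference) != len(target):
        raise ValueError("shapes need the same number of landmarks")
    z = normalise(reference)
    w = normalise(target)
    # For complex landmarks, the least-squares rotation is the unit
    # complex number conj(s) / |s|, where s = sum(conj(z_k) * w_k).
    s = sum(zk.conjugate() * wk for zk, wk in zip(z, w))
    rotation = s.conjugate() / abs(s) if abs(s) > 0 else 1
    residual = sum(abs(zk - rotation * wk) ** 2 for zk, wk in zip(z, w))
    return math.sqrt(residual)


triangle = [0 + 0j, 2 + 0j, 0 + 1j]
# The same triangle rotated by 0.7 rad, scaled by 3, and shifted.
turn = complex(math.cos(0.7), math.sin(0.7))
moved = [3 * turn * p + (5 - 2j) for p in triangle]
print(procrustes_distance(triangle, moved))  # close to zero
```

A distance near zero indicates that two landmark sets describe the same shape up to translation, scaling, and rotation; larger values indicate genuine shape differences.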
- Abstract The Membranome database provides comprehensive structural information on single‐pass (i.e., bitopic) membrane proteins from six evolutionarily distant organisms, including protein–protein interactions, complexes, mutations, experimental structures, and models of transmembrane α‐helical dimers. We present a new version of this database, Membranome 3.0, which was significantly updated by revising the set of 5,758 bitopic proteins and incorporating models generated by AlphaFold 2 in the database. The AlphaFold models were parsed into structural domains located at the different membrane sides, modified to exclude low‐confidence unstructured terminal regions and signal sequences, validated through comparison with available experimental structures, and positioned with respect to membrane boundaries. Membranome 3.0 was re‐developed to facilitate visualization and comparative analysis of multiple 3D structures of proteins that belong to a specified family, complex, biological pathway, or membrane type. New tools for advanced search and analysis of proteins, their interactions, complexes, and mutations were included. The database is freely accessible at https://membranome.org.