skip to main content

Title: GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery. Existing generative models have several drawbacks including lack of modeling important molecular geometry elements (e.g. torsion angles), separate optimization stages prone to error accumulation, and the need for structure fine-tuning based on approximate classical force-fields or computationally expensive methods such as metadynamics with approximate quantum mechanics calculations at each geometry. We propose GeoMol--an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate distributions of low-energy molecular 3D conformers. Leveraging the power of message passing neural networks (MPNNs) to capture local and global graph information, we predict local atomic 3D structures and torsion angles, avoiding unnecessary over-parameterization of the geometric degrees of freedom (e.g. one angle per non-terminal bond). Such local predictions suffice both for the training loss computation, as well as for the full deterministic conformer assembly (at test time). We devise a non-adversarial optimal transport based loss function to promote diverse conformer generation. GeoMol predominantly outperforms popular open-source, commercial, or state-of-the-art machine learning (ML) models, while achieving significant speed-ups. We expect such differentiable 3D structure generators to significantly impact molecular modeling and related applications.
Authors:
; ; ; ; ; ;
Award ID(s):
1918839
Publication Date:
NSF-PAR ID:
10320124
Journal Name:
Advances in neural information processing systems
ISSN:
1049-5258
Sponsoring Org:
National Science Foundation
More Like this
  1. Recently, molecular fingerprints extracted from three-dimensional (3D) structures using advanced mathematics, such as algebraic topology, differential geometry, and graph theory have been paired with efficient machine learning, especially deep learning algorithms to outperform other methods in drug discovery applications and competitions. This raises the question of whether classical 2D fingerprints are still valuable in computer-aided drug discovery. This work considers 23 datasets associated with four typical problems, namely protein–ligand binding, toxicity, solubility and partition coefficient to assess the performance of eight 2D fingerprints. Advanced machine learning algorithms including random forest, gradient boosted decision tree, single-task deep neural network and multitask deep neural network are employed to construct efficient 2D-fingerprint based models. Additionally, appropriate consensus models are built to further enhance the performance of 2D-fingerprint-based methods. It is demonstrated that 2D-fingerprint-based models perform as well as the state-of-the-art 3D structure-based models for the predictions of toxicity, solubility, partition coefficient and protein–ligand binding affinity based on only ligand information. However, 3D structure-based models outperform 2D fingerprint-based methods in complex-based protein–ligand binding affinity predictions.
  2. With the recent advancement of deep learning, molecular representation learning -- automating the discovery of feature representation of molecular structure, has attracted significant attention from both chemists and machine learning researchers. Deep learning can facilitate a variety of downstream applications, including bio-property prediction, chemical reaction prediction, etc. Despite the fact that current SMILES string or molecular graph molecular representation learning algorithms (via sequence modeling and graph neural networks, respectively) have achieved promising results, there is no work to integrate the capabilities of both approaches in preserving molecular characteristics (e.g, atomic cluster, chemical bond) for further improvement. In this paper, we propose GraSeq, a joint graph and sequence representation learning model for molecular property prediction. Specifically, GraSeq makes a complementary combination of graph neural networks and recurrent neural networks for modeling two types of molecular inputs, respectively. In addition, it is trained by the multitask loss of unsupervised reconstruction and various downstream tasks, using limited size of labeled datasets. In a variety of chemical property prediction tests, we demonstrate that our GraSeq model achieves better performance than state-of-the-art approaches.
  3. Embedding properties of network realizations of dissipative reduced order models Jörn Zimmerling, Mikhail Zaslavsky,Rob Remis, Shasri Moskow, Alexander Mamonov, Murthy Guddati, Vladimir Druskin, and Liliana Borcea Mathematical Sciences Department, Worcester Polytechnic Institute https://www.wpi.edu/people/vdruskin Abstract Realizations of reduced order models of passive SISO or MIMO LTI problems can be transformed to tridiagonal and block-tridiagonal forms, respectively, via dierent modications of the Lanczos algorithm. Generally, such realizations can be interpreted as ladder resistor-capacitor-inductor (RCL) networks. They gave rise to network syntheses in the rst half of the 20th century that was at the base of modern electronics design and consecutively to MOR that tremendously impacted many areas of engineering (electrical, mechanical, aerospace, etc.) by enabling ecient compression of the underlining dynamical systems. In his seminal 1950s works Krein realized that in addition to their compressing properties, network realizations can be used to embed the data back into the state space of the underlying continuum problems. In more recent works of the authors Krein's ideas gave rise to so-called nite-dierence Gaussian quadrature rules (FDGQR), allowing to approximately map the ROM state-space representation to its full order continuum counterpart on a judicially chosen grid. Thus, the state variables can be accessed directly from themore »transfer function without solving the full problem and even explicit knowledge of the PDE coecients in the interior, i.e., the FDGQR directly learns" the problem from its transfer function. This embedding property found applications in PDE solvers, inverse problems and unsupervised machine learning. Here we show a generalization of this approach to dissipative PDE problems, e.g., electromagnetic and acoustic wave propagation in lossy dispersive media. Potential applications include solution of inverse scattering problems in dispersive media, such as seismic exploration, radars and sonars. To x the idea, we consider a passive irreducible SISO ROM fn(s) = Xn j=1 yi s + σj , (62) assuming that all complex terms in (62) come in conjugate pairs. We will seek ladder realization of (62) as rjuj + vj − vj−1 = −shˆjuj , uj+1 − uj + ˆrj vj = −shj vj , (63) for j = 0, . . . , n with boundary conditions un+1 = 0, v1 = −1, and 4n real parameters hi, hˆi, ri and rˆi, i = 1, . . . , n, that can be considered, respectively, as the equivalent discrete inductances, capacitors and also primary and dual conductors. Alternatively, they can be viewed as respectively masses, spring stiness, primary and dual dampers of a mechanical string. Reordering variables would bring (63) into tridiagonal form, so from the spectral measure given by (62 ) the coecients of (63) can be obtained via a non-symmetric Lanczos algorithm written in J-symmetric form and fn(s) can be equivalently computed as fn(s) = u1. The cases considered in the original FDGQR correspond to either (i) real y, θ or (ii) real y and imaginary θ. Both cases are covered by the Stieltjes theorem, that yields in case (i) real positive h, hˆ and trivial r, rˆ, and in case (ii) real positive h,r and trivial hˆ,rˆ. This result allowed us a simple interpretation of (62) as the staggered nite-dierence approximation of the underlying PDE problem [2]. For PDEs in more than one variables (including topologically rich data-manifolds), a nite-dierence interpretation is obtained via a MIMO extensions in block form, e.g., [4, 3]. The main diculty of extending this approach to general passive problems is that the Stieltjes theory is no longer applicable. Moreover, the tridiagonal realization of a passive ROM transfer function (62) via the ladder network (63) cannot always be obtained in port-Hamiltonian form, i.e., the equivalent primary and dual conductors may change sign [1]. 100 Embedding of the Stieltjes problems, e.g., the case (i) was done by mapping h and hˆ into values of acoustic (or electromagnetic) impedance at grid cells, that required a special coordinate stretching (known as travel time coordinate transform) for continuous problems. Likewise, to circumvent possible non-positivity of conductors for the non-Stieltjes case, we introduce an additional complex s-dependent coordinate stretching, vanishing as s → ∞ [1]. This stretching applied in the discrete setting induces a diagonal factorization, removes oscillating coecients, and leads to an accurate embedding for moderate variations of the coecients of the continuum problems, i.e., it maps discrete coecients onto the values of their continuum counterparts. Not only does this embedding yields an approximate linear algebraic algorithm for the solution of the inverse problems for dissipative PDEs, it also leads to new insight into the properties of their ROM realizations. We will also discuss another approach to embedding, based on Krein-Nudelman theory [5], that results in special data-driven adaptive grids. References [1] Borcea, Liliana and Druskin, Vladimir and Zimmerling, Jörn, A reduced order model approach to inverse scattering in lossy layered media, Journal of Scientic Computing, V. 89, N1, pp. 136,2021 [2] Druskin, Vladimir and Knizhnerman, Leonid, Gaussian spectral rules for the three-point second dierences: I. A two-point positive denite problem in a semi-innite domain, SIAM Journal on Numerical Analysis, V. 37, N 2, pp.403422, 1999 [3] Druskin, Vladimir and Mamonov, Alexander V and Zaslavsky, Mikhail, Distance preserving model order reduction of graph-Laplacians and cluster analysis, Druskin, Vladimir and Mamonov, Alexander V and Zaslavsky, Mikhail, Journal of Scientic Computing, V. 90, N 1, pp 130, 2022 [4] Druskin, Vladimir and Moskow, Shari and Zaslavsky, Mikhail LippmannSchwingerLanczos algorithm for inverse scattering problems, Inverse Problems, V. 37, N. 7, 2021, [5] Mark Adolfovich Nudelman The Krein String and Characteristic Functions of Maximal Dissipative Operators, Journal of Mathematical Sciences, 2004, V 124, pp 49184934 Go back to Plenary Speakers Go back to Speakers Go back« less
  4. Observational estimates of Antarctic ice loss have accelerated in recent decades, and worst-case scenarios of modeling studies have suggested potentially catastrophic sea level rise (~2 meters) by the end of the century. However, modeled contributions to global mean sea level from the Antarctic ice-sheet (AIS) in the 21st century are highly uncertain, in part because ice-sheet model parameters are poorly constrained. Individual ice-sheet model runs are also deterministic and not computationally efficient enough to generate the continuous probability distributions required for incorporation into a holistic framework of probabilistic sea-level projections. To address these shortfalls, we statistically emulate an ice-sheet model using Gaussian Process (GP) regression. GP modeling is a non-parametric machine-learning technique which maps inputs (e.g. forcing or model parameters) to target outputs (e.g. sea-level contributions from the Antarctic ice-sheet) and has the inherent and important advantage that emulator uncertainty is explicitly quantified. We construct emulators for the last interglacial period and an RCP8.5 scenario, and separately for the western, eastern, and total AIS. Separate emulation of western and eastern AIS is important because their evolutions and physical responses to climate forcing are distinct. The emulators are trained on 196 ensemble members for each scenario, composed by varying the parametersmore »of maximum rate of ice-cliff wastage and the coefficient of hydrofracturing. We condition the emulators on last interglacial proxy sea-level records and modern GRACE measurements and exclude poor-fitting ensemble members. The resulting emulators are sampled to produce probability distributions that fill intermediate gaps between discrete ice-sheet model outcomes. We invert emulated high and low probability sea-level contributions in 2100 to explore 21st century evolution pathways; results highlight the deep uncertainty of ice-sheet model physics and the importance of using observations to narrow the range of parameters. Our approach is designed to be flexible such that other ice-sheet models or parameter spaces may be substituted and explored with the emulator.« less
  5. Despite its potential to overcome the design and processing barriers of traditional subtractive and formative manufacturing techniques, the use of laser powder bed fusion (LPBF) metal additive manufacturing is currently limited due to its tendency to create flaws. A multitude of LPBF-related flaws, such as part-level deformation, cracking, and porosity are linked to the spatiotemporal temperature distribution in the part during the process. The temperature distribution, also called the thermal history, is a function of several factors encompassing material properties, part geometry and orientation, processing parameters, placement of supports, among others. These broad range of factors are difficult and expensive to optimize through empirical testing alone. Consequently, fast and accurate models to predict the thermal history are valuable for mitigating flaw formation in LPBF-processed parts. In our prior works, we developed a graph theory-based approach for predicting the temperature distribution in LPBF parts. This mesh-free approach was compared with both non-proprietary and commercial finite element packages, and the thermal history predictions were experimentally validated with in- situ infrared thermal imaging data. It was found that the graph theory-derived thermal history predictions converged within 30–50% of the time of non-proprietary finite element analysis for a similar level of prediction error. However,more »these prior efforts were based on small prismatic and cylinder-shaped LPBF parts. In this paper, our objective was to scale the graph theory approach to predict the thermal history of large volume, complex geometry LPBF parts. To realize this objective, we developed and applied three computational strategies to predict the thermal history of a stainless steel (SAE 316L) impeller having outside diameter 155 mm and vertical height 35 mm (700 layers). The impeller was processed on a Renishaw AM250 LPBF system and required 16 h to complete. During the process, in-situ layer-by-layer steady state surface temperature measurements for the impeller were obtained using a calibrated longwave infrared thermal camera. As an example of the outcome, on implementing one of the three strategies reported in this work, which did not reduce or simplify the part geometry, the thermal history of the impeller was predicted with approximate mean absolute error of 6% (standard deviation 0.8%) and root mean square error 23 K (standard deviation 3.7 K). Moreover, the thermal history was simulated within 40 min using desktop computing, which is considerably less than the 16 h required to build the impeller part. Furthermore, the graph theory thermal history predictions were compared with a proprietary LPBF thermal modeling software and non-proprietary finite element simulation. For a similar level of root mean square error (28 K), the graph theory approach converged in 17 min, vs. 4.5 h for non-proprietary finite element analysis.« less