Abstract An inexpensive and reliable method for molecular crystal structure predictions (CSPs) has been developed. The new CSP protocol starts from a two-dimensional graph of the crystal’s monomer(s) and utilizes no experimental information. Using results of quantum mechanical calculations for molecular dimers, an accurate two-body, rigid-monomer ab initio-based force field (aiFF) for the crystal is developed. Since CSPs with aiFFs are essentially as expensive as with empirical FFs, tens of thousands of plausible polymorphs generated by the crystal packing procedures can be optimized. Here we show the robustness of this protocol, which found the experimental crystal within the 20 most stable predicted polymorphs for each of the 15 investigated molecules. The ranking was further refined by performing periodic density-functional theory (DFT) plus dispersion correction (pDFT+D) calculations for these 20 top-ranked polymorphs, resulting in the experimental crystal ranked as number one for all the systems studied (and the second polymorph, if known, ranked in the top few). Alternatively, the polymorphs generated can be used to improve aiFFs, which also leads to rank one predictions. The proposed CSP protocol should result in aiFFs replacing empirical FFs in CSP research.
Crystal structure predictions for molecules with soft degrees of freedom using intermonomer force fields derived from first principles
A molecular crystal structure prediction (CSP) protocol used in the seventh blind test is presented. The seventh blind test was divided into two stages and included seven targets, with crystals containing from one to three molecules in asymmetric units, monomers built of up to 100 atoms, and all targets containing monomers with flexible degrees of freedom. Some targets were cocrystals and one target was a salt. These diverse targets were treated using a CSP protocol starting from finding the global and local minima conformations of the target molecule. Subsequently, an ab initio two-body rigid-monomer six-dimensional force field (aiFF) was developed for the global-minimum conformer. These aiFFs were then used in CSPs consisting of packing and lattice-energy minimization stages. Flexible-monomer CSPs were used for some targets. To describe the intramonomer FF, either generic empirical FFs or reparametrized FFs of this type were used, with some parameters fitted to ab initio energies of monomers in the latter case. A novel packing procedure was applied for two targets in stage 1. The success rate in the structure generation stage was 15% in submission phase and 54% in post-submission phase, while the corresponding values in the structure rating stage were 33% and 89%. We conclude that the inexpensive conformer-based approach with rigid-monomer CSPs can be recommended for investigations of crystals with flexible monomers. An advantage of this protocol is that it is fully based on first-principles quantum mechanics and generates tailor-made FFs suitable for use in subsequent molecular dynamics simulations investigating temperature-dependent effects. However, empirical intramonomer FFs reparametrized using ab initio data are not yet adequate for CSPs.
- Award ID(s):
- 2313826
- PAR ID:
- 10601464
- Publisher / Repository:
- Acta Crystallographica
- Date Published:
- Journal Name:
- Acta Crystallographica Section B Structural Science, Crystal Engineering and Materials
- Volume:
- 80
- Issue:
- 6
- ISSN:
- 2052-5206
- Page Range / eLocation ID:
- 628 to 655
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
A seventh blind test of crystal structure prediction was organized by the Cambridge Crystallographic Data Centre featuring seven target systems of varying complexity: a silicon and iodine-containing molecule, a copper coordination complex, a near-rigid molecule, a cocrystal, a polymorphic small agrochemical, a highly flexible polymorphic drug candidate, and a polymorphic morpholine salt. In this first of two parts focusing on structure generation methods, many crystal structure prediction (CSP) methods performed well for the small but flexible agrochemical compound, successfully reproducing the experimentally observed crystal structures, while few groups were successful for the systems of higher complexity. A powder X-ray diffraction (PXRD) assisted exercise demonstrated the use of CSP in successfully determining a crystal structure from a low-quality PXRD pattern. The use of CSP in the prediction of likely cocrystal stoichiometry was also explored, demonstrating multiple possible approaches. Crystallographic disorder emerged as an important theme throughout the test as both a challenge for analysis and a major achievement where two groups blindly predicted the existence of disorder for the first time. Additionally, large-scale comparisons of the sets of predicted crystal structures also showed that some methods yield sets that largely contain the same crystal structures.
-
ABSTRACT Accurate prediction of protein–peptide complex structures plays a critical role in structure‐based drug design, including antibody design. Most peptide‐docking benchmark studies were conducted using crystal structures of protein–peptide complexes; as such, the performance of the current peptide docking tools in the practical setting is unknown. Here, the practical setting implies there are no crystal or other experimental structures for the complex, nor for the receptor and peptide. In this work, we have developed a practical docking protocol that incorporated two famous machine learning models, AlphaFold 2 for structural prediction and ANI‐2x for ab initio potential prediction, to achieve a high success rate in modeling protein–peptide complex structures. The docking protocol consists of three major stages. In the first stage, the 3D structure of the receptor is predicted by AlphaFold 2 using the monomer mode, and that of the peptide is predicted by AlphaFold 2 using the multimer mode. We found that it is essential to include the receptor information to generate a high‐quality 3D structure of the peptide. In the second stage, rigid protein–peptide docking is performed using ZDOCK software. In the last stage, the top 10 docking poses are relaxed and refined by ANI‐2x in conjunction with our in‐house geometry optimization algorithm—conjugate gradient with backtracking line search (CG‐BS). CG‐BS was developed by us to more efficiently perform geometry optimization, which takes the potential and force directly from ANI‐2x machine learning models. The docking protocol achieved a very encouraging performance for a set of 62 very challenging protein–peptide systems which had an overall success rate of 34% if only the top 1 docking poses were considered. This success rate increased to 45% if the top 3 docking poses were considered. 
It is emphasized that this encouraging protein–peptide docking performance was achieved without using any crystal or experimental structures.
-
Polymorphism in molecular crystals influences their properties and performance. Crystal structure prediction (CSP) can help explore the crystal structure landscape and discover potentially stable polymorphs computationally. We present a new version of the Genarris open-source code, which generates random molecular crystal structures in all space groups and applies physical constraints on intermolecular distances. The main new feature in Genarris 3.0 is the "Rigid Press" algorithm, which uses a regularized hard-sphere potential to compress the unit cell and achieve a maximally close-packed structure based on purely geometric considerations without performing any energy evaluations. In addition, Genarris 3.0 is interfaced with machine-learned interatomic potentials (MLIPs) to accelerate the exploration of the potential energy landscape. We present a new clustering and down-selection workflow that employs the MACE-OFF23(L) MLIPs to perform geometry optimization and energy ranking in the early stages. We use Genarris 3.0 to successfully predict the structure of six targets: aspirin, Target I and Target XXII from previous CSP blind tests, and the energetic materials HMX, CL-20, and DNI. We further analyze the performance of MACE-OFF23(L) compared to dispersion-inclusive density functional theory (DFT) for geometry relaxation and energy ranking. We find significant variability in the performance of MACE-OFF23(L) across chemically diverse targets with particularly poor performance for energetic materials, which is mitigated by our clustering and down-selection procedure. Genarris 3.0 can thus be used effectively to perform CSP and to generate molecular crystal datasets for training ML models.
-
Abstract Topochemical polymerizations hold the promise of producing high molecular weight and stereoregular single crystalline polymers by first aligning monomers before polymerization. However, monomer modifications often alter the crystal packing and result in non‐reactive polymorphs. Here, we report a systematic study on the side chain functionalization of the bis(indandione) derivative system that can be polymerized under visible light. Precisely engineered side chains help organize the monomer crystals in a one‐dimensional fashion to maintain the topochemical reactivity. By optimizing the side chain length and end group of monomers, the elastic modulus of the resulting polymer single crystals can also be greatly enhanced. Lastly, using ultrasonication, insoluble polymer single crystals can be processed into free‐standing and robust polymer thin films. This work provides new insights on the molecular design of topochemical reactions and paves the way for future applications of this fascinating family of materials.