Identifying thermodynamically stable crystal structures remains a key challenge in materials chemistry. Computational crystal structure prediction (CSP) workflows typically rank candidate structures by lattice energy to assess relative stability. Approaches using self-consistent first-principles calculations become prohibitively expensive, especially when millions of energy evaluations are required for complex molecular systems with many atoms per unit cell. Here, we provide a detailed analysis of our methodology and results from the seventh blind test of crystal structure prediction organized by the Cambridge Crystallographic Data Centre (CCDC). We present an approach that significantly accelerates CSP by training target-specific machine learned interatomic potentials (MLIPs). AIMNet2 MLIPs are trained on density functional theory (DFT) calculations of molecular clusters, herein referred to as n-mers. We demonstrate that potentials trained on gas phase dispersion-corrected DFT reference data of n-mers successfully extend to crystalline environments, accurately characterizing the CSP landscape and correctly ranking structures by relative stability. Our methodology effectively captures the underlying physics of thermodynamic crystal stability using only molecular cluster data, avoiding the need for expensive periodic calculations. The performance of target-specific AIMNet2 interatomic potentials is illustrated across diverse chemical systems relevant to pharmaceutical, optoelectronic, and agrochemical applications, demonstrating their promise as efficient alternatives to full DFT calculations for routine CSP tasks.
A data-driven and topological mapping approach for the a priori prediction of stable molecular crystalline hydrates
Predictions of the structures of stoichiometric, fractional, or nonstoichiometric hydrates of organic molecular crystals are immensely challenging due to the extensive search space of different water contents, host molecular placements throughout the crystal, and internal molecular conformations. However, the dry frameworks of these hydrates, especially for nonstoichiometric or isostructural dehydrates, can often be predicted from a standard anhydrous crystal structure prediction (CSP) protocol. Inspired by developments in the field of drug binding, we introduce an efficient data-driven and topologically aware approach for predicting organic molecular crystal hydrate structures through a mapping of water positions within the crystal structure. The method does not require a priori specification of water content and can, therefore, predict stoichiometric, fractional, and nonstoichiometric hydrate structures. This approach, which we term a mapping approach for crystal hydrates (MACH), establishes a set of rules for systematic determination of favorable positions for water insertion within predicted or experimental crystal structures based on considerations of the chemical features of local environments and void regions. The proposed approach is tested on hydrates of three pharmaceutically relevant compounds that exhibit diverse crystal packing motifs and void environments characteristic of hydrate structures. Overall, we show that our mapping approach introduces an advance in the efficient performance of hydrate CSP through generation of stable hydrate stoichiometries at low cost and should be considered an integral component for CSP workflows.
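The idea of locating void regions that could host water can be illustrated with a minimal sketch. This is not the MACH rule set described in the abstract, which also weighs the chemical features of local environments; it is only a generic grid-based void search. The function name `find_void_points`, the cubic-cell restriction, the probe radius, and the grid resolution are all assumptions made for illustration.

```python
import numpy as np

def find_void_points(frac_coords, cell_length, probe_radius, grid_n=10):
    """Return fractional grid points farther than probe_radius from every atom.

    Hypothetical helper: assumes a cubic cell with edge cell_length (angstrom),
    so the minimum-image convention can be applied axis by axis.
    """
    atoms = np.asarray(frac_coords, dtype=float) % 1.0
    axis = np.arange(grid_n) / grid_n
    # All grid points in fractional coordinates, shape (grid_n**3, 3)
    grid = np.array(np.meshgrid(axis, axis, axis, indexing="ij")).reshape(3, -1).T
    # Fractional displacement from each grid point to each atom,
    # wrapped to the nearest periodic image
    delta = grid[:, None, :] - atoms[None, :, :]
    delta -= np.round(delta)
    dist = np.linalg.norm(delta, axis=2) * cell_length
    # Keep points whose nearest atom is farther away than the probe radius
    return grid[dist.min(axis=1) > probe_radius]

# Toy input: one atom at the origin of a 10 angstrom cubic cell
voids = find_void_points([[0.0, 0.0, 0.0]], cell_length=10.0, probe_radius=2.5)
print(len(voids), "candidate void points")
```

In a real workflow the surviving grid points would still need to be clustered and scored against the surrounding hydrogen-bond donors and acceptors before any water molecule is placed.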
- Award ID(s): 1955381
- PAR ID: 10401653
- Date Published:
- Journal Name: Proceedings of the National Academy of Sciences
- Volume: 119
- Issue: 43
- ISSN: 0027-8424
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Hydrate formation is often unavoidable during crystallization, leading to performance degradation of pharmaceuticals and energetics. In some cases, water molecules trapped within crystal lattices can be replaced by hydrogen peroxide, improving the solubility of drugs and the detonation performance of explosives. The present work compares hydrates and hydrogen peroxide solvates in two ways: (1) analyzing structural motifs present in crystal structures accessed from the Cambridge Structural Database and (2) developing potential energy surfaces for water and hydrogen peroxide interacting with functional groups of interest at geometries relevant to the solid state. By elucidating fundamental differences in local interactions that can be formed with molecules of hydrogen peroxide and/or water, the analyses presented here provide a foundation for the design and selection of candidate molecules for the formation of hydrogen peroxide solvates.
Microscopic insights on clathrate hydrate growth from non-equilibrium molecular dynamics simulations
Clathrate hydrates form and grow at interfaces. Understanding the relevant molecular processes is crucial for developing hydrate-based technologies. Many computational studies focus on hydrate growth within the aqueous phase using the ‘direct coexistence method’, which is limited in its ability to investigate hydrate film growth at hydrocarbon–water interfaces. To overcome this shortcoming, a new simulation setup is presented here, which allows us to study the growth of a methane hydrate nucleus in a system where oil–water, hydrate–water, and hydrate–oil interfaces are all simultaneously present, thereby mimicking experimental setups. Using this setup, hydrate growth is studied here under the influence of two additives, a polyvinylcaprolactam oligomer and sodium dodecyl sulfate, at varying concentrations. Our results confirm that hydrate films grow along the oil–water interface, in general agreement with visual experimental observations; growth, albeit slower, also occurs at the hydrate–water interface, the interface most often interrogated via simulations. The results obtained demonstrate that the additives present within curved interfaces control the solubility of methane in the aqueous phase, which correlates with hydrate growth rate. Building on our simulation insights, we suggest that by combining data for the potential of mean force profile for methane transport across the oil–water interface and for the average free energy required to perturb a flat interface, it is possible to predict the performance of additives used to control hydrate growth. These insights could be helpful to achieve optimal methane storage in hydrates, one of many applications that are attracting significant fundamental and applied interest.
The accurate prediction of suitable chiral stationary phases (CSPs) for resolving the enantiomers of a given compound poses a significant challenge in chiral chromatography. Previous attempts at developing machine learning models for structure-based CSP prediction have primarily relied on 1D SMILES strings (the simplified molecular-input line-entry system, SMILES, is a line notation for describing the structure of chemical species using short ASCII strings) or 2D graphical representations of molecular structures, and have met with only limited success. In this study, we apply the recently developed 3D molecular conformation representation learning algorithm, which uses rapid conformational analysis and point clouds of atom positions in 3D space, enabling efficient chemical structure-based machine learning. By harnessing the power of the rapid 3D molecular representation learning and a dataset comprising over 300,000 chromatographic enantioseparation records sourced from the literature, our models afford notable improvements for the chemical structure-based choice of an appropriate CSP for enantioseparation, paving the way for more efficient and informed decision-making in the field of chiral chromatography.
A seventh blind test of crystal structure prediction was organized by the Cambridge Crystallographic Data Centre featuring seven target systems of varying complexity: a silicon and iodine-containing molecule, a copper coordination complex, a near-rigid molecule, a cocrystal, a polymorphic small agrochemical, a highly flexible polymorphic drug candidate, and a polymorphic morpholine salt. In this first of two parts focusing on structure generation methods, many crystal structure prediction (CSP) methods performed well for the small but flexible agrochemical compound, successfully reproducing the experimentally observed crystal structures, while few groups were successful for the systems of higher complexity. A powder X-ray diffraction (PXRD) assisted exercise demonstrated the use of CSP in successfully determining a crystal structure from a low-quality PXRD pattern. The use of CSP in the prediction of likely cocrystal stoichiometry was also explored, demonstrating multiple possible approaches. Crystallographic disorder emerged as an important theme throughout the test as both a challenge for analysis and a major achievement where two groups blindly predicted the existence of disorder for the first time. Additionally, large-scale comparisons of the sets of predicted crystal structures also showed that some methods yield sets that largely contain the same crystal structures.

