This content will become publicly available on December 1, 2025

Title: The seventh blind test of crystal structure prediction: structure generation methods
A seventh blind test of crystal structure prediction was organized by the Cambridge Crystallographic Data Centre featuring seven target systems of varying complexity: a silicon and iodine-containing molecule, a copper coordination complex, a near-rigid molecule, a cocrystal, a polymorphic small agrochemical, a highly flexible polymorphic drug candidate, and a polymorphic morpholine salt. In this first of two parts focusing on structure generation methods, many crystal structure prediction (CSP) methods performed well for the small but flexible agrochemical compound, successfully reproducing the experimentally observed crystal structures, while few groups were successful for the systems of higher complexity. A powder X-ray diffraction (PXRD) assisted exercise demonstrated the use of CSP in successfully determining a crystal structure from a low-quality PXRD pattern. The use of CSP in the prediction of likely cocrystal stoichiometry was also explored, demonstrating multiple possible approaches. Crystallographic disorder emerged as an important theme throughout the test as both a challenge for analysis and a major achievement where two groups blindly predicted the existence of disorder for the first time. Additionally, large-scale comparisons of the sets of predicted crystal structures also showed that some methods yield sets that largely contain the same crystal structures.
Award ID(s):
2118890
PAR ID:
10579945
Publisher / Repository:
International Union of Crystallography
Date Published:
Journal Name:
Acta Crystallographica Section B Structural Science, Crystal Engineering and Materials
Volume:
80
Issue:
6
ISSN:
2052-5206
Page Range / eLocation ID:
517 to 547
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. A seventh blind test of crystal structure prediction has been organized by the Cambridge Crystallographic Data Centre. The results are presented in two parts, with this second part focusing on methods for ranking crystal structures in order of stability. The exercise involved standardized sets of structures seeded from a range of structure generation methods. Participants from 22 groups applied several periodic DFT-D methods, machine learned potentials, force fields derived from empirical data or quantum chemical calculations, and various combinations of the above. In addition, one non-energy-based scoring function was used. Results showed that periodic DFT-D methods overall agreed with experimental data within expected error margins, while one machine learned model, applying system-specific AIMnet potentials, agreed with experiment in many cases demonstrating promise as an efficient alternative to DFT-based methods. For target XXXII, a consensus was reached across periodic DFT methods, with consistently high predicted energies of experimental forms relative to the global minimum (above 4 kJ mol⁻¹ at both low and ambient temperatures) suggesting a more stable polymorph is likely not yet observed. The calculation of free energies at ambient temperatures offered improvement of predictions only in some cases (for targets XXVII and XXXI). Several avenues for future research have been suggested, highlighting the need for greater efficiency considering the vast amounts of resources utilized in many cases.
  2. The nucleobase derivative 5‐aminouracil (AUr, C4H5N3O2) is of interest for its biological activity, yet the solid state structure of this compound has remained elusive owing to its propensity to crystallize as aggregates of microcrystalline particles. Here we report the first single‐crystal structure of AUr determined from synchrotron X‐ray diffraction data. An early crystal structure prediction effort, which assumed that AUr was rigid in the isolated molecule optimized conformation, provided several poor matches to the simulated PXRD pattern. Revisiting these crystal structures by periodic electronic level modelling (PBE‐TS optimization) gave more realistic relative lattice energies, but a good match to the experimental powder pattern required using the experimental cell parameters. PXRD and Raman spectroscopy suggest that phase impurities may be present in the bulk crystallization product, though the identity of alternative polymorphs could not be confirmed on the basis of the data available.
  3. The goal of molecular crystal structure prediction (CSP) is to find all the plausible polymorphs for a given molecule. This requires performing global optimization over a high-dimensional search space. Genetic algorithms (GAs) perform global optimization by starting from an initial population of structures and generating new candidate structures by breeding the fittest structures in the population. Typically, the fitness function is based on relative lattice energies, such that structures with lower energies have a higher probability of being selected for mating. GAs may be adapted to perform multi-modal optimization by using evolutionary niching methods that support the formation of several stable subpopulations and suppress the over-sampling of densely populated regions. Evolutionary niching is implemented in the GAtor molecular crystal structure prediction code by using techniques from machine learning to dynamically cluster the population into niches of structural similarity. A cluster-based fitness function is constructed such that structures in less populated clusters have a higher probability of being selected for breeding. Here, the effects of evolutionary niching are investigated for the crystal structure prediction of 1,3-dibromo-2-chloro-5-fluorobenzene. Using the cluster-based fitness function increases the success rate of generating the experimental structure and additional low-energy structures with similar packing motifs. 
  4. During in silico crystal structure prediction of organic molecules, millions of candidate structures are often generated. These candidates must be compared to remove duplicates prior to further analysis (e.g. optimization with electronic structure methods) and ultimately compared with structures determined experimentally. The agreement of predicted and experimental structures forms the basis of evaluating the results from the Cambridge Crystallographic Data Centre (CCDC) blind assessment of crystal structure prediction, which further motivates the pursuit of rigorous alignments. Evaluating crystal structure packings using coordinate root-mean-square deviation (RMSD) for N molecules (or N asymmetric units) in a reproducible manner requires metrics to describe the shape of the compared molecular clusters to account for alternative approaches used to prioritize selection of molecules. Described here is a flexible algorithm called Progressive Alignment of Crystals (PAC) to evaluate crystal packing similarity using coordinate RMSD and introducing the radius of gyration (R_g) as a metric to quantify the shape of the superimposed clusters. It is shown that the absence of metrics to describe cluster shape adds ambiguity to the results of the CCDC blind assessments because it is not possible to determine whether the superposition algorithm has prioritized tightly packed molecular clusters (i.e. to minimize R_g) or prioritized reduced RMSD (i.e. via possibly elongated clusters with relatively larger R_g). For example, it is shown that when the PAC algorithm described here uses single linkage to prioritize molecules for inclusion in the superimposed clusters, the results are nearly identical to those calculated by the widely used program COMPACK. However, the lower R_g values obtained by the use of average linkage are favored for molecule prioritization because the resulting RMSDs more equally reflect the importance of packing along each dimension. It is shown that the PAC algorithm is faster than COMPACK when using a single process and its utility for biomolecular crystals is demonstrated. Finally, parallel scaling up to 64 processes in the open-source code Force Field X is presented.
  5. The two-step nucleation (TSN) theory and crystal structure prediction (CSP) techniques are two disjointed yet popular methods to predict nucleation rate and crystal structure, respectively. The TSN theory is a well-established mechanism to describe the nucleation of a wide range of crystalline materials in different solvents. However, it has never been expanded to predict the crystal structure or polymorphism. On the contrary, the existing CSP techniques only empirically account for the solvent effects. As a result, the TSN theory and CSP techniques continue to evolve as separate methods to predict two essential attributes of nucleation – rate and structure. Here we bridge this gap and show for the first time how a crystal structure is formed within the framework of TSN theory. A sequential desolvation mechanism is proposed in TSN, where the first step involves partial desolvation to form dense clusters followed by selective desolvation of functional groups directing the formation of crystal structure. We investigate the effect of the specific interaction on the degree of solvation around different functional groups of glutamic acid molecules using molecular simulations. The simulated energy landscape and activation barriers at increasing supersaturations suggest sequential and selective desolvation. We validate computationally and experimentally that the crystal structure formation and polymorph selection are due to a previously unrecognized consequence of supersaturation-driven asymmetric desolvation of molecules. 
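Item 4 in the list above contrasts two quantities for a superimposed cluster: coordinate RMSD and the radius of gyration. As an illustration only, the following is a minimal sketch of the textbook definitions of both. It is not the PAC or COMPACK implementation. It assumes the two point sets are already superimposed and listed in the same order, and that every point carries unit mass. The function names and the toy coordinates are hypothetical.

```python
import math

def rmsd(a, b):
    """Coordinate RMSD between two equally ordered, already superimposed 3D point sets."""
    if len(a) != len(b) or not a:
        raise ValueError("point sets must be non-empty and of equal length")
    # Sum squared differences over every point and every Cartesian component.
    sq = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(sq / len(a))

def radius_of_gyration(points):
    """Unweighted radius of gyration: RMS distance of the points from their centroid."""
    if not points:
        raise ValueError("point set must be non-empty")
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    sq = sum((p[i] - centroid[i]) ** 2 for p in points for i in range(3))
    return math.sqrt(sq / n)

# Toy data (assumed for illustration): a compact square and an elongated row.
compact = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
elongated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
# The same square displaced by 0.1 along x, standing in for a predicted structure.
shifted = [(x + 0.1, y, z) for x, y, z in compact]

if __name__ == "__main__":
    print("Rg compact:  ", radius_of_gyration(compact))
    print("Rg elongated:", radius_of_gyration(elongated))
    print("RMSD compact vs shifted:", rmsd(compact, shifted))
```

In this toy case the elongated row has the larger radius of gyration, which is the sense in which a shape metric can distinguish a tightly packed cluster from a stretched one even when their RMSD values are comparable.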