Identifying thermodynamically stable crystal structures remains a key challenge in materials chemistry. Computational crystal structure prediction (CSP) workflows typically rank candidate structures by lattice energy to assess relative stability. Approaches using self-consistent first-principles calculations become prohibitively expensive, especially when millions of energy evaluations are required for complex molecular systems with many atoms per unit cell. Here, we provide a detailed analysis of our methodology and results from the seventh blind test of crystal structure prediction organized by the Cambridge Crystallographic Data Centre (CCDC). We present an approach that significantly accelerates CSP by training target-specific machine learned interatomic potentials (MLIPs). AIMNet2 MLIPs are trained on density functional theory (DFT) calculations of molecular clusters, herein referred to as n-mers. We demonstrate that potentials trained on gas phase dispersion-corrected DFT reference data of n-mers successfully extend to crystalline environments, accurately characterizing the CSP landscape and correctly ranking structures by relative stability. Our methodology effectively captures the underlying physics of thermodynamic crystal stability using only molecular cluster data, avoiding the need for expensive periodic calculations. The performance of target-specific AIMNet2 interatomic potentials is illustrated across diverse chemical systems relevant to pharmaceutical, optoelectronic, and agrochemical applications, demonstrating their promise as efficient alternatives to full DFT calculations for routine CSP tasks. 
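As a minimal sketch of the ranking step described above (ordering candidate structures by lattice energy), the snippet below normalizes each total energy by the number of molecules in the unit cell and reports energies relative to the most stable candidate. The function name, the tuple layout, the kJ/mol units, and the example numbers are illustrative assumptions, not values or code from the paper.

```python
def rank_by_lattice_energy(candidates):
    """Rank candidate structures by energy per molecule.

    candidates: list of (name, total_energy_kj_mol, n_molecules_in_cell).
    Returns (name, relative_energy) pairs, most stable first.
    The input layout is a hypothetical convention for this sketch.
    """
    # Normalize so cells with different numbers of molecules are comparable.
    per_molecule = [(name, energy / z) for name, energy, z in candidates]
    e_min = min(energy for _, energy in per_molecule)
    ranked = sorted(per_molecule, key=lambda item: item[1])
    # Report energies relative to the lowest-energy candidate.
    return [(name, energy - e_min) for name, energy in ranked]


# Made-up energies, for illustration only.
candidates = [
    ("form_A", -400.0, 4),
    ("form_B", -196.0, 2),
    ("form_C", -808.0, 8),
]
ranking = rank_by_lattice_energy(candidates)
```

In a workflow like the one the abstract describes, the energies would come from the trained potential rather than from hard-coded numbers.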
                    This content will become publicly available on June 30, 2026
                            
                            Genarris 3.0: Generating Close-Packed Molecular Crystal Structures with Rigid Press
                        
                    
    
            Polymorphism in molecular crystals influences their properties and performance. Crystal structure prediction (CSP) can help explore the crystal structure landscape and discover potentially stable polymorphs computationally. We present a new version of the Genarris open-source code, which generates random molecular crystal structures in all space groups and applies physical constraints on intermolecular distances. The main new feature in Genarris 3.0 is the "Rigid Press" algorithm, which uses a regularized hard-sphere potential to compress the unit cell and achieve a maximally close-packed structure based on purely geometric considerations without performing any energy evaluations. In addition, Genarris 3.0 is interfaced with machine-learned interatomic potentials (MLIPs) to accelerate the exploration of the potential energy landscape. We present a new clustering and down-selection workflow that employs the MACE-OFF23(L) MLIPs to perform geometry optimization and energy ranking in the early stages. We use Genarris 3.0 to successfully predict the structure of six targets: aspirin, Target I and Target XXII from previous CSP blind tests, and the energetic materials HMX, CL-20, and DNI. We further analyze the performance of MACE-OFF23(L) compared to dispersion-inclusive density functional theory (DFT) for geometry relaxation and energy ranking. We find significant variability in the performance of MACE-OFF23(L) across chemically diverse targets with particularly poor performance for energetic materials, which is mitigated by our clustering and down-selection procedure. Genarris 3.0 can thus be used effectively to perform CSP and to generate molecular crystal datasets for training ML models.
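The idea of applying physical constraints on intermolecular distances can be illustrated with a simple geometric check. The sketch below rejects a pair of molecules if any two atoms from different molecules sit closer than a fraction `sr` of the sum of their van der Waals radii. This is not the Genarris implementation or the Rigid Press algorithm: the function name, the `sr` default, the approximate Bondi-type radii, and the neglect of periodic images are all assumptions made for illustration.

```python
import math

# Approximate van der Waals radii in angstroms (Bondi-type values, assumed here).
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}


def too_close(mol_a, mol_b, sr=0.85):
    """Return True if any intermolecular atom pair violates the distance constraint.

    mol_a, mol_b: lists of (element, (x, y, z)) in Cartesian angstroms.
    sr: hypothetical scaling factor applied to the sum of vdW radii.
    Periodic images are ignored in this toy version.
    """
    for element_a, position_a in mol_a:
        for element_b, position_b in mol_b:
            threshold = sr * (VDW_RADII[element_a] + VDW_RADII[element_b])
            if math.dist(position_a, position_b) < threshold:
                return True
    return False


# Two single-atom "molecules": an O...H contact at 2.0 A and at 3.0 A.
oxygen = [("O", (0.0, 0.0, 0.0))]
near_h = [("H", (2.0, 0.0, 0.0))]
far_h = [("H", (3.0, 0.0, 0.0))]
near_rejected = too_close(oxygen, near_h)
far_rejected = too_close(oxygen, far_h)
```

A check of this kind is purely geometric, so it needs no energy evaluation; a real crystal generator would also have to test contacts with periodic images of each molecule.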
- Award ID(s): 2131944
- PAR ID: 10610989
- Publisher / Repository: ChemRxiv
- Date Published:
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- The two-step nucleation (TSN) theory and crystal structure prediction (CSP) techniques are two disjointed yet popular methods to predict nucleation rate and crystal structure, respectively. The TSN theory is a well-established mechanism to describe the nucleation of a wide range of crystalline materials in different solvents. However, it has never been expanded to predict the crystal structure or polymorphism. On the contrary, the existing CSP techniques only empirically account for the solvent effects. As a result, the TSN theory and CSP techniques continue to evolve as separate methods to predict two essential attributes of nucleation – rate and structure. Here we bridge this gap and show for the first time how a crystal structure is formed within the framework of TSN theory. A sequential desolvation mechanism is proposed in TSN, where the first step involves partial desolvation to form dense clusters followed by selective desolvation of functional groups directing the formation of crystal structure. We investigate the effect of the specific interaction on the degree of solvation around different functional groups of glutamic acid molecules using molecular simulations. The simulated energy landscape and activation barriers at increasing supersaturations suggest sequential and selective desolvation. We validate computationally and experimentally that the crystal structure formation and polymorph selection are due to a previously unrecognized consequence of supersaturation-driven asymmetric desolvation of molecules.
- An inexpensive and reliable method for molecular crystal structure predictions (CSPs) has been developed. The new CSP protocol starts from a two-dimensional graph of the crystal's monomer(s) and utilizes no experimental information. Using results of quantum mechanical calculations for molecular dimers, an accurate two-body, rigid-monomer ab initio-based force field (aiFF) for the crystal is developed. Since CSPs with aiFFs are essentially as expensive as with empirical FFs, tens of thousands of plausible polymorphs generated by the crystal packing procedures can be optimized. Here we show the robustness of this protocol, which found the experimental crystal within the 20 most stable predicted polymorphs for each of the 15 investigated molecules. The ranking was further refined by performing periodic density-functional theory (DFT) plus dispersion correction (pDFT+D) calculations for these 20 top-ranked polymorphs, resulting in the experimental crystal ranked as number one for all the systems studied (and the second polymorph, if known, ranked in the top few). Alternatively, the polymorphs generated can be used to improve aiFFs, which also leads to rank one predictions. The proposed CSP protocol should result in aiFFs replacing empirical FFs in CSP research.
- A molecular crystal structure prediction (CSP) protocol used in the seventh blind test is presented. The seventh blind test was divided into two stages and included seven targets, with crystals containing from one to three molecules in asymmetric units, monomers built of up to 100 atoms, and all targets containing monomers with flexible degrees of freedom. Some targets were cocrystals and one target was a salt. These diverse targets were treated using a CSP protocol starting from finding the global and local minima conformations of the target molecule. Subsequently, an ab initio two-body rigid-monomer six-dimensional force field (aiFF) was developed for the global-minimum conformer. These aiFFs were then used in CSPs consisting of packing and lattice-energy minimization stages. Flexible-monomer CSPs were used for some targets. To describe the intramonomer FF, either generic empirical FFs or reparametrized FFs of this type were used, with some parameters fitted to ab initio energies of monomers in the latter case. A novel packing procedure was applied for two targets in stage 1. The success rate in the structure generation stage was 15% in submission phase and 54% in post-submission phase, while the corresponding values in the structure rating stage were 33% and 89%. We conclude that the inexpensive conformer-based approach with rigid-monomer CSPs can be recommended for investigations of crystals with flexible monomers. An advantage of this protocol is that it is fully based on first-principles quantum mechanics and generates tailor-made FFs suitable for use in subsequent molecular dynamics simulations investigating temperature-dependent effects. However, empirical intramonomer FFs reparametrized using ab initio data are not yet adequate for CSPs.
- The goal of molecular crystal structure prediction (CSP) is to find all the plausible polymorphs for a given molecule. This requires performing global optimization over a high-dimensional search space. Genetic algorithms (GAs) perform global optimization by starting from an initial population of structures and generating new candidate structures by breeding the fittest structures in the population. Typically, the fitness function is based on relative lattice energies, such that structures with lower energies have a higher probability of being selected for mating. GAs may be adapted to perform multi-modal optimization by using evolutionary niching methods that support the formation of several stable subpopulations and suppress the over-sampling of densely populated regions. Evolutionary niching is implemented in the GAtor molecular crystal structure prediction code by using techniques from machine learning to dynamically cluster the population into niches of structural similarity. A cluster-based fitness function is constructed such that structures in less populated clusters have a higher probability of being selected for breeding. Here, the effects of evolutionary niching are investigated for the crystal structure prediction of 1,3-dibromo-2-chloro-5-fluorobenzene. Using the cluster-based fitness function increases the success rate of generating the experimental structure and additional low-energy structures with similar packing motifs.
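The cluster-based fitness idea in the last abstract above can be sketched in a few lines. The version below uses a simple fitness-sharing rule: an energy-based fitness (lowest energy scores highest) is divided by the population of the structure's cluster, so members of sparsely populated niches are more likely to be picked. The exact functional form used in GAtor is not given in the abstract, so the formulas, function names, and example data here are assumptions chosen for illustration.

```python
from collections import Counter


def energy_fitness(energies):
    """Map energies to [0, 1]; the lowest-energy structure gets fitness 1."""
    e_min, e_max = min(energies), max(energies)
    if e_max == e_min:
        return [1.0 for _ in energies]
    return [(e_max - e) / (e_max - e_min) for e in energies]


def cluster_fitness(energies, labels):
    """Divide each energy fitness by its cluster size (assumed sharing rule)."""
    counts = Counter(labels)
    base = energy_fitness(energies)
    return [f / counts[label] for f, label in zip(base, labels)]


def selection_probabilities(fitness):
    """Normalize fitness values into selection probabilities."""
    total = sum(fitness)
    if total == 0:
        return [1.0 / len(fitness) for _ in fitness]
    return [f / total for f in fitness]


# Made-up relative energies and cluster labels, for illustration only.
energies = [0.0, 1.0, 2.0, 1.0, 4.0]
labels = ["a", "a", "a", "b", "b"]
probs = selection_probabilities(cluster_fitness(energies, labels))
```

In this example the second and fourth structures have the same energy, but the fourth belongs to the smaller cluster and therefore receives the higher selection probability.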
 An official website of the United States government