Active Learning for Adsorption Simulations: Evaluation, Criteria Analysis, and Recommendations for Metal–Organic Frameworks

Osaro, Etinosa; Mukherjee, Krishnendu; Colón, Yamil J.

doi:10.1021/acs.iecr.3c01589

High-throughput molecular simulations and machine learning (ML) have been used to screen large numbers of metal−organic frameworks (MOFs) for adsorption applications. Grand canonical Monte Carlo (GCMC) simulations have proven effective in calculating the adsorption capacity at given pressures and temperatures, but they can be computationally expensive. Although ML models can be resource-efficient, they can require large datasets, creating a need for algorithms that can efficiently characterize adsorption; active learning (AL) can play an important role in this regard. In this work, we use Gaussian process regression (GPR) to model pure-component adsorption of nitrogen at 77 K from 10⁻⁵ to 1 bar, methane at 298 K from 10⁻⁵ to 100 bar, carbon dioxide at 298 K from 10⁻⁵ to 100 bar, and hydrogen at 77 K from 10⁻⁵ to 100 bar on PCN-61, MgMOF-74, DUT-32, DUT-49, MOF-177, NU-800, UiO-66, ZIF-8, IRMOF-1, IRMOF-10, and IRMOF-16. The GPR model must first be trained on an initial dataset, the prior, and, to evaluate AL, we use three different prior selection schemes. Each prior is then updated with a sampling point selected from the GP model uncertainties. This protocol continues until a maximum GPR relative error of 2% is attained. We recommend the best prior selection scheme for the 44 adsorbate−adsorbent pairs, primarily on the basis of the mean absolute error and the total number of points required for the model to converge. To further evaluate the AL framework, we apply the BET consistency criteria to the simulated and GP nitrogen isotherms and compare the resulting surface areas.
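
The protocol described above (fit a GP to a prior, sample where the GP is least certain, refit, stop at a 2% threshold) can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: a Langmuir curve (`toy_uptake`) stands in for a GCMC simulation, the kernel is a squared exponential on log10(pressure) with fixed rather than optimized hyperparameters, the prior is three hypothetical points (lowest, middle, and highest pressure), and the stopping rule compares the largest predictive standard deviation with the largest sampled uptake, since the abstract does not state how the relative error is computed.

```python
import math

def toy_uptake(log10_p, q_max=10.0, b=5.0):
    """Hypothetical stand-in for one GCMC run: Langmuir uptake at 10**log10_p bar."""
    p = 10.0 ** log10_p
    return q_max * b * p / (1.0 + b * p)

def rbf(a, b, length=1.0):
    """Unit-variance squared-exponential kernel on log10(pressure)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    y = []
    for i, bi in enumerate(b):
        y.append((bi - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def backward_sub(L, y):
    """Solve L^T x = y for lower-triangular L."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(xs, ys, grid, jitter=1e-6):
    """GP mean and standard deviation on grid (fixed, not optimized, hyperparameters)."""
    mean = sum(ys) / len(ys)
    amp = sum((y - mean) ** 2 for y in ys) / len(ys)  # signal variance taken from the data
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = backward_sub(L, forward_sub(L, [y - mean for y in ys]))
    mu, sd = [], []
    for g in grid:
        k = [rbf(g, x) for x in xs]
        v = forward_sub(L, k)
        mu.append(mean + sum(ki * ai for ki, ai in zip(k, alpha)))
        sd.append(math.sqrt(amp * max(1.0 - sum(vi * vi for vi in v), 0.0)))
    return mu, sd

def active_learning(grid, prior_idx, tol=0.02):
    """Add the most uncertain grid point until the relative uncertainty drops below tol."""
    sampled = list(prior_idx)
    ys = [toy_uptake(grid[i]) for i in sampled]
    while True:
        mu, sd = gp_posterior([grid[i] for i in sampled], ys, grid)
        rel = max(sd) / max(ys)  # assumed stopping metric, see the lead-in
        remaining = [i for i in range(len(grid)) if i not in sampled]
        if rel < tol or not remaining:
            return sampled, mu, rel
        nxt = max(remaining, key=lambda i: sd[i])  # uncertainty-based sampling
        sampled.append(nxt)
        ys.append(toy_uptake(grid[nxt]))

grid = [-5.0 + 0.25 * i for i in range(29)]  # log10(p / bar) from -5 to 2
sampled, mu, rel = active_learning(grid, prior_idx=[0, 14, 28])
mae = sum(abs(m - toy_uptake(g)) for m, g in zip(mu, grid)) / len(grid)
print(f"{len(sampled)} of {len(grid)} points sampled, "
      f"relative uncertainty {rel:.4f}, MAE {mae:.4f}")
```

The two printed quantities, the number of sampled points and the mean absolute error against the reference curve, correspond to the two measures the abstract names for comparing prior selection schemes; changing `prior_idx` is the simplest way to experiment with a different hypothetical prior.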
