Discrete Feature Representations of CHO Reaction Mechanisms as Quasireaction Subgraphs

Rappoport

doi:10.5281/zenodo.7905294

Abstract

This data set contains 194778 quasireaction subgraphs extracted from CHO transition networks with 2-6 non-hydrogen atoms (CxHyOz, 2 <= x + z <= 6).

The complete table of subgraphs, including file locations, is in the file CHO-6-atoms-subgraphs.csv. The subgraphs are stored in GraphML format (http://graphml.graphdrawing.org) and compressed with bzip2. All subgraphs are undirected and unweighted. The reactant and product nodes (initial and final) are labeled in the "type" node attribute. The nodes are represented as multi-molecule SMILES strings. The edges are labeled by the reaction rules in SMARTS representation. The forward and backward readings of a SMARTS string should be considered equivalent.

The generation and analysis of this data set are described in
D. Rappoport, Statistics and Bias-Free Sampling of Reaction Mechanisms from Reaction Network Models, 2023, submitted. Preprint at ChemRxiv, DOI: 10.26434/chemrxiv-2023-wltcr

Simulation parameters
- CHO networks were constructed using the polar bond break/bond formation rule set for CHO.
- High-energy nodes were excluded using the following rules: (i) more than 3 rings, (ii) triple and allene bonds in rings, (iii) double bonds at bridge atoms, (iv) double bonds in fused 3-membered rings.
- Neutral nodes were defined as containing only neutral molecules.
- Shortest path lengths were determined for all pairs of neutral nodes.
- Pairs of neutral nodes with shortest-path length > 8 were excluded.
- Additionally, pairs of neutral nodes connected only by shortest paths passing through additional neutral nodes (reducible paths) were excluded.

For background and additional details, see the paper above.

Other

This work was supported in part by the National Science Foundation under Grant No. CHE-2227112.
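The file format described in the abstract (GraphML compressed with bzip2) can be read with the Python standard library alone. The following is a minimal sketch, not the author's tooling. Only the "type" node attribute is documented in the record; the edge attribute name "rule", the use of SMILES strings as node ids, the "initial"/"final" attribute values, and the SAMPLE document itself are assumptions made for illustration and are not taken from the data set.

```python
import bz2
import xml.etree.ElementTree as ET

# Standard GraphML XML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical two-node subgraph, NOT taken from the data set and not
# chemically vetted. The attribute name "rule" and the SMILES node ids
# are assumptions; only the "type" node attribute is documented.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="type" attr.type="string"/>
  <key id="d1" for="edge" attr.name="rule" attr.type="string"/>
  <graph edgedefault="undirected">
    <node id="C=O.O"><data key="d0">initial</data></node>
    <node id="OCO"><data key="d0">final</data></node>
    <edge source="C=O.O" target="OCO">
      <data key="d1">[C:1]=[O:2]>>[C+:1][O-:2]</data>
    </edge>
  </graph>
</graphml>"""


def parse_graphml(data):
    """Parse GraphML bytes into (nodes, edges).

    nodes maps node id -> attribute dict;
    edges is a list of (source, target, attribute dict) tuples.
    """
    root = ET.fromstring(data)
    # Map GraphML key ids (e.g. "d0") to their declared attribute names.
    names = {k.get("id"): k.get("attr.name") for k in root.findall("g:key", NS)}

    def attrs(elem):
        # Collect <data> children, falling back to the raw key id
        # when no attribute name was declared for it.
        return {names.get(d.get("key"), d.get("key")): d.text
                for d in elem.findall("g:data", NS)}

    graph = root.find("g:graph", NS)
    nodes = {n.get("id"): attrs(n) for n in graph.findall("g:node", NS)}
    edges = [(e.get("source"), e.get("target"), attrs(e))
             for e in graph.findall("g:edge", NS)]
    return nodes, edges


def read_subgraph(path):
    """Read one bzip2-compressed GraphML file from disk."""
    with bz2.open(path, "rb") as fh:
        return parse_graphml(fh.read())


if __name__ == "__main__":
    nodes, edges = parse_graphml(SAMPLE.encode("utf-8"))
    # Reactant and product nodes are identified by the "type" attribute.
    endpoints = {n: a["type"] for n, a in nodes.items() if "type" in a}
    print(endpoints)
    print(edges)
```

Because the subgraphs are undirected, the order of source and target in each edge tuple carries no meaning, which matches the statement that forward and backward readings of a SMARTS rule are equivalent. File paths to pass to `read_subgraph` would come from the CHO-6-atoms-subgraphs.csv table, whose column names are not specified in this record.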
