Abstract Computational modeling of chemical and biological systems at atomic resolution is a crucial tool in the chemist’s toolset. The use of computer simulations requires a balance between cost and accuracy: quantum-mechanical methods provide high accuracy but are computationally expensive and scale poorly to large systems, while classical force fields are cheap and scalable, but lack transferability to new systems. Machine learning can be used to achieve the best of both approaches. Here we train a general-purpose neural network potential (ANI-1ccx) that approaches CCSD(T)/CBS accuracy on benchmarks for reaction thermochemistry, isomerization, and drug-like molecular torsions. This is achieved by training a network on DFT data and then using transfer learning techniques to retrain it on a dataset of gold-standard QM calculations (CCSD(T)/CBS) that optimally spans chemical space. The resulting potential is broadly applicable to materials science, biology, and chemistry, and billions of times faster than CCSD(T)/CBS calculations.
Reducing the complexity of chemical networks via interpretable autoencoders
In many astrophysical applications, the cost of solving a chemical network represented by a system of ordinary differential equations (ODEs) grows significantly with the size of the network and can often represent a significant computational bottleneck, particularly in coupled chemo-dynamical models. Although standard numerical techniques and complex solutions tailored to thermochemistry can somewhat reduce the cost, machine learning algorithms have more recently begun to attack this challenge via data-driven dimensional reduction techniques. In this work, we present a new class of methods that combine machine learning techniques for reducing complex data sets (autoencoders), the optimization of multiparameter systems (standard backpropagation), and the robustness of well-established ODE solvers to explicitly incorporate time dependence. This new method allows us to find a compressed and simplified version of a large chemical network in a semiautomated fashion that can be solved with a standard ODE solver, while also enabling interpretability of the compressed, latent network. As a proof of concept, we tested the method on an astrophysically relevant chemical network with 29 species and 224 reactions, obtaining a reduced but representative network with only 5 species and 12 reactions, and an increase in speed by a factor of 65.
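The encode, integrate, decode pattern described in the abstract can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' method: the three-species chain A -> B -> C, the rate coefficients `K1` and `K2`, and the hand-built `encode`/`decode` pair (which exploits conservation of A + B + C instead of a trained autoencoder) are all hypothetical stand-ins. The small `rk4` helper plays the role of the standard ODE solver.

```python
def rk4(rhs, y0, dt, n_steps):
    # Classical fixed-step Runge-Kutta 4 integrator for dy/dt = rhs(y).
    y = list(y0)
    for _ in range(n_steps):
        k1 = rhs(y)
        k2 = rhs([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)])
        k3 = rhs([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)])
        k4 = rhs([yi + dt * ki for yi, ki in zip(y, k3)])
        y = [yi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
    return y

K1, K2 = 1.0, 0.5  # hypothetical rate coefficients for A -> B -> C

def full_rhs(y):
    # Full network: three species, two reactions.
    a, b, c = y
    return [-K1 * a, K1 * a - K2 * b, K2 * b]

def encode(y):
    # Hand-built encoder (assumption): keep A and B as the latent variables.
    return [y[0], y[1]]

def latent_rhs(z):
    # Reduced network: the latent variables evolve without reference to C.
    a, b = z
    return [-K1 * a, K1 * a - K2 * b]

def decode(z, total):
    # Hand-built decoder (assumption): recover C from conservation of A + B + C.
    return [z[0], z[1], total - z[0] - z[1]]

y0 = [1.0, 0.0, 0.0]
total = sum(y0)
y_full = rk4(full_rhs, y0, 0.01, 500)
y_reduced = decode(rk4(latent_rhs, encode(y0), 0.01, 500), total)
```

In this toy case the reduction is exact, so `y_full` and `y_reduced` agree to rounding error. A learned autoencoder would generally trade some accuracy for a much larger compression, as in the 29-to-5 species reduction reported above.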
- Award ID(s): 1910106
- PAR ID: 10428845
- Date Published:
- Journal Name: Astronomy & Astrophysics
- Volume: 668
- ISSN: 0004-6361
- Page Range / eLocation ID: A139
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Residual neural networks can be viewed as the forward Euler discretization of an Ordinary Differential Equation (ODE) with a unit time step. This has recently motivated researchers to explore other discretization approaches and train ODE-based networks. However, an important challenge of neural ODEs is their prohibitive memory cost during gradient backpropagation. Recently, a method proposed in arXiv:1806.07366 claimed that this memory overhead can be reduced from O(LNt), where Nt is the number of time steps, down to O(L) by solving the forward ODE backwards in time, where L is the depth of the network. However, we will show that this approach may lead to several problems: (i) it may be numerically unstable for ReLU/non-ReLU activations and general convolution operators, and (ii) the proposed optimize-then-discretize approach may lead to divergent training due to inconsistent gradients for small time step sizes. We discuss the underlying problems, and to address them we propose ANODE, a neural ODE framework which avoids the numerical instability problems noted above. ANODE has a memory footprint of O(L) + O(Nt), with the same computational cost as reversing the ODE solve. Furthermore, we discuss a memory-efficient algorithm which can further reduce this footprint at the cost of additional computation. We show results on CIFAR-10/100 datasets using ResNet and SqueezeNext neural networks.
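The correspondence between a residual update and a forward Euler step, and the fact that stepping the discretized ODE backwards need not recover the input, can be sketched with a scalar toy problem. This is an illustration under stated assumptions, not the ANODE implementation: the linear function `f` and the decay rate `LAMBDA` are hypothetical stand-ins for a learned layer.

```python
import math

LAMBDA = 0.5  # hypothetical decay rate, chosen only for illustration

def f(x):
    # Stand-in for a learned layer: right-hand side of dx/dt = -LAMBDA * x.
    return -LAMBDA * x

def resnet_forward(x, depth):
    # Residual update x_{l+1} = x_l + f(x_l), applied `depth` times.
    for _ in range(depth):
        x = x + f(x)
    return x

def forward_euler(x, dt, n_steps):
    # Forward Euler: x_{k+1} = x_k + dt * f(x_k).
    for _ in range(n_steps):
        x = x + dt * f(x)
    return x

def reverse_euler(x, dt, n_steps):
    # Step the same ODE backwards in time, starting from the output.
    for _ in range(n_steps):
        x = x - dt * f(x)
    return x

x0 = 1.0
out_resnet = resnet_forward(x0, depth=4)
out_euler = forward_euler(x0, dt=1.0, n_steps=4)
recovered = reverse_euler(out_euler, dt=1.0, n_steps=4)
exact = x0 * math.exp(-LAMBDA * 4.0)
```

Here `out_resnet` and `out_euler` coincide, which is the unit-time-step equivalence. The backward solve returns about 0.316 instead of the original input 1.0, a simple example of how activations reconstructed in reverse can differ from those of the forward pass. The general instability analysis for realistic layers is the subject of the paper itself and is not reproduced by this toy.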
Abstract This study introduces a novel convolutional neural network (CNN)‐based approach for structural health monitoring (SHM) that exploits a form of measured compressed response data through transfer learning (TL)‐based techniques. The implementation of the proposed methodology allows damage identification and localization within a realistic large‐scale system. To validate the proposed method, first, a well‐known benchmark model is numerically simulated. Using acceleration response histories, as well as compressed response data in terms of discrete histograms, CNN models are trained, and the robustness of the CNN architectures is evaluated. Finally, pretrained CNNs are fine‐tuned to be adaptable for three‐parameter, extremely compressed response data, based on the response mean, standard deviation, and a scale factor. The performance of each CNN implementation is assessed using training accuracy histories as well as confusion matrices, along with other performance metrics. In addition to the numerical study, the performance of the proposed method is demonstrated using experimental vibration response data for verification and validation. The results indicate that deep TL can be implemented effectively for SHM of similar structural systems with different types of sensors.
Abstract Stereoselective reactions have played a vital role in the emergence of life, evolution, human biology, and medicine. However, for a long time, most industrial and academic efforts followed a trial-and-error approach for asymmetric synthesis in stereoselective reactions. In addition, most previous studies have been qualitatively focused on the influence of steric and electronic effects on stereoselective reactions. Therefore, quantitatively understanding the stereoselectivity of a given chemical reaction is extremely difficult. As proof of principle, this paper develops a novel composite machine learning method for quantitatively predicting the enantioselectivity representing the degree to which one enantiomer is preferentially produced from the reactions. Specifically, machine learning methods that are widely used in data analytics, including Random Forest, Support Vector Regression, and LASSO, are utilized. In addition, the Bayesian optimization and permutation importance tests are provided for an in-depth understanding of reactions and accurate prediction. Finally, the proposed composite method approximates the key features of the available reactions by using Gaussian mixture models, which provide suitable machine learning methods for new reactions. The case studies using the real stereoselective reactions show that the proposed method is effective and provides a solid foundation for further application to other chemical reactions.
IEEE 802.15.4-based industrial wireless sensor-actuator networks (WSANs) have been widely deployed to connect sensors, actuators, and controllers in industrial facilities. Configuring an industrial WSAN to meet the application-specified quality of service (QoS) requirements is a complex process, which involves theoretical computation, simulation, and field testing, among other tasks. As industrial wireless networks become increasingly hierarchical, heterogeneous, and complex, many research efforts have been made to apply wireless simulations and advanced machine learning techniques for network configuration. Unfortunately, our study shows that the network configuration model generated by the state-of-the-art method decays quickly over time. To address this issue, we develop a MEta-learning based Runtime Adaptation (MERA) method that efficiently adapts network configuration models for industrial WSANs at runtime. Under MERA, the parameters of the network configuration model are explicitly trained such that a small number of optimization steps with only a few new measurements will produce good generalization performance after the network condition changes. We also develop a data sampling method to reduce the measurements required by MERA at runtime without sacrificing its performance. Experimental results show that MERA achieves higher prediction accuracy with fewer physical measurements, less computation time, and longer adaptation intervals compared to a state-of-the-art baseline.

