This content will become publicly available on May 1, 2025
- NSF-PAR ID: 10510080
- Publisher / Repository: Nature Reviews Physics
- Date Published:
- Journal Name: Nature Reviews Physics
- Volume: 6
- Issue: 5
- ISSN: 2522-5820
- Page Range / eLocation ID: 310 to 319
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Modern machine learning has been on the rise in many scientific domains, such as acoustics. Many scientific problems face challenges with limited data, which prevents the use of many powerful machine learning strategies. In response, the physics of wave propagation can be exploited to reduce the amount of data necessary and to improve the performance of machine learning techniques. To address this need, we present a physics-informed machine learning framework, known as wave-informed regression, to extract dispersion curves from guided-wave wavefield data in non-homogeneous media. Wave-informed regression blends matrix factorization with known wave physics by borrowing results from optimization theory. We briefly derive the algorithm and discuss a signal-processing-based interpretability aspect of it that aids in extracting dispersion curves for non-homogeneous media. We show our results on non-homogeneous media, where the dispersion curves change as a function of space, and demonstrate the ability of wave-informed regression to extract spatially local dispersion curves.
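The general idea behind this kind of physics-regularized factorization can be sketched briefly. The following is a minimal illustration, not the authors' implementation: wavefield data are factored into spatial modes and temporal coefficients, with a penalty that pushes each spatial mode toward solutions of a Helmholtz-type equation. The wavenumber `k`, the finite-difference operator, and the regularization weight `lam` are assumptions made for the sketch.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def wave_informed_factorization(X, rank, k, dx, lam=1.0, iters=50):
    """Alternating least squares for X ~ D @ H with a wave-physics penalty
    lam * ||(L + k^2 I) D||_F^2 on the spatial modes D, where L is a
    finite-difference second-derivative operator on the spatial grid."""
    n_space, _ = X.shape
    rng = np.random.default_rng(0)
    D = rng.standard_normal((n_space, rank))

    # Second-derivative stencil (1, -2, 1) / dx^2.
    L = (np.diag(-2.0 * np.ones(n_space))
         + np.diag(np.ones(n_space - 1), 1)
         + np.diag(np.ones(n_space - 1), -1)) / dx**2
    A = L + (k**2) * np.eye(n_space)   # modes obeying d'' + k^2 d = 0 are unpenalized

    for _ in range(iters):
        # Temporal coefficients: ordinary least squares given the current modes.
        H = np.linalg.lstsq(D, X, rcond=None)[0]
        # Spatial modes: normal equations lam * A^T A D + D (H H^T) = X H^T,
        # a Sylvester equation in D.
        D = solve_sylvester(lam * (A.T @ A), H @ H.T, X @ H.T)
    return D, H

# Toy usage: a noisy standing-wave field on a 1-D grid.
x = np.linspace(0, 1, 100)[:, None]
t = np.linspace(0, 1, 80)[None, :]
k_true = 6 * np.pi
X = np.sin(k_true * x) * np.cos(2 * np.pi * 5 * t) \
    + 0.01 * np.random.default_rng(1).standard_normal((100, 80))
D, H = wave_informed_factorization(X, rank=2, k=k_true, dx=x[1, 0] - x[0, 0])
```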
- Machine learning approaches have recently been applied to the study of various problems in physics. Most of these studies focus on interpreting data generated by conventional numerical methods or data from existing experimental databases. An interesting question is whether it is possible to use a machine learning approach, in particular a neural network, to solve the many-body problem. In this paper, we present a neural network solver for the single impurity Anderson model, the paradigm of an interacting quantum problem in small clusters. We demonstrate that the neural-network-based solver provides quantitatively accurate results for the spectral function, as compared with the exact diagonalization method. This opens the possibility of using the neural network approach as an impurity solver for other many-body numerical approaches, such as dynamical mean field theory.
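As a rough sketch of what such a solver could look like in code (the input parametrization, the frequency grid, and the placeholder `exact_diag_spectrum` function below are illustrative assumptions, not the paper's setup), a feed-forward regressor can be trained to map impurity-model parameters to a spectral function sampled on a grid, with exact-diagonalization results serving as training targets:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

omega = np.linspace(-5.0, 5.0, 200)            # frequency grid for A(w)

def exact_diag_spectrum(U, V, eps_d):
    """Placeholder for an exact-diagonalization solver returning A(w)."""
    # Toy Lorentzian peaks standing in for the true spectral function.
    gamma = 0.1 + V**2
    peaks = [eps_d, eps_d + U]
    A = sum(gamma / np.pi / ((omega - p)**2 + gamma**2) for p in peaks)
    return A / np.trapz(A, omega)               # normalize to unit weight

# Build a training set over a range of model parameters (U, V, eps_d).
rng = np.random.default_rng(1)
params = rng.uniform([1.0, 0.2, -2.0], [8.0, 1.0, 0.0], size=(500, 3))
targets = np.array([exact_diag_spectrum(*p) for p in params])

solver = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=2000, random_state=0)
solver.fit(params, targets)                     # learn parameters -> A(w)

A_pred = solver.predict(params[:1])             # predicted spectral function
```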
- Although distributed machine learning methods can speed up the training of large deep neural networks, communication cost has become a non-negligible bottleneck that constrains performance. To address this challenge, gradient-compression-based communication-efficient distributed learning methods were designed to reduce the communication cost, and more recently local error feedback was incorporated to compensate for the corresponding performance loss. However, in this paper we show that a new "gradient mismatch" problem arises from local error feedback in centralized distributed training and can lead to degraded performance compared with full-precision training. To solve this critical problem, we propose two novel techniques, 1) step ahead and 2) error averaging, with rigorous theoretical analysis. Both our theoretical and empirical results show that the new methods handle the "gradient mismatch" problem. The experimental results show that, with common gradient compression schemes, we can even train faster (in terms of training epochs) than both full-precision training and local error feedback, without performance loss.
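For context, below is a minimal single-worker sketch of the baseline being analyzed, top-k gradient compression with a local error-feedback buffer; the paper's "step ahead" and "error averaging" corrections are not reproduced here, and the toy objective and hyperparameters are assumptions.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def train(grad_fn, w, lr=0.1, k=2, steps=100):
    e = np.zeros_like(w)                  # local error-feedback buffer
    for _ in range(steps):
        g = grad_fn(w)
        c = top_k(g + e, k)               # compress the error-compensated gradient
        e = (g + e) - c                   # remember what the compressor dropped
        w = w - lr * c                    # apply only the transmitted part
    return w

# Toy quadratic objective f(w) = 0.5 ||w - w_star||^2.
w_star = np.array([1.0, -2.0, 3.0, 0.5])
w = train(lambda w: w - w_star, np.zeros(4))
```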
- Modern design, control, and optimization often require multiple expensive simulations of highly nonlinear stiff models. These costs can be amortized by training a cheap surrogate of the full model, which can then be used repeatedly. Here we present a general data-driven method, the continuous-time echo state network (CTESN), for generating surrogates of nonlinear ordinary differential equations with dynamics at widely separated timescales. We empirically demonstrate the ability to accelerate a physically motivated, scalable model of a heating system by 98x while maintaining relative error within 0.2%. We showcase the ability of this surrogate to accurately handle highly stiff systems, which have been shown to cause training failures for common surrogate methods such as physics-informed neural networks (PINNs), long short-term memory (LSTM) networks, and discrete echo state networks (ESNs). We show that our model captures fast transients as well as slow dynamics, and demonstrate that fixed-time-step machine learning techniques are unable to adequately capture this multi-rate behavior. Together, this provides compelling evidence for the ability of CTESN surrogates to predict and accelerate highly stiff dynamical systems that cannot be handled directly by previous scientific machine learning techniques.
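A minimal sketch of the CTESN idea follows (an illustration under assumed choices of reservoir size, spectral radius, and a toy stiff system, not the paper's implementation): a fixed random reservoir ODE is driven by the full model's solution, and only a linear readout is fit by least squares.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
N, d = 200, 2                                     # reservoir size, system dimension

# Toy stiff-ish system standing in for the "expensive" full model.
def full_model(t, u):
    return [-50.0 * (u[0] - np.cos(t)), u[0] - u[1]]

t_eval = np.linspace(0.0, 10.0, 500)
sol = solve_ivp(full_model, (0, 10), [1.0, 0.0], t_eval=t_eval, method="BDF")
u_of_t = lambda t: np.array([np.interp(t, t_eval, sol.y[i]) for i in range(d)])

# Fixed random reservoir, scaled to a modest spectral radius.
A = rng.standard_normal((N, N))
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))
W_in = rng.standard_normal((N, d))

def reservoir(t, r):
    return np.tanh(A @ r + W_in @ u_of_t(t))     # reservoir driven by the full solution

res = solve_ivp(reservoir, (0, 10), np.zeros(N), t_eval=t_eval)

# Linear readout: least-squares map from reservoir states to the model solution.
W_out = np.linalg.lstsq(res.y.T, sol.y.T, rcond=None)[0]
u_surrogate = res.y.T @ W_out                     # surrogate prediction on t_eval
```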
- The learning speed of feed-forward neural networks is notoriously slow and has presented a bottleneck in deep learning applications for several decades. For instance, gradient-based learning algorithms, which are used extensively to train neural networks, tend to work slowly when all of the network parameters must be iteratively tuned. To counter this, both researchers and practitioners have tried introducing randomness to reduce the learning requirement. Based on the original construction of Igelnik and Pao, single-layer neural networks with random input-to-hidden-layer weights and biases have seen success in practice, but the necessary theoretical justification is lacking. In this study, we begin to fill this theoretical gap. We provide a (corrected) rigorous proof that the Igelnik and Pao construction is a universal approximator for continuous functions on compact domains, with approximation error squared decaying asymptotically like O(1/n) for the number n of network nodes. We then extend this result to the non-asymptotic setting, using a concentration inequality for Monte Carlo integral approximations to prove that one can achieve any desired approximation error with high probability provided n is sufficiently large. We further adapt this randomized neural network architecture to approximate functions on smooth, compact submanifolds of Euclidean space, providing theoretical guarantees in both the asymptotic and non-asymptotic forms. Finally, we illustrate our results on manifolds with numerical experiments.
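The construction discussed above lends itself to a short sketch: hidden weights and biases are drawn at random once, only the output layer is fit by least squares, and the squared error can be watched as the number of nodes n grows. The target function, activation, and sampling ranges below are illustrative assumptions.

```python
import numpy as np

def fit_random_network(x_train, y_train, n_nodes, rng):
    W = rng.uniform(-5.0, 5.0, size=(n_nodes, 1))        # random input-to-hidden weights
    b = rng.uniform(-5.0, 5.0, size=n_nodes)             # random biases
    H = np.tanh(x_train @ W.T + b)                        # hidden-layer features
    c, *_ = np.linalg.lstsq(H, y_train, rcond=None)       # only the output weights are trained
    return W, b, c

def predict(x, W, b, c):
    return np.tanh(x @ W.T + b) @ c

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 400).reshape(-1, 1)
y = np.sin(4 * np.pi * x[:, 0])                           # target function on a compact domain

for n in (10, 50, 250):
    W, b, c = fit_random_network(x, y, n, rng)
    err = np.mean((predict(x, W, b, c) - y) ** 2)
    print(f"n = {n:4d}  mean squared error = {err:.2e}")
```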