Abstract We develop a general theory of flows in the space of Riemannian metrics induced by neural network (NN) gradient descent. This is motivated in part by recent advances in approximating Calabi–Yau metrics with NNs and is enabled by parallel advances in understanding flows in the space of NNs. We derive the corresponding metric flow equations, which are governed by a metric neural tangent kernel (NTK), a complicated, non-local object that evolves in time. However, many architectures admit an infinite-width limit in which the kernel becomes fixed and the dynamics simplify. Additional assumptions can induce locality in the flow, which allows for the realization of Perelman’s formulation of Ricci flow that was used to resolve the 3D Poincaré conjecture. We demonstrate that such fixed-kernel regimes lead to poor learning of numerical Calabi–Yau metrics, as is expected since the associated NNs do not learn features. Conversely, we demonstrate that well-learned numerical metrics at finite width exhibit an evolving metric NTK, associated with feature learning. Our theory of NN metric flows therefore explains why NNs are better at learning Calabi–Yau metrics than fixed-kernel methods, such as the Ricci flow.
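As a minimal, self-contained illustration of the central object above (not the paper's code), the empirical NTK of a scalar-output network is the Gram matrix of its parameter gradients, Θ(x, x′) = ⟨∂f/∂θ(x), ∂f/∂θ(x′)⟩. The sketch below computes it for a one-hidden-layer network with analytic gradients; all names and parameter values are illustrative:

```python
import numpy as np

def init_params(n_hidden, rng):
    # One-hidden-layer scalar network: f(x) = v . tanh(w*x + b) / sqrt(n)
    return {
        "w": rng.standard_normal(n_hidden),
        "b": rng.standard_normal(n_hidden),
        "v": rng.standard_normal(n_hidden),
    }

def jacobian(params, xs):
    # Rows are df/dtheta evaluated at each input x
    n = len(params["w"])
    rows = []
    for x in xs:
        h = np.tanh(params["w"] * x + params["b"])
        dv = h / np.sqrt(n)                              # df/dv
        dw = params["v"] * (1 - h**2) * x / np.sqrt(n)   # df/dw
        db = params["v"] * (1 - h**2) / np.sqrt(n)       # df/db
        rows.append(np.concatenate([dw, db, dv]))
    return np.stack(rows)

def empirical_ntk(params, xs):
    # Theta(x, x') = <df/dtheta(x), df/dtheta(x')>, a Gram matrix
    J = jacobian(params, xs)
    return J @ J.T

rng = np.random.default_rng(0)
xs = np.linspace(-1.0, 1.0, 5)
K = empirical_ntk(init_params(512, rng), xs)
print(K.shape)  # (5, 5): symmetric and positive semi-definite
```

At large width this matrix concentrates around a fixed kernel; an evolving empirical NTK during training is the signature of feature learning discussed in the abstract.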
Rigor with machine learning from field theory to the Poincaré conjecture
Despite their successes, machine learning techniques are often stochastic, error-prone and black-box. How, then, could they be used in fields such as theoretical physics and pure mathematics, for which error-free results and deep understanding are a must? In this Perspective, we discuss techniques for obtaining zero-error results with machine learning, with a focus on theoretical physics and pure mathematics. Non-rigorous methods can enable rigorous results via conjecture generation or verification by reinforcement learning. We survey applications of these techniques for rigor ranging from string theory to the smooth 4D Poincaré conjecture in low-dimensional topology. We also discuss connections between machine learning theory and mathematics or theoretical physics, such as a new approach to field theory motivated by neural network theory, and a theory of Riemannian metric flows induced by neural network gradient descent, which encompasses Perelman’s formulation of the Ricci flow that was used to solve the 3D Poincaré conjecture.
- PAR ID: 10510080
- Publisher / Repository: Nature Reviews Physics
- Date Published:
- Journal Name: Nature Reviews Physics
- Volume: 6
- Issue: 5
- ISSN: 2522-5820
- Page Range / eLocation ID: 310 to 319
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Efficient real-time solvers for forward and inverse problems are essential in engineering and science applications. Machine learning surrogate models have emerged as promising alternatives to traditional methods, offering substantially reduced computational time. Nevertheless, these models typically demand extensive training datasets to achieve robust generalization across diverse scenarios. While physics-based approaches can partially mitigate this data dependency and ensure physics-interpretable solutions, addressing scarce-data regimes remains a challenge. Both purely data-driven and physics-based machine learning approaches demonstrate severe overfitting when trained with insufficient data. We propose a novel model-constrained Tikhonov autoencoder neural network framework, called TAEN, capable of learning both forward and inverse surrogate models from a single arbitrary observational sample. We develop comprehensive theoretical foundations, including forward and inverse inference error bounds, for the proposed approach in linear cases. For comparative analysis, we derive equivalent formulations for purely data-driven and model-constrained counterparts. At the heart of our approach is a data randomization strategy with theoretical justification, which functions as a generative mechanism for exploring the training data space, enabling effective training of both forward and inverse surrogate models even with a single observation, while regularizing the learning process. We validate our approach through extensive numerical experiments on two challenging inverse problems: 2D heat conductivity inversion and initial condition reconstruction for the time-dependent 2D Navier–Stokes equations. Results demonstrate that TAEN achieves accuracy comparable to traditional Tikhonov solvers and numerical forward solvers for inverse and forward problems, respectively, while delivering orders-of-magnitude computational speedups.
-
Although distributed machine learning methods can speed up the training of large deep neural networks, communication cost has become a non-negligible bottleneck constraining performance. To address this challenge, gradient-compression-based communication-efficient distributed learning methods were designed to reduce the communication cost, and more recently local error feedback was incorporated to compensate for the corresponding performance loss. However, in this paper we show that a new "gradient mismatch" problem is raised by local error feedback in centralized distributed training and can lead to degraded performance compared with full-precision training. To solve this critical problem, we propose two novel techniques, 1) step ahead and 2) error averaging, with rigorous theoretical analysis. Both our theoretical and empirical results show that our new methods can handle the "gradient mismatch" problem. The experimental results show that, with common gradient compression schemes, we can train even faster than both full-precision training and local error feedback in terms of training epochs, and without performance loss.
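The baseline being analyzed, top-k gradient compression with local error feedback, can be sketched in a few lines: the gradient mass dropped by compression is stored locally and replayed at the next step, which is where the "gradient mismatch" enters. A hedged NumPy sketch for a single worker (not the authors' implementation; names are illustrative):

```python
import numpy as np

def topk_compress(g, k):
    # Keep the k largest-magnitude entries of g, zero the rest
    out = np.zeros_like(g)
    idx = np.argpartition(np.abs(g), -k)[-k:]
    out[idx] = g[idx]
    return out

def step_with_error_feedback(grad, error, k, lr):
    # Compensate this step's gradient with last step's compression residual
    corrected = grad + error
    compressed = topk_compress(corrected, k)
    new_error = corrected - compressed   # residual replayed at the next step
    return -lr * compressed, new_error

grad = np.array([0.5, -2.0, 0.1, 1.5])
update, err = step_with_error_feedback(grad, np.zeros(4), k=2, lr=0.1)
print(update)  # only the two largest-magnitude coordinates move
print(err)     # the dropped mass, carried into the next step
```

Because the replayed residual is one step stale, the update direction differs from the uncompressed gradient; the paper's "step ahead" and "error averaging" techniques target exactly this staleness.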
-
Modern machine learning has been on the rise in many scientific domains, such as acoustics. Many scientific problems face challenges with limited data, which prevents the use of many powerful machine learning strategies. In response, the physics of wave propagation can be exploited to reduce the amount of data necessary and improve the performance of machine learning techniques. Based on this need, we present a physics-informed machine learning framework, known as wave-informed regression, to extract dispersion curves from guided-wave wavefield data in non-homogeneous media. Wave-informed regression blends matrix factorization with known wave physics by borrowing results from optimization theory. We briefly derive the algorithm and discuss a signal-processing-based interpretability aspect of it, which aids in extracting dispersion curves for non-homogeneous media. We show our results on a non-homogeneous medium, where the dispersion curves change as a function of space. We demonstrate our ability to use wave-informed regression to extract spatially local dispersion curves.
-
Machine learning approaches have recently been applied to the study of various problems in physics. Most of these studies are focused on interpreting the data generated by conventional numerical methods or the data in an existing experimental database. An interesting question is whether it is possible to use a machine learning approach, in particular a neural network, for solving the many-body problem. In this paper, we present a neural network solver for the single impurity Anderson model, the paradigm of an interacting quantum problem in small clusters. We demonstrate that the neural-network-based solver provides quantitatively accurate results for the spectral function, as compared to the exact diagonalization method. This opens the possibility of utilizing the neural network approach as an impurity solver for other many-body numerical approaches, such as dynamical mean-field theory.
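The exact-diagonalization benchmark mentioned above can be illustrated on a toy two-site Anderson model (one impurity orbital hybridized with one bath orbital, per spin), built via a Jordan–Wigner construction, with the impurity spectral function obtained from the Lehmann representation. This is a hedged sketch with illustrative parameters, not the paper's solver:

```python
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
SM = np.array([[0.0, 1.0], [0.0, 0.0]])  # annihilation on one spin-orbital

def annihilator(j, L):
    # Jordan-Wigner string: c_j = Z x ... x Z x SM x I x ... x I
    op = np.eye(1)
    for k in range(L):
        op = np.kron(op, Z if k < j else (SM if k == j else I2))
    return op

# Orbitals: 0 = impurity-up, 1 = impurity-down, 2 = bath-up, 3 = bath-down
L = 4
c = [annihilator(j, L) for j in range(L)]
n = [cj.T @ cj for cj in c]  # number operators (real matrices, so c^dag = c.T)

eps_d, U, eps_b, V = -1.0, 2.0, 0.0, 0.5  # illustrative parameters
H = (eps_d * (n[0] + n[1]) + U * (n[0] @ n[1]) + eps_b * (n[2] + n[3])
     + V * (c[0].T @ c[2] + c[2].T @ c[0] + c[1].T @ c[3] + c[3].T @ c[1]))

evals, evecs = np.linalg.eigh(H)
E0, gs = evals[0], evecs[:, 0]

def spectral_function(omegas, eta=0.05):
    # Lehmann representation for the impurity-up orbital, Lorentzian-broadened
    A = np.zeros_like(omegas)
    for m in range(len(evals)):
        w_add = (evecs[:, m] @ c[0].T @ gs) ** 2   # electron addition weight
        w_rem = (evecs[:, m] @ c[0] @ gs) ** 2     # electron removal weight
        A += w_add * eta / np.pi / ((omegas - (evals[m] - E0)) ** 2 + eta**2)
        A += w_rem * eta / np.pi / ((omegas + (evals[m] - E0)) ** 2 + eta**2)
    return A

omegas = np.linspace(-10, 10, 2001)
A = spectral_function(omegas)
dw = omegas[1] - omegas[0]
print(A.sum() * dw)  # sum rule: should be close to 1
```

The sum rule follows from {c, c†} = 1; a neural-network solver of the kind described would be validated against this A(ω) on such small clusters before serving as an impurity solver in larger schemes.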