NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Geometry and convergence of natural policy gradient methods

https://doi.org/10.1007/s41884-023-00106-z

Müller, Johannes; Montúfar, Guido (June 2023, Information Geometry)

Abstract We study the convergence of several natural policy gradient (NPG) methods in infinite-horizon discounted Markov decision processes with regular policy parametrizations. For a variety of NPGs and reward functions we show that the trajectories in state-action space are solutions of gradient flows with respect to Hessian geometries, based on which we obtain global convergence guarantees and convergence rates. In particular, we show linear convergence for unregularized and regularized NPG flows with the metrics proposed by Kakade and Morimura and co-authors by observing that these arise from the Hessian geometries of conditional entropy and entropy respectively. Further, we obtain sublinear convergence rates for Hessian geometries arising from other convex functions like log-barriers. Finally, we interpret the discrete-time NPG methods with regularized rewards as inexact Newton methods if the NPG is defined with respect to the Hessian geometry of the regularizer. This yields local quadratic convergence rates of these methods for step size equal to the inverse penalization strength.
more » « less
Enumeration of Max-Pooling Responses with Generalized Permutohedra

https://doi.org/10.1007/s00026-025-00782-x

Escobar, Laura; Gallardo, Patricio; González_Anaya, Javier; González, José L; Montúfar, Guido; Morales, Alejandro H (September 2025, Annals of Combinatorics)

Free, publicly-accessible full text available September 24, 2026
Fisher–Rao Gradient Flows of Linear Programs and State-Action Natural Policy Gradients

https://doi.org/10.1137/24M1653422

Müller, Johannes; Çayci, Semih; Montúfar, Guido (June 2025, SIAM Journal on Optimization)

Free, publicly-accessible full text available June 30, 2026
On the Local Complexity of Linear Regions in Deep ReLU Networks

Patel, Niket; Montufar, Guido (May 2025, International Conference on Machine Learning (ICML) 2025)

Free, publicly-accessible full text available May 1, 2026
IMPLICIT BIAS OF MIRROR FLOW FOR SHALLOW NEURAL NETWORKS IN UNIVARIATE REGRESSION

Liang, Shuang; Montufar, Guido (January 2025, The Thirteenth International Conference on Learning Representations (ICLR 2025))

We examine the implicit bias of mirror flow in least squares error regression with wide and shallow neural networks. For a broad class of potential functions, we show that mirror flow exhibits lazy training and has the same implicit bias as ordinary gradient flow when the network width tends to infinity. For univariate ReLU networks, we characterize this bias through a variational problem in function space. Our analysis includes prior results for ordinary gradient flow as a special case and lifts limitations which required either an intractable adjustment of the training data or networks with skip connections. We further introduce scaled potentials and show that for these, mirror flow still exhibits lazy training but is not in the kernel regime. For univariate networks with absolute value activations, we show that mirror flow with scaled potentials induces a rich class of biases, which generally cannot be captured by an RKHS norm. A takeaway is that whereas the parameter initialization determines how strongly the curvature of the learned function is penalized at different locations of the input space, the scaled potential determines how the different magnitudes of the curvature are penalized.
more » « less
Free, publicly-accessible full text available January 22, 2026
DEMYSTIFYING TOPOLOGICAL MESSAGE-PASSING WITH RELATIONAL STRUCTURES: A CASE STUDY ON OVERSQUASHING IN SIMPLICIAL MESSAGE-PASSING

Taha, Diaaeldin; Chapman, James; Eidi, Marzieh; Devriendt, Karel; Montufar, Guido (January 2025, International Conference on Learning Representations 2025)

Topological deep learning (TDL) has emerged as a powerful tool for modeling higher-order interactions in relational data. However, phenomena such as over- squashing in topological message-passing remain understudied and lack theoreti- cal analysis. We propose a unifying axiomatic framework that bridges graph and topological message-passing by viewing simplicial and cellular complexes and their message-passing schemes through the lens of relational structures. This ap- proach extends graph-theoretic results and algorithms to higher-order structures, facilitating the analysis and mitigation of oversquashing in topological message- passing networks. Through theoretical analysis and empirical studies on simplicial networks, we demonstrate the potential of this framework to advance TDL.
more » « less
Free, publicly-accessible full text available January 22, 2026
IMPLICIT BIAS OF MIRROR FLOW FOR SHALLOW NEURAL NETWORKS IN UNIVARIATE REGRESSION

Liang, Shuang; Montufar, Guido (January 2025, International Conference on Learning Representations 2025)

We examine the implicit bias of mirror flow in least squares error regression with wide and shallow neural networks. For a broad class of potential functions, we show that mirror flow exhibits lazy training and has the same implicit bias as ordinary gradient flow when the network width tends to infinity. For univariate ReLU networks, we characterize this bias through a variational problem in function space. Our analysis includes prior results for ordinary gradient flow as a special case and lifts limitations which required either an intractable adjustment of the training data or networks with skip connections. We further introduce scaled potentials and show that for these, mirror flow still exhibits lazy training but is not in the kernel regime. For univariate networks with absolute value activations, we show that mirror flow with scaled potentials induces a rich class of biases, which generally cannot be captured by an RKHS norm. A takeaway is that whereas the parameter initialization determines how strongly the curvature of the learned function is penalized at different locations of the input space, the scaled potential determines how the different magnitudes of the curvature are penalized.
more » « less
Free, publicly-accessible full text available January 22, 2026
Implicit Bias of Mirror Flow for Shallow Neural Networks in Univariate Regression

Liang, Shuang; Montufar, Guido (January 2025, International Conference on Learning Representations (ICLR) 2025)

Free, publicly-accessible full text available January 22, 2026
Bounds for the smallest eigenvalue of the NTK for arbitrary spherical data of arbitrary dimension

Karhadkar, Kedar; Murray, Michael; Montufar, Guido (September 2024, 38th Conference on Neural Information Processing Systems (NeurIPS 2024))

Bounds on the smallest eigenvalue of the neural tangent kernel (NTK) are a key ingredient in the analysis of neural network optimization and memorization. How- ever, existing results require distributional assumptions on the data and are limited to a high-dimensional setting, where the input dimension d0 scales at least log- arithmically in the number of samples n. In this work we remove both of these requirements and instead provide bounds in terms of a measure of distance between data points: notably these bounds hold with high probability even when d0 is held constant versus n. We prove our results through a novel application of the hemisphere transform.
more » « less
Full Text Available
Benign overfitting in leaky ReLU networks with moderate input dimension

Karhadkar, Kedar; George, Erin; Murray, Michael; Montufar, Guido; Needell, Deanna (September 2024, 38th Conference on Neural Information Processing Systems (NeurIPS 2024))

The problem of benign overfitting asks whether it is possible for a model to perfectly fit noisy training data and still generalize well. We study benign overfitting in two- layer leaky ReLU networks trained with the hinge loss on a binary classification task. We consider input data that can be decomposed into the sum of a common signal and a random noise component, that lie on subspaces orthogonal to one another. We characterize conditions on the signal to noise ratio (SNR) of the model parameters giving rise to benign versus non-benign (or harmful) overfitting: in particular, if the SNR is high then benign overfitting occurs, conversely if the SNR is low then harmful overfitting occurs. We attribute both benign and non- benign overfitting to an approximate margin maximization property and show that leaky ReLU networks trained on hinge loss with gradient descent (GD) satisfy this property. In contrast to prior work we do not require the training data to be nearly orthogonal. Notably, for input dimension d and training sample size n, while results in prior work require d= !(n2 log n), here we require only d= ! (n).
more » « less
Full Text Available

« Prev Next »

Search for: All records