NSF PAR Search | NSF Public Access Repository

Faster margin maximization rates for generic and adversarially robust optimization methods

https://doi.org/10.1007/s10107-025-02283-4

Wang, Guanghui; Hu, Zihao; Gentile, Claudio; Muthukumar, Vidya; Abernethy, Jacob (October 2025, Mathematical Programming)

Abstract First-order optimization methods tend to inherently favor certain solutions over others when minimizing an underdetermined training objective that has multiple global optima. This phenomenon, known as implicit bias, plays a critical role in understanding the generalization capabilities of optimization algorithms. Recent research has revealed that in separable binary classification tasks gradient-descent-based methods exhibit an implicit bias for the $\ell_2$-maximal margin classifier. Similarly, generic optimization methods, such as mirror descent and steepest descent, have been shown to converge to maximal margin classifiers defined by alternative geometries. While gradient-descent-based algorithms provably achieve fast implicit bias rates, corresponding rates in the literature for generic optimization methods are relatively slow. To address this limitation, we present a series of state-of-the-art implicit bias rates for mirror descent and steepest descent algorithms. Our primary technique involves transforming a generic optimization algorithm into an online optimization dynamic that solves a regularized bilinear game, providing a unified framework for analyzing the implicit bias of various optimization methods. Our accelerated rates are derived by leveraging the regret bounds of online learning algorithms within this game framework. We then show the flexibility of this framework by analyzing the implicit bias in adversarial training, and again obtain significantly improved convergence rates.
Full Text Available
Task shift: From classification to regression in overparameterized linear models

LaBonte, Tyler; Lai, Kuo-Wei; Muthukumar, Vidya (June 2025, International Conference on Artificial Intelligence and Statistics)

Full Text Available
Estimating stationary mass, frequency by frequency

Nakul, Milind; Muthukumar, Vidya; Pananjady, Ashwin (June 2025, Conference on Learning Theory)

Full Text Available
On the unreasonable effectiveness of last-layer retraining

Hill, John C; LaBonte, Tyler; Zhang, Xinchen; Muthukumar, Vidya (March 2025, ICLR Workshop on Spurious Correlations and Shortcut Learning)

Full Text Available
The group robustness is in the details: Revisiting finetuning under spurious correlations

LaBonte, Tyler; Hill, John C; Zhang, Xinchen; Muthukumar, Vidya; Kumar, Abhishek (January 2025, Neural Information Processing Systems)

Full Text Available
Precise asymptotics of reweighted least-squares algorithms for linear diagonal networks

Kaushik, Chiraag; Romberg, Justin; Muthukumar, Vidya (January 2025, Neural Information Processing Systems)

Full Text Available
Your contrastive learning problem is secretly a distribution alignment problem

Chen, Zihao; Lin, Chi-Heng; Liu, Ran; Xiao, Jingyun; Dyer, Eva L (December 2024, Advances in Neural Information Processing Systems)

Full Text Available
Just Wing It: Near-optimal estimation of missing mass in a Markovian sequence

Pananjady, Ashwin; Muthukumar, Vidya; Thangaraj, Andrew (October 2024, Journal of Machine Learning Research)

Full Text Available
The Group Robustness is in the Details: Revisiting Finetuning Under Spurious Correlations

LaBonte, Tyler; Hill, John Collins; Zhang, Xinchen; Kumar, Abhishek; Muthukumar, Vidya (September 2024, Neural Information Processing Systems 2024)

Full Text Available
Precise asymptotics of reweighted least-squares algorithms for linear diagonal networks

Kaushik, Chiraag; Romberg, Justin; Muthukumar, Vidya (September 2024, Neural Information Processing Systems 2024)

Full Text Available