Search for: All records
Total Resources: 2
Author / Contributor
- Singh, Esha (2)
- Wang, Yu-Xiang (2)
- Qiao, Dan (1)
- Sabach, Shoham (1)
- Soudry, Daniel (1)
- Zhang, Kaiqi (1)
Deep neural networks are well known for their generalization capabilities, largely attributed to optimizers’ ability to find "good" solutions in high-dimensional loss landscapes. This work aims to deepen the understanding of optimization specifically through the lens of loss landscapes. We propose a generalized framework for adaptive optimization that favors convergence to these "good" solutions. Our approach shifts the optimization paradigm from merely finding solutions quickly to discovering solutions that generalize well, establishing a careful balance between optimization efficiency and model generalization. We empirically validate our claims using a two-layer, fully connected neural network with ReLU activation and demonstrate practical applicability through binary quantization of ResNets. Our numerical results demonstrate that these adaptive optimizers facilitate exploration, leading to faster convergence, and narrow the generalization gap between stochastic gradient descent and other adaptive methods.
Free, publicly-accessible full text available March 24, 2026
-
Qiao, Dan; Zhang, Kaiqi; Singh, Esha; Soudry, Daniel; Wang, Yu-Xiang (Advances in Neural Information Processing Systems)
We study the generalization of two-layer ReLU neural networks in a univariate nonparametric regression problem with noisy labels. This is a problem where kernels (e.g., NTK) are provably sub-optimal and benign overfitting does not happen, thus disqualifying existing theory for interpolating (0-loss, globally optimal) solutions. We present a new theory of generalization for local minima that gradient descent with a constant learning rate can stably converge to. We show that gradient descent with a fixed learning rate $\eta$ can only find local minima that represent smooth functions with a certain weighted first-order total variation bounded by $1/\eta - 1/2 + \tilde{O}(\sigma + \sqrt{\mathrm{MSE}})$, where $\sigma$ is the label noise level, MSE is short for mean squared error against the ground truth, and $\tilde{O}(\cdot)$ hides a logarithmic factor. Under mild assumptions, we also prove a nearly optimal MSE bound of $\tilde{O}(n^{-4/5})$ within the strict interior of the support of the $n$ data points. Our theoretical results are validated by extensive simulation demonstrating that training with a large learning rate induces sparse linear spline fits. To the best of our knowledge, we are the first to obtain a generalization bound via minima stability in the non-interpolation case and the first to show that ReLU NNs without regularization can achieve near-optimal rates in nonparametric regression.
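The setting in this abstract can be explored with a small experiment. The following is a minimal NumPy sketch, not the authors' code: it trains a two-layer ReLU network with full-batch, constant-step gradient descent on noisy univariate data and reports an unweighted, grid-based total variation of the fitted slope as a rough proxy for the weighted first-order total variation named above. The target function, width, initialization, and every function name here are illustrative assumptions.

```python
import numpy as np

def make_data(n=40, noise=0.3, seed=0):
    # Hypothetical univariate regression task with noisy labels.
    rng = np.random.default_rng(seed)
    x = np.sort(rng.uniform(-1.0, 1.0, size=n))
    y = np.sin(3.0 * x) + noise * rng.normal(size=n)
    return x, y

def init_params(width=20, seed=1):
    # Assumed initialization; the abstract does not specify one.
    rng = np.random.default_rng(seed)
    return {
        "w": rng.normal(size=width),                   # input weights
        "b": rng.normal(size=width),                   # hidden biases
        "a": rng.normal(size=width) / np.sqrt(width),  # output weights
        "c": 0.0,                                      # output bias
    }

def predict(p, x):
    hidden = np.maximum(0.0, np.outer(x, p["w"]) + p["b"])  # shape (n, width)
    return hidden @ p["a"] + p["c"]

def half_mse(p, x, y):
    # Training objective: (1 / 2n) * sum of squared residuals.
    return 0.5 * np.mean((predict(p, x) - y) ** 2)

def gd_step(p, x, y, lr):
    # One full-batch gradient descent step with constant learning rate lr.
    pre = np.outer(x, p["w"]) + p["b"]
    hidden = np.maximum(0.0, pre)
    resid = hidden @ p["a"] + p["c"] - y
    n = x.shape[0]
    g_pre = resid[:, None] * (pre > 0) * p["a"] / n  # gradient w.r.t. pre-activations
    return {
        "w": p["w"] - lr * (g_pre * x[:, None]).sum(axis=0),
        "b": p["b"] - lr * g_pre.sum(axis=0),
        "a": p["a"] - lr * (hidden.T @ resid) / n,
        "c": p["c"] - lr * resid.mean(),
    }

def slope_total_variation(p, grid):
    # Unweighted proxy: total variation of finite-difference slopes on a grid.
    f = predict(p, grid)
    slopes = np.diff(f) / np.diff(grid)
    return float(np.abs(np.diff(slopes)).sum())

def train(lr, steps=300):
    x, y = make_data()
    p = init_params()
    for _ in range(steps):
        p = gd_step(p, x, y, lr)
    return p, half_mse(p, x, y)
```

Calling `train` with several values of `lr` and passing each result to `slope_total_variation` on a grid such as `np.linspace(-1.0, 1.0, 201)` gives one way to compare fits qualitatively. Because the proxy is unweighted and the setup is assumed, this sketch does not verify the stated bound.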