NSF PAR Search | NSF Public Access Repository

The Common Intuition to Transfer Learning Can Win or Lose: Case Studies for Linear Regression

https://doi.org/10.1137/23M1563062

Dar, Yehuda; LeJeune, Daniel; Baraniuk, Richard G (June 2024, SIAM Journal on Mathematics of Data Science)

Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks

https://doi.org/10.1137/22M1469559

Dar, Yehuda; Baraniuk, Richard G. (December 2022, SIAM Journal on Mathematics of Data Science)

Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective

https://doi.org/10.1109/CVPR52688.2022.01333

Somepalli, Gowthami; Fowl, Liam; Bansal, Arpit; Yeh-Chiang, Ping; Dar, Yehuda; Baraniuk, Richard; Goldblum, Micah; Goldstein, Tom (June 2022, CVPR)

Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks

Dar, Yehuda; Baraniuk, Richard G. (June 2021, arXiv.org)
We study the transfer learning process between two linear regression problems. An important and timely special case is when the regressors are overparameterized and perfectly interpolate their training data. We examine a parameter transfer mechanism whereby a subset of the parameters of the target task solution is constrained to the values learned for a related source task. We analytically characterize the generalization error of the target task in terms of the salient factors in the transfer learning architecture, i.e., the number of examples available, the number of (free) parameters in each of the tasks, the number of parameters transferred from the source to the target task, and the correlation between the two tasks. Our non-asymptotic analysis shows that the generalization error of the target task follows a two-dimensional double descent trend (with respect to the number of free parameters in each of the tasks) that is controlled by the transfer learning factors. Our analysis points to specific cases where the transfer of parameters is beneficial. Specifically, we show that transferring a specific set of parameters that generalizes well on the respective part of the source task can soften the demand on the task correlation level that is required for successful transfer learning. Moreover, we show that the usefulness of a transfer learning setting is fragile and depends on a delicate interplay among the set of transferred parameters, the relation between the tasks, and the true solution.
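The parameter transfer mechanism described in this abstract can be illustrated with a minimal sketch. The sketch is a simplification and not the paper's exact setup. The synthetic data, the dimensions, the noise levels, and the helper names `min_norm_lstsq` and `transfer_fit` are all assumptions introduced here. The source task is fitted by least squares, a chosen subset of its coefficients is copied into the target solution, and the remaining (free) target coefficients are fitted by minimum-norm least squares on the residual.

```python
import numpy as np


def min_norm_lstsq(X, y):
    """Minimum-norm least-squares solution, computed with the pseudoinverse."""
    return np.linalg.pinv(X) @ y


def transfer_fit(X_tgt, y_tgt, beta_src, transferred):
    """Fit the target task with the `transferred` coordinates fixed to source values.

    Hypothetical helper: the remaining (free) coordinates are fitted by
    minimum-norm least squares on what the transferred part leaves unexplained.
    """
    d = X_tgt.shape[1]
    transferred = np.asarray(transferred, dtype=int)
    free = np.setdiff1d(np.arange(d), transferred)
    beta = np.zeros(d)
    beta[transferred] = beta_src[transferred]
    residual = y_tgt - X_tgt[:, transferred] @ beta[transferred]
    beta[free] = min_norm_lstsq(X_tgt[:, free], residual)
    return beta


# Synthetic, correlated source and target tasks (all values are assumptions).
rng = np.random.default_rng(0)
d, n_src, n_tgt = 40, 200, 15
beta_true_src = rng.normal(size=d)
beta_true_tgt = beta_true_src + 0.1 * rng.normal(size=d)

X_src = rng.normal(size=(n_src, d))
y_src = X_src @ beta_true_src + 0.1 * rng.normal(size=n_src)
X_tgt = rng.normal(size=(n_tgt, d))
y_tgt = X_tgt @ beta_true_tgt + 0.1 * rng.normal(size=n_tgt)

beta_src = min_norm_lstsq(X_src, y_src)
transferred = np.arange(20)  # transfer the first 20 coordinates
beta_hat = transfer_fit(X_tgt, y_tgt, beta_src, transferred)

# With 20 free parameters and 15 target examples, the free part is
# overparameterized, so the target fit interpolates its training data.
train_error = np.linalg.norm(X_tgt @ beta_hat - y_tgt)
test_error = np.linalg.norm(beta_hat - beta_true_tgt) ** 2
```

Varying the number of transferred coordinates and the number of target examples in this sketch is one way to explore how the target error changes, although it does not reproduce the paper's analysis.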
Double Descent and Other Interpolation Phenomena in GANs

Luzi, Lorenzo; Dar, Yehuda; Baraniuk, Richard (June 2021, arXiv.org)
We study overparameterization in generative adversarial networks (GANs) that can interpolate the training data. We show that overparameterization can improve generalization performance and accelerate the training process. We study the generalization error as a function of latent space dimension and identify two main behaviors, depending on the learning setting. First, we show that overparameterized generative models that learn distributions by minimizing a metric or f-divergence do not exhibit double descent in generalization errors; specifically, all the interpolating solutions achieve the same generalization error. Second, we develop a new pseudo-supervised learning approach for GANs where the training utilizes pairs of fabricated (noise) inputs in conjunction with real output samples. Our pseudo-supervised setting exhibits double descent (and in some cases, triple descent) of generalization errors. We combine pseudo-supervision with overparameterization (i.e., overly large latent space dimension) to accelerate training while achieving generalization performance better than, or close to, that obtained without pseudo-supervision. While our analysis focuses mostly on linear GANs, we also apply important insights for improving generalization of nonlinear, multilayer GANs.
Subspace Fitting Meets Regression: The Effects of Supervision and Orthonormality Constraints on Double Descent of Generalization Errors

Dar, Yehuda; Mayer, Paul; Luzi, Lorenzo; Baraniuk, Richard G. (July 2020, Proceedings of Machine Learning Research)
We study the linear subspace fitting problem in the overparameterized setting, where the estimated subspace can perfectly interpolate the training examples. Our scope includes the least-squares solutions to subspace fitting tasks with varying levels of supervision in the training data (i.e., the proportion of input-output examples of the desired low-dimensional mapping) and orthonormality of the vectors defining the learned operator. This flexible family of problems connects standard, unsupervised subspace fitting that enforces strict orthonormality with a corresponding regression task that is fully supervised and does not constrain the linear operator structure. This class of problems is defined over a supervision-orthonormality plane, where each coordinate induces a problem instance with a unique pair of supervision level and softness of orthonormality constraints. We explore this plane and show that the generalization errors of the corresponding subspace fitting problems follow double descent trends as the settings become more supervised and less orthonormally constrained.