NSF PAR Search | NSF Public Access Repository

Pivotal Auto-Encoder via Self-Normalizing ReLU

https://doi.org/10.1109/TSP.2024.3418971

Goldenstein, Nelson; Sulam, Jeremias; Romano, Yaniv (January 2024, IEEE Transactions on Signal Processing)

Sparse auto-encoders are useful for extracting low-dimensional representations from high-dimensional data. However, their performance degrades sharply when the input noise at test time differs from the noise employed during training. This limitation hinders the applicability of auto-encoders in real-world scenarios where the level of noise in the input is unpredictable. In this paper, we formalize single hidden layer sparse auto-encoders as a transform learning problem. Leveraging the transform modeling interpretation, we propose an optimization problem that leads to a predictive model invariant to the noise level at test time. In other words, the same pre-trained model is able to generalize to different noise levels. The proposed optimization algorithm, derived from the square root lasso, is translated into a new, computationally efficient auto-encoding architecture. After proving that our new method is invariant to the noise level, we evaluate our approach by training networks using the proposed architecture for denoising tasks. Our experimental results demonstrate that the trained models yield a significant improvement in stability against varying types of noise compared to commonly used architectures.
Full Text Available
Adversarial Robustness of Sparse Local Lipschitz Predictors

https://doi.org/10.1137/22M1478835

Muthukumar, Ramchandran; Sulam, Jeremias (December 2023, SIAM Journal on Mathematics of Data Science)

This work studies the adversarial robustness of parametric functions composed of a linear predictor and a nonlinear representation map. Our analysis relies on sparse local Lipschitzness (SLL), an extension of local Lipschitz continuity that better captures the stability and reduced effective dimensionality of predictors upon local perturbations. SLL functions preserve a certain degree of structure, given by the sparsity pattern in the representation map, and include several popular hypothesis classes, such as piecewise linear models, Lasso and its variants, and deep feedforward ReLU networks. Compared with traditional Lipschitz analysis, we provide a tighter robustness certificate on the minimal energy of an adversarial example, as well as tighter data-dependent nonuniform bounds on the robust generalization error of these predictors. We instantiate these results for the case of deep neural networks and provide numerical evidence that supports our results, shedding new insights into natural regularization strategies to increase the robustness of these models.
Full Text Available
Sparsity-aware generalization theory for deep neural networks

Muthukumar, Ramchandran; Sulam, Jeremias (July 2023, Proceedings of Machine Learning Research)

Deep artificial neural networks achieve surprising generalization abilities that remain poorly understood. In this paper, we present a new approach to analyzing generalization for deep feed-forward ReLU networks that takes advantage of the degree of sparsity that is achieved in the hidden layer activations. By developing a framework that accounts for this reduced effective model size for each input sample, we are able to show fundamental trade-offs between sparsity and generalization. Importantly, our results make no strong assumptions about the degree of sparsity achieved by the model, and they improve over recent norm-based approaches. We illustrate our results numerically, demonstrating non-vacuous bounds when coupled with data-dependent priors in specific settings, even in over-parametrized models.
Full Text Available
Recovery and Generalization in Over-Realized Dictionary Learning

Sulam, J.; You, C.; Zhu, Z. (May 2022, Journal of Machine Learning Research)

Full Text Available
Entrywise Recovery Guarantees for Sparse PCA via Sparsistent Algorithms

Agterberg, J.; Sulam, J. (January 2022, Proceedings of Machine Learning Research)

Full Text Available
A Geometric Analysis of Neural Collapse with Unconstrained Features

Zhu, Z.; Ding, T.; Zhou, J.; Li, X.; You, C.; Sulam, J.; Qu, Q. (January 2021, Advances in Neural Information Processing Systems)

Full Text Available