Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
- Free, publicly-accessible full text available June 11, 2026
- Free, publicly-accessible full text available March 25, 2026
- Free, publicly-accessible full text available March 25, 2026
- Free, publicly-accessible full text available February 5, 2026
- Oh, A; Naumann, T; Globerson, A; Saenko, K; Hardt, M; Levine, S (Ed.): The susceptibility of modern machine learning classifiers to adversarial examples has motivated theoretical results suggesting that these might be unavoidable. However, these results can be too general to be applicable to natural data distributions. Indeed, humans are quite robust for tasks involving vision. This apparent conflict motivates a deeper dive into the question: Are adversarial examples truly unavoidable? In this work, we theoretically demonstrate that a key property of the data distribution – concentration on small-volume subsets of the input space – determines whether a robust classifier exists. We further demonstrate that, for a data distribution concentrated on a union of low-dimensional linear subspaces, utilizing structure in data naturally leads to classifiers that enjoy data-dependent polyhedral robustness guarantees, improving upon methods for provable certification in certain regimes. (See the first sketch after this list.)
- This work studies the adversarial robustness of parametric functions composed of a linear predictor and a nonlinear representation map. Our analysis relies on sparse local Lipschitzness (SLL), an extension of local Lipschitz continuity that better captures the stability and reduced effective dimensionality of predictors upon local perturbations. SLL functions preserve a certain degree of structure, given by the sparsity pattern in the representation map, and include several popular hypothesis classes, such as piecewise linear models, Lasso and its variants, and deep feedforward ReLU networks. Compared with traditional Lipschitz analysis, we provide a tighter robustness certificate on the minimal energy of an adversarial example, as well as tighter data-dependent nonuniform bounds on the robust generalization error of these predictors. We instantiate these results for the case of deep neural networks and provide numerical evidence that supports our results, shedding new insights into natural regularization strategies to increase the robustness of these models. (See the second sketch after this list.)
- Sparse auto-encoders are useful for extracting low-dimensional representations from high-dimensional data. However, their performance degrades sharply when the input noise at test time differs from the noise employed during training. This limitation hinders the applicability of auto-encoders in real-world scenarios where the level of noise in the input is unpredictable. In this paper, we formalize single hidden layer sparse auto-encoders as a transform learning problem. Leveraging the transform modeling interpretation, we propose an optimization problem that leads to a predictive model invariant to the noise level at test time. In other words, the same pre-trained model is able to generalize to different noise levels. The proposed optimization algorithm, derived from the square root lasso, is translated into a new, computationally efficient auto-encoding architecture. After proving that our new method is invariant to the noise level, we evaluate our approach by training networks using the proposed architecture for denoising tasks. Our experimental results demonstrate that the trained models yield a significant improvement in stability against varying types of noise compared to commonly used architectures. (See the third sketch after this list.)
- The complex nature of artificial neural networks raises concerns on their reliability, trustworthiness, and fairness in real-world scenarios. The Shapley value, a solution concept from game theory, is one of the most popular explanation methods for machine learning models. More traditionally, from a statistical perspective, feature importance is defined in terms of conditional independence. So far, these two approaches to interpretability and feature importance have been considered separate and distinct. In this work, we show that Shapley-based explanation methods and conditional independence testing are closely related. We introduce the SHAPley EXplanation Randomization Test (SHAP-XRT), a testing procedure inspired by the Conditional Randomization Test (CRT) for a specific notion of local (i.e., on a sample) conditional independence. With it, we prove that for binary classification problems, the marginal contributions in the Shapley value provide lower and upper bounds to the expected p-values of their respective tests. Furthermore, we show that the Shapley value itself provides an upper bound to the expected p-value of a global (i.e., overall) null hypothesis. As a result, we further our understanding of Shapley-based explanation methods from a novel perspective and characterize the conditions under which one can make statistically valid claims about feature importance via the Shapley value. (See the fourth sketch after this list.)
- Deep artificial neural networks achieve surprising generalization abilities that remain poorly understood. In this paper, we present a new approach to analyzing generalization for deep feed-forward ReLU networks that takes advantage of the degree of sparsity that is achieved in the hidden layer activations. By developing a framework that accounts for this reduced effective model size for each input sample, we are able to show fundamental trade-offs between sparsity and generalization. Importantly, our results make no strong assumptions about the degree of sparsity achieved by the model, and they improve over recent norm-based approaches. We illustrate our results numerically, demonstrating non-vacuous bounds when coupled with data-dependent priors in specific settings, even in over-parametrized models. (See the fifth sketch after this list.)
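The sketches below are illustrative companions to the abstracts above, not code or derivations taken from the papers. First, for the record on adversarial examples and concentrated data distributions: a minimal nearest-subspace classifier for data lying near a union of low-dimensional linear subspaces, together with a simple l2-ball robustness radius computed from the gap between projection residuals. The bases and inputs are placeholders, and the certificate here is a looser spherical one, not the data-dependent polyhedral guarantees described in the abstract.

```python
import numpy as np

def nearest_subspace_classify(x, bases):
    """Classify x by the subspace with the smallest projection residual.

    bases: list of (d, k) matrices with orthonormal columns, one per class
           (hypothetical inputs; any orthonormal bases work).
    Returns (label, l2_certificate). The certificate is half the gap between
    the two smallest residuals, a valid l2 robustness radius because each
    residual map x -> ||(I - U U^T) x|| is 1-Lipschitz.
    """
    residuals = np.array([np.linalg.norm(x - U @ (U.T @ x)) for U in bases])
    order = np.argsort(residuals)
    label = int(order[0])
    certificate = 0.5 * (residuals[order[1]] - residuals[order[0]])
    return label, certificate

# Toy usage with two random 2-dimensional subspaces in R^20.
rng = np.random.default_rng(0)
bases = [np.linalg.qr(rng.standard_normal((20, 2)))[0] for _ in range(2)]
x = bases[0] @ rng.standard_normal(2) + 0.01 * rng.standard_normal(20)
print(nearest_subspace_classify(x, bases))
```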
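Second, for the record on sparse local Lipschitzness: the baseline that SLL refines is the standard local-Lipschitz robustness certificate, written out below in generic notation of my own choosing (a margin f, a radius r, and a local constant L_r(x)); the paper's definitions and constants may differ.

```latex
% Generic local-Lipschitz certificate (illustrative notation, not the paper's).
% Suppose f(x) > 0 is the margin of the predicted class over the runner-up and
%   |f(x) - f(x')| \le L_r(x)\,\|x - x'\|_2  for all  \|x - x'\|_2 \le r.
% Then the prediction at x cannot change under any perturbation \delta with
\[
  \|\delta\|_2 \;<\; \min\!\left\{\, r,\ \frac{f(x)}{L_r(x)} \right\},
\]
% since f(x + \delta) \ge f(x) - L_r(x)\,\|\delta\|_2 > 0.
% SLL-style analyses tighten L_r(x) by restricting attention to the coordinates
% of the representation map that are active (nonzero) near x.
```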
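Third, for the record on noise-level-invariant sparse auto-encoders: a bare-bones proximal (sub)gradient solver for the square root lasso encoding problem min_z ||y - Dz||_2 + lam*||z||_1. The dictionary D, step size, and iteration count are placeholder choices, and this generic solver stands in for, rather than reproduces, the authors' auto-encoding architecture; the point it illustrates is that the un-squared residual norm lets a single lam behave consistently across noise levels.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sqrt_lasso_encode(y, D, lam=0.1, step=None, n_iter=500):
    """Approximately solve min_z ||y - D z||_2 + lam * ||z||_1.

    y: (d,) observation; D: (d, k) dictionary (both hypothetical inputs).
    Uses a fixed, conservative step; a diminishing step would be needed for
    exact convergence, but this suffices as a sketch.
    """
    d, k = D.shape
    if step is None:
        step = 1.0 / np.linalg.norm(D, 2)  # 1 / spectral norm of D
    z = np.zeros(k)
    for _ in range(n_iter):
        r = y - D @ z
        nr = np.linalg.norm(r)
        if nr < 1e-12:              # exact fit reached; stop
            break
        grad = -(D.T @ r) / nr      # gradient of ||y - D z||_2 away from r = 0
        z = soft_threshold(z - step * grad, step * lam)
    return z
```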
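Fourth, for the record on SHAP-XRT: a generic conditional randomization test evaluated at a single example, to show the flavor of the local null being tested. The conditional sampler `sample_xj_given_rest` and the choice of test statistic are assumptions on my part; the exact SHAP-XRT statistic and its connection to Shapley marginal contributions are as defined in the paper.

```python
import numpy as np

def local_crt_pvalue(model, x, j, sample_xj_given_rest, n_draws=500, rng=None):
    """CRT-style p-value for the importance of feature j at a single input x.

    model: callable mapping a 1-D feature vector to a scalar score.
    sample_xj_given_rest: callable(x, j, rng) -> a draw of feature j given the
        remaining features (an assumed, user-supplied conditional sampler).
    The statistic is simply the model score with the observed x_j versus
    scores with resampled x_j; larger observed scores count as evidence that
    feature j matters locally.
    """
    if rng is None:
        rng = np.random.default_rng()
    t_obs = model(x)
    exceed = 0
    for _ in range(n_draws):
        x_tilde = x.copy()
        x_tilde[j] = sample_xj_given_rest(x, j, rng)
        if model(x_tilde) >= t_obs:
            exceed += 1
    return (1 + exceed) / (1 + n_draws)  # the usual finite-sample CRT p-value
```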
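Finally, for the record on sparsity and generalization in ReLU networks: a short routine that measures the per-sample activation sparsity the abstract refers to, i.e., the fraction of hidden units that are inactive for a given input. The weights and biases are placeholders; the generalization bounds themselves are derived in the paper and are not reproduced here.

```python
import numpy as np

def relu_activation_sparsity(x, weights, biases):
    """Forward pass of a feed-forward ReLU net, returning per-layer sparsity.

    weights: list of (n_out, n_in) arrays; biases: list of (n_out,) arrays
    (hypothetical parameters; the last pair is the linear output layer).
    Returns the logits and, for each hidden layer, the fraction of inactive
    units for this particular input, i.e. the per-sample effective model size
    that sparsity-aware generalization bounds exploit.
    """
    h = x
    sparsity = []
    for W, b in zip(weights[:-1], biases[:-1]):   # hidden ReLU layers
        h = np.maximum(W @ h + b, 0.0)
        sparsity.append(float(np.mean(h == 0.0)))
    logits = weights[-1] @ h + biases[-1]         # linear output layer
    return logits, sparsity
```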