NSF PAR Search | NSF Public Access Repository

Multi-task Representation Learning for Fixed Budget Pure-Exploration in Linear and Bilinear Bandits

Mukherjee, Subhojyoti; Xie, Qiaomin; Nowak, Robert D (August 2025, Reinforcement Learning Journal)

Free, publicly-accessible full text available August 8, 2026
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning

Mukherjee, Subhojyoti; Hanna, Josiah P; Xie, Qiaomin; Nowak, Robert D (August 2025, Reinforcement Learning Journal)

Free, publicly-accessible full text available August 8, 2026
Weighted variation spaces and approximation by shallow ReLU networks

https://doi.org/10.1016/j.acha.2024.101713

DeVore, Ronald; Nowak, Robert D; Parhi, Rahul; Siegel, J W (January 2025, Applied and Computational Harmonic Analysis)

Free, publicly-accessible full text available January 1, 2026
Variation Spaces for Multi-Output Neural Networks: Insights on Multi-Task Learning and Network Compression

Shenouda, Joseph; Parhi, Rahul; Lee, Kangwook; Nowak, Robert (June 2024, Journal of Machine Learning Research)

This paper introduces a novel theoretical framework for the analysis of vector-valued neural networks through the development of vector-valued variation spaces, a new class of reproducing kernel Banach spaces. These spaces emerge from studying the regularization effect of weight decay in training networks with activation functions like the rectified linear unit (ReLU). This framework offers a deeper understanding of multi-output networks and their function-space characteristics. A key contribution of this work is the development of a representer theorem for the vector-valued variation spaces. This representer theorem establishes that shallow vector-valued neural networks are the solutions to data-fitting problems over these infinite-dimensional spaces, where the network widths are bounded by the square of the number of training data. This observation reveals that the norm associated with these vector-valued variation spaces encourages the learning of features that are useful for multiple tasks, shedding new light on multi-task learning with neural networks. Finally, this paper develops a connection between weight-decay regularization and the multi-task lasso problem. This connection leads to novel bounds for layer widths in deep networks that depend on the intrinsic dimensions of the training data representations. This insight not only deepens the understanding of the deep network architectural requirements, but also yields a simple convex optimization method for deep neural network compression. The performance of this compression procedure is evaluated on various architectures.
Full Text Available
ReLUs Are Sufficient for Learning Implicit Neural Representations

Shenouda, Joseph; Zhou, Yamin; Nowak, Robert D (May 2024, International Conference on Machine Learning)

Motivated by the growing theoretical understanding of neural networks that employ the Rectified Linear Unit (ReLU) as their activation function, we revisit the use of ReLU activation functions for learning implicit neural representations (INRs). Inspired by second-order B-spline wavelets, we incorporate a set of simple constraints on the ReLU neurons in each layer of a deep neural network (DNN) to remedy the spectral bias. This in turn enables its use for various INR tasks. Empirically, we demonstrate that, contrary to popular belief, one can learn state-of-the-art INRs based on a DNN composed of only ReLU neurons. Next, by leveraging recent theoretical works which characterize the kinds of functions ReLU neural networks learn, we provide a way to quantify the regularity of the learned function. This offers a principled approach to selecting the hyperparameters in INR architectures. We substantiate our claims through experiments in signal representation, super resolution, and computed tomography, demonstrating the versatility and effectiveness of our method. The code for all experiments can be found at https://github.com/joeshenouda/relu-inrs.
Full Text Available
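
As a small illustration of the link between ReLU neurons and B-splines mentioned in the abstract above, the sketch below builds the piecewise-linear "hat" function (the second-order cardinal B-spline on [0, 2]) from three shifted ReLUs. This is a standard identity, not the constrained neuron proposed in the paper; the function names `relu` and `hat` are chosen here for illustration only.

```python
def relu(x: float) -> float:
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def hat(x: float) -> float:
    """Piecewise-linear B-spline on [0, 2], written as a sum of three ReLUs.

    The slopes change by +1 at x=0, -2 at x=1, and +1 at x=2, so the
    function rises from 0 to 1 on [0, 1], falls back to 0 on [1, 2],
    and is exactly zero everywhere else.
    """
    return relu(x) - 2.0 * relu(x - 1.0) + relu(x - 2.0)


if __name__ == "__main__":
    # Sample the hat function on a coarse grid.
    for x in (-1.0, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0):
        print(x, hat(x))
```

Unlike a single ReLU, this combination has compact support, which is the kind of localized building block that spline and wavelet constructions rely on.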
SPEED: Experimental Design for Policy Evaluation in Linear Heteroscedastic Bandits

Mukherjee, Subhojyoti; Xie, Qiaomin; Hanna, Josiah; Nowak, Robert (May 2024, Proceedings of Machine Learning Research)

Full Text Available
SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP

Mukherjee, Subhojyoti; Hanna, Josiah P; Nowak, Robert D (May 2024, International Conference on Machine Learning)

In this paper, we study safe data collection for the purpose of policy evaluation in tabular Markov decision processes (MDPs). In policy evaluation, we are given a target policy and asked to estimate the expected cumulative reward it will obtain. Policy evaluation requires data and we are interested in the question of what behavior policy should collect the data for the most accurate evaluation of the target policy. While prior work has considered behavior policy selection, in this paper, we additionally consider a safety constraint on the behavior policy. Namely, we assume there exists a known default policy that incurs a particular expected cost when run and we enforce that the cumulative cost of all behavior policies run is better than a constant factor of the cost that would be incurred had we always run the default policy. We first show that there exists a class of intractable MDPs where no safe oracle algorithm with knowledge about problem parameters can efficiently collect data and satisfy the safety constraints. We then define the tractability condition for an MDP such that a safe oracle algorithm can efficiently collect data and using that we prove the first lower bound for this setting. We then introduce an algorithm SaVeR for this problem that approximates the safe oracle algorithm and bound the finite-sample mean squared error of the algorithm while ensuring it satisfies the safety constraint. Finally, we show in simulations that SaVeR produces low MSE policy evaluation while satisfying the safety constraint.
Full Text Available
SPEED: Experimental Design for Policy Evaluation in Linear Heteroscedastic Bandits

Mukherjee, Subhojyoti; Xie, Qiaomin; Hanna, Josiah P; Nowak, Robert (April 2024, International Conference on Artificial Intelligence and Statistics)

In this paper, we study the problem of optimal data collection for policy evaluation in linear bandits. In policy evaluation, we are given a target policy and asked to estimate the expected reward it will obtain when executed in a multi-armed bandit environment. Our work is the first to focus on such an optimal data collection strategy for policy evaluation involving heteroscedastic reward noise in the linear bandit setting. We first formulate an optimal design for weighted least squares estimates in the heteroscedastic linear bandit setting with the knowledge of noise variances. This design minimizes the mean squared error (MSE) of the estimated value of the target policy and is termed the oracle design. Since the noise variance is typically unknown, we then introduce a novel algorithm, SPEED (Structured Policy Evaluation Experimental Design), that tracks the oracle design, and derive its regret with respect to the oracle design. We show that the regret scales as $\widetilde{O}(d^3 n^{-3/2})$ and prove a matching lower bound of $\Omega(d^2 n^{-3/2})$. Finally, we evaluate SPEED on a set of policy evaluation tasks and demonstrate that it achieves MSE comparable to an optimal oracle and much lower than simply running the target policy.
Full Text Available
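
The abstract above refers to weighted least squares (WLS) estimates under known, arm-dependent noise variances. A minimal sketch of that estimator follows, assuming a linear reward model with one feature vector per arm. It is not the SPEED algorithm or the oracle design from the paper; the helper names (`solve`, `wls_estimate`, `policy_value`) and the toy data are hypothetical and only show how inverse-variance weighting enters the estimate of a target policy's value.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def wls_estimate(features, rewards, variances):
    """WLS estimate: solve (sum_t x_t x_t^T / v_t) theta = sum_t x_t r_t / v_t."""
    d = len(features[0])
    A = [[0.0] * d for _ in range(d)]
    b = [0.0] * d
    for x, r, v in zip(features, rewards, variances):
        w = 1.0 / v  # inverse-variance weight: noisier samples count less
        for i in range(d):
            b[i] += w * x[i] * r
            for j in range(d):
                A[i][j] += w * x[i] * x[j]
    return solve(A, b)


def policy_value(theta, arm_features, policy):
    """Expected reward of a policy: sum_a pi(a) * <x_a, theta>."""
    return sum(
        p * sum(t * xi for t, xi in zip(theta, x))
        for p, x in zip(policy, arm_features)
    )


if __name__ == "__main__":
    arms = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical arm features
    rewards = [2.0, -1.0, 1.0]                   # one noiseless pull per arm
    variances = [0.1, 1.0, 4.0]                  # assumed known noise variances
    theta_hat = wls_estimate(arms, rewards, variances)
    print(policy_value(theta_hat, arms, [0.5, 0.25, 0.25]))
```

With noisy rewards, how many times each arm is pulled determines the error of this estimate, which is the data-collection question the abstract addresses.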
On penalty methods for nonconvex bilevel optimization and first-order stochastic approximation

Kwon, Jeongyeol; Kwon, Dohyun; Wright, Stephen; Nowak, Robert (February 2024, ICLR 2024)

Full Text Available
On Penalty Methods for Nonconvex Bilevel Optimization and First-Order Stochastic Approximation

Kwon, Jeongyeol; Kwon, Dohyun; Wright, Stephen; Nowak, Robert D (January 2024, International Conference on Learning Representations)

In this work, we study first-order algorithms for solving Bilevel Optimization (BO) where the objective functions are smooth but possibly nonconvex in both levels and the variables are restricted to closed convex sets. As a first step, we study the landscape of BO through the lens of penalty methods, in which the upper- and lower-level objectives are combined in a weighted sum with penalty parameter . In particular, we establish a strong connection between the penalty function and the hyper-objective by explicitly characterizing the conditions under which the values and derivatives of the two must be -close. A by-product of our analysis is the explicit formula for the gradient of hyper-objective when the lower-level problem has multiple solutions under minimal conditions, which could be of independent interest. Next, viewing the penalty formulation as -approximation of the original BO, we propose first-order algorithms that find an -stationary solution by optimizing the penalty formulation with . When the perturbed lower-level problem uniformly satisfies the small-error proximal error-bound (EB) condition, we propose a first-order algorithm that converges to an -stationary point of the penalty function using in total accesses to first-order stochastic gradient oracles. Under an additional assumption on stochastic oracles, we show that the algorithm can be implemented in a fully single-loop manner, i.e., with samples per iteration, and achieves the improved oracle-complexity of .
Full Text Available
