NSF PAR Search | NSF Public Access Repository

Should We Learn Most Likely Functions or Parameters?

Qiu, S; Rudner, T; Kapoor, S; Wilson, AG (December 2023, Advances in Neural Information Processing Systems)

Standard regularized training procedures correspond to maximizing a posterior distribution over parameters, known as maximum a posteriori (MAP) estimation. However, model parameters are of interest only insomuch as they combine with the functional form of a model to provide a function that can make good predictions. Moreover, the most likely parameters under the parameter posterior do not generally correspond to the most likely function induced by the parameter posterior. In fact, we can re-parametrize a model such that any setting of parameters can maximize the parameter posterior. As an alternative, we investigate the benefits and drawbacks of directly estimating the most likely function implied by the model and the data. We show that this procedure leads to pathological solutions when using neural networks, prove conditions under which the procedure is well-behaved, and provide a scalable approximation. Under these conditions, we find that function-space MAP estimation can lead to flatter minima, better generalization, and improved robustness to overfitting.
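The claim that the most likely parameters depend on the parametrization can be illustrated with a one-dimensional toy example. This is only a sketch under stated assumptions, not the procedure from the paper: the Gaussian density over `theta`, the values of `MU` and `SIGMA`, and the reparametrization `phi = exp(theta)` are all hypothetical choices made for illustration.

```python
import math

# Hypothetical toy density over a scalar parameter theta: Gaussian N(MU, SIGMA^2).
MU, SIGMA = 1.0, 1.0

def log_density_theta(theta):
    """Log density of theta under N(MU, SIGMA^2)."""
    return -0.5 * ((theta - MU) / SIGMA) ** 2 - math.log(SIGMA * math.sqrt(2 * math.pi))

def log_density_phi(phi):
    """Log density of phi = exp(theta), via the change-of-variables formula.

    p_phi(phi) = p_theta(log phi) * |d theta / d phi| = p_theta(log phi) / phi
    """
    return log_density_theta(math.log(phi)) - math.log(phi)

# Grid search for the mode in each parametrization (same underlying grid of theta values).
thetas = [-3.0 + 0.001 * i for i in range(8001)]
theta_map = max(thetas, key=log_density_theta)

phis = [math.exp(t) for t in thetas]
phi_map = max(phis, key=log_density_phi)

# Map the phi-space mode back to theta-space to compare the two "most likely" values.
theta_at_phi_map = math.log(phi_map)

print("mode found in theta-space:", theta_map)
print("phi-space mode mapped back to theta:", theta_at_phi_map)
```

The two modes differ because the change of variables contributes the Jacobian term `-log(phi)` to the log density. In this toy case the shift equals `SIGMA ** 2`, so the same distribution has different "most likely parameters" depending on how it is parametrized.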
Full Text Available
Function-Space Regularization in Neural Networks

Rudner, T; Kapoor, S; Qiu, S; Wilson, AG (July 2023, International Conference on Machine Learning)

Parameter-space regularization in neural network optimization is a fundamental tool for improving generalization. However, standard parameter-space regularization methods make it challenging to encode explicit preferences about desired predictive functions into neural network training. In this work, we approach regularization in neural networks from a probabilistic perspective and show that by viewing parameter-space regularization as specifying an empirical prior distribution over the model parameters, we can derive a probabilistically well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training. This method, which we refer to as function-space empirical Bayes (FS-EB), includes both parameter- and function-space regularization, is mathematically simple, easy to implement, and incurs only minimal computational overhead compared to standard regularization techniques. We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection, highly calibrated predictive uncertainty estimates, successful task adaptation from pre-trained models, and improved generalization under covariate shift.
Full Text Available
Volatility Based Kernels and Moving Average Means for Accurate Forecasting with Gaussian Processes

Benton, Gregory; Maddox, Wesley J.; Wilson, Andrew G (January 2022, International Conference on Machine Learning)

Full Text Available
Low-Precision Arithmetic for Fast Gaussian Processes

Maddox, Wesley J.; Potapczynski, Andres; Wilson, Andrew G. (January 2022, Uncertainty in Artificial Intelligence)

Full Text Available
Kernel Interpolation for Scalable Online Gaussian Processes

Stanton, S; Maddox, W; Delbridge, I; Wilson, A.G. (January 2021, Artificial Intelligence and Statistics (AISTATS))

Full Text Available
On the model-based stochastic value gradient for continuous reinforcement learning

Amos, B; Stanton, S; Yarats, D; Wilson, A.G. (January 2021, Learning for Dynamics and Control (L4DC))
For over a decade, model-based reinforcement learning has been seen as a way to leverage control-based domain knowledge to improve the sample-efficiency of reinforcement learning agents. While model-based agents are conceptually appealing, their policies tend to lag behind those of model-free agents in terms of final reward, especially in non-trivial environments. In response, researchers have proposed model-based agents with increasingly complex components, from ensembles of probabilistic dynamics models to heuristics for mitigating model error. In a reversal of this trend, we show that simple model-based agents can be derived from existing ideas that not only match, but outperform state-of-the-art model-free agents in terms of both sample-efficiency and final reward. We find that a model-free soft value estimate for policy evaluation and a model-based stochastic value gradient for policy improvement form an effective combination, achieving state-of-the-art results on a high-dimensional humanoid control task, which most model-based agents are unable to solve. Our findings suggest that model-based policy evaluation deserves closer attention.
Full Text Available
Fast Adaptation with Linearized Neural Networks

Maddox, W; Tang, S; Moreno, P; Wilson, A.G.; Damianou, A (January 2021, Artificial Intelligence and Statistics (AISTATS))

Full Text Available
BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization

Balandat, M; Karrer, B; Jiang, D; Daulton, S; Letham, B; Wilson, A.G.; Bakshy, E (January 2020, Advances in Neural Information Processing Systems (NeurIPS))

Full Text Available
Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data

Finzi, M; Stanton, S; Izmailov, P; Wilson, A.G. (January 2020, International Conference on Machine Learning)

The translation equivariance of convolutional layers enables convolutional neural networks to generalize well on image problems. While translation equivariance provides a powerful inductive bias for images, we often additionally desire equivariance to other transformations, such as rotations, especially for non-image data. We propose a general method to construct a convolutional layer that is equivariant to transformations from any specified Lie group with a surjective exponential map. Incorporating equivariance to a new group requires implementing only the group exponential and logarithm maps, enabling rapid prototyping. Showcasing the simplicity and generality of our method, we apply the same model architecture to images, ball-and-stick molecular data, and Hamiltonian dynamical systems. For Hamiltonian systems, the equivariance of our models is especially impactful, leading to exact conservation of linear and angular momentum.
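The translation equivariance mentioned at the start of this abstract can be checked numerically on a small example. The sketch below is not the Lie-group construction from the paper. It assumes a one-dimensional signal with circular (wrap-around) boundaries, and the helper names `circular_conv` and `shift` are hypothetical.

```python
def circular_conv(signal, kernel):
    """Circular cross-correlation of a 1D signal with a kernel (wrap-around boundary)."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]

def shift(signal, k):
    """Translate a 1D signal by k positions with wrap-around."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]

# Hypothetical integer data, so the comparison below is exact.
signal = [1, 4, 2, 7, 3, 5]
kernel = [1, -2, 1]

# Equivariance: translating the input and then convolving gives the same result
# as convolving first and then translating the output.
shift_then_conv = circular_conv(shift(signal, 2), kernel)
conv_then_shift = shift(circular_conv(signal, kernel), 2)

print(shift_then_conv == conv_then_shift)
```

The equality holds for any shift amount because the kernel weights do not depend on position. Equivariance to other groups, such as rotations, would require the layer to commute with those transformations in the same way.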
Full Text Available
