Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

Salakhutdinov, Ruslan ; Kolter, Zico ; Heller, Katherine ; Weller, Adrian ; Oliver, Nuria ; Scarlett, Jonathan ; Berkenkamp, Felix (Ed.)To mitigate the limitation that the classical reinforcement learning (RL) framework heavily relies on identical training and test environments, Distributionally Robust RL (DRRL) has been proposed to enhance performance across a range of environments, possibly including unknown test environments. As a price for robustness gain, DRRL involves optimizing over a set of distributions, which is inherently more challenging than optimizing over a fixed distribution in the nonrobust case. Existing DRRL algorithms are either modelbased or fail to learn from a single sample trajectory. In this paper, we design a first fully modelfree DRRL algorithm, called distributionally robust Qlearning with single trajectory (DRQ). We delicately design a multitimescale framework to fully utilize each incrementally arriving sample and directly learn the optimal distributionally robust policy without modeling the environment, thus the algorithm can be trained along a single trajectory in a modelfree fashion. Despite the algorithm’s complexity, we provide asymptotic convergence guarantees by generalizing classical stochastic approximation tools. Comprehensive experimental results demonstrate the superior robustness and sample complexity of our proposed algorithm, compared to nonrobust methods and other robust RL algorithms.more » « lessFree, publiclyaccessible full text available August 1, 2025

Salakhutdinov, Ruslan ; Kolter, Zico ; Heller, Katherine ; Weller, Adrian ; Oliver, Nuria ; Scarlett, Jonathan ; Berkenkamp, Felix (Ed.)Replica exchange stochastic gradient Langevin dynamics (reSGLD) is an effective sampler for nonconvex learning in largescale datasets. However, the simulation may encounter stagnation issues when the hightemperature chain delves too deeply into the distribution tails. To tackle this issue, we propose reflected reSGLD (r2SGLD): an algorithm tailored for constrained nonconvex exploration by utilizing reflection steps within a bounded domain. Theoretically, we observe that reducing the diameter of the domain enhances mixing rates, exhibiting a quadratic behavior. Empirically, we test its performance through extensive experiments, including identifying dynamical systems with physical constraints, simulations of constrained multimodal distributions, and image classification tasks. The theoretical and empirical findings highlight the crucial role of constrained exploration in improving the simulation efficiency.more » « lessFree, publiclyaccessible full text available July 21, 2025

Salakhutdinov, Ruslan ; Kolter, Zico ; Heller, Katherine ; Weller, Adrian ; Oliver, Nuria ; Scarlett, Jonathan ; Berkenkamp, Felix (Ed.)To mitigate the limitation that the classical reinforcement learning (RL) framework heavily relies on identical training and test environments, Distributionally Robust RL (DRRL) has been proposed to enhance performance across a range of environments, possibly including unknown test environments. As a price for robustness gain, DRRL involves optimizing over a set of distributions, which is inherently more challenging than optimizing over a fixed distribution in the nonrobust case. Existing DRRL algorithms are either modelbased or fail to learn from a single sample trajectory. In this paper, we design a first fully modelfree DRRL algorithm, called distributionally robust Qlearning with single trajectory (DRQ). We delicately design a multitimescale framework to fully utilize each incrementally arriving sample and directly learn the optimal distributionally robust policy without modeling the environment, thus the algorithm can be trained along a single trajectory in a modelfree fashion. Despite the algorithm's complexity, we provide asymptotic convergence guarantees by generalizing classical stochastic approximation tools. Comprehensive experimental results demonstrate the superior robustness and sample complexity of our proposed algorithm, compared to nonrobust methods and other robust RL algorithms.more » « lessFree, publiclyaccessible full text available May 1, 2025

Salakhutdinov, Ruslan ; Kolter, Zico ; Heller, Katherine ; Weller, Adrian ; Oliver, Nuria ; Scarlett, Jonathan ; Berkenkamp, Felix (Ed.)Rankings are ubiquitous across many applications, from search engines to hiring committees. In practice, many rankings are derived from the output of predictors. However, when predictors trained for classification tasks have intrinsic uncertainty, it is not obvious how this uncertainty should be represented in the derived rankings. Our work considers ranking functions: maps from individual predictions for a classification task to distributions over rankings. We focus on two aspects of ranking functions: stability to perturbations in predictions and fairness towards both individuals and subgroups. Not only is stability an important requirement for its own sake, but — as we show — it composes harmoniously with individual fairness in the sense of Dwork et al. (2012). While deterministic ranking functions cannot be stable aside from trivial scenarios, we show that the recently proposed uncertainty aware (UA) ranking functions of Singh et al. (2021) are stable. Our main result is that UA rankings also achieve group fairness through successful composition with multiaccurate or multicalibrated predictors. Our work demonstrates that UA rankings naturally interpolate between group and individual level fairness guarantees, while simultaneously satisfying stability guarantees important whenever machinelearned predictions are used.more » « lessFree, publiclyaccessible full text available February 14, 2025

Krause, Andreas ; Brunskill, Emma ; Cho, Kyunghyun ; Engelhardt, Barbara ; Sabato, Sivan ; Scarlett, Jonathan (Ed.)Transfer operators provide a rich framework for representing the dynamics of very general, nonlinear dynamical systems. When interacting with reproducing kernel Hilbert spaces (RKHS), descriptions of dynamics often incur prohibitive data storage requirements, motivating dataset sparsification as a precursory step to computation. Further, in practice, data is available in the form of trajectories, introducing correlation between samples. In this work, we present a method for sparse learning of transfer operators from $\beta$mixing stochastic processes, in both discrete and continuous time, and provide sample complexity analysis extending existing theoretical guarantees for learning from nonsparse, i.i.d. data. In addressing continuoustime settings, we develop precise descriptions using covariancetype operators for the infinitesimal generator that aids in the sample complexity analysis. We empirically illustrate the efficacy of our sparse embedding approach through deterministic and stochastic nonlinear system examples.more » « less

Krause, Andreas ; Brunskill, Emma ; Cho, Kyunghyun ; Engelhardt, Barbara ; Sabato, Sivan ; Scarlett, Jonathan (Ed.)Differentiable Search Index is a recently proposed paradigm for document retrieval, that encodes information about a corpus of documents within the parameters of a neural network and directly maps queries to corresponding documents. These models have achieved stateoftheart performances for document retrieval across many benchmarks. These kinds of models have a significant limitation: it is not easy to add new documents after a model is trained. We propose IncDSI, a method to add documents in real time (about 2050ms per document), without retraining the model on the entire dataset (or even parts thereof). Instead we formulate the addition of documents as a constrained optimization problem that makes minimal changes to the network parameters. Although orders of magnitude faster, our approach is competitive with retraining the model on the whole dataset and enables the development of document retrieval systems that can be updated with new information in realtime. Our code for IncDSI is available at \href{https://github.com/varshakishore/IncDSI}{https://github.com/varshakishore/IncDSI}.more » « less

Krause, Andreas ; Brunskill, Emma ; Cho, Kyunghyun ; Engelhardt, Barbara ; Sabato, Sivan ; Scarlett, Jonathan (Ed.)We consider the problem of estimating the optimal transport map between two probability distributions, P and Q in R^d, on the basis of i.i.d. samples. All existing statistical analyses of this problem require the assumption that the transport map is Lipschitz, a strong requirement that, in particular, excludes any examples where the transport map is discontinuous. As a first step towards developing estimation procedures for discontinuous maps, we consider the important special case where the data distribution Q is a discrete measure supported on a finite number of points in R^d. We study a computationally efficient estimator initially proposed by Pooladian & NilesWeed (2021), based on entropic optimal transport, and show in the semidiscrete setting that it converges at the minimaxoptimal rate n^{−1/2}, independent of dimension. Other standard map estimation techniques both lack finitesample guarantees in this setting and provably suffer from the curse of dimensionality. We confirm these results in numerical experiments, and provide experiments for other settings, not covered by our theory, which indicate that the entropic estimator is a promising methodology for other discontinuous transport map estimation problems.more » « less

Krause, Andreas ; Brunskill, Emma ; Cho, Kyunghyun ; Engelhardt, Barbara ; Sabato, Sivan ; Scarlett, Jonathan. (Ed.)The parameter space for any fixed architecture of feedforward ReLU neural networks serves as a proxy during training for the associated class of functions  but how faithful is this representation? It is known that many different parameter settings $\theta$ can determine the same function $f$. Moreover, the degree of this redundancy is inhomogeneous: for some networks, the only symmetries are permutation of neurons in a layer and positive scaling of parameters at a neuron, while other networks admit additional hidden symmetries. In this work, we prove that, for any network architecture where no layer is narrower than the input, there exist parameter settings with no hidden symmetries. We also describe a number of mechanisms through which hidden symmetries can arise, and empirically approximate the functional dimension of different network architectures at initialization. These experiments indicate that the probability that a network has no hidden symmetries decreases towards 0 as depth increases, while increasing towards 1 as width and input dimension increase.more » « less

Krause, Andreas ; Brunskill, Emma_Patricia ; Cho, Kyunghyun ; Engelhardt, Barbara_Elizabeth ; Sabato, Sivan ; Scarlett, Jonathan (Ed.)Realworld data can be multimodal distributed, e.g., data describing the opinion divergence in a community, the interspike interval distribution of neurons, and the oscillators' natural frequencies. Generating multimodal distributed realworld data has become a challenge to existing generative adversarial networks (GANs). For example, it is often observed that Neural SDEs have only demonstrated successful performance mainly in generating unimodal time series datasets. In this paper, we propose a novel time series generator, named directed chain GANs (DCGANs), which inserts a time series dataset (called a neighborhood process of the directed chain or input) into the drift and diffusion coefficients of the directed chain SDEs with distributional constraints. DCGANs can generate new time series of the same distribution as the neighborhood process, and the neighborhood process will provide the key step in learning and generating multimodal distributed time series. The proposed DCGANs are examined on four datasets, including two stochastic models from social sciences and computational neuroscience, and two realworld datasets on stock prices and energy consumption. To our best knowledge, DCGANs are the first work that can generate multimodal time series data and consistently outperforms stateoftheart benchmarks with respect to measures of distribution, data similarity, and predictive ability.more » « less

Krause, Andreas ; Brunskill, Emma ; Cho, Kyunghyun ; Engelhardt ; Barbara ; Sabato, Sivan ; Scarlett, Jonathan (Ed.)Robust Markov decision processes (MDPs) address the challenge of model uncertainty by optimizing the worstcase performance over an uncertainty set of MDPs. In this paper, we focus on the robust averagereward MDPs under the modelfree setting. We first theoretically characterize the structure of solutions to the robust averagereward Bellman equation, which is essential for our later convergence analysis. We then design two modelfree algorithms, robust relative value iteration (RVI) TD and robust RVI Qlearning, and theoretically prove their convergence to the optimal solution. We provide several widely used uncertainty sets as examples, including those def ined by the contamination model, total variation, Chisquared divergence, KullbackLeibler (KL) divergence and Wasserstein distance.more » « less