NSF PAR Search | NSF Public Access Repository

Bandits with Stochastic Experts: Constant Regret, Empirical Experts and Episodes

https://doi.org/10.1145/3680279

Sharma, Nihal; Sen, Rajat; Basu, Soumya; Shanmugam, Karthikeyan; Shakkottai, Sanjay (September 2024, ACM Transactions on Modeling and Performance Evaluation of Computing Systems)

We study a variant of the contextual bandit problem where an agent can intervene through a set of stochastic expert policies. Given a fixed context, each expert samples actions from a fixed conditional distribution. The agent seeks to remain competitive with the “best” among the given set of experts. We propose the Divergence-based Upper Confidence Bound (D-UCB) algorithm that uses importance sampling to share information across experts and provide horizon-independent constant regret bounds that only scale linearly in the number of experts. We also provide the Empirical D-UCB (ED-UCB) algorithm that can function with only approximate knowledge of expert distributions. Further, we investigate the episodic setting where the agent interacts with an environment that changes over episodes. Each episode can have different context and reward distributions, resulting in the best expert changing across episodes. We show that by bootstrapping from \(\mathcal{O}(N\log(NT^2\sqrt{E}))\) samples, ED-UCB guarantees a regret that scales as \(\mathcal{O}(E(N+1) + \frac{N\sqrt{E}}{T^2})\) for \(N\) experts over \(E\) episodes, each of length \(T\). We finally empirically validate our findings through simulations.
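The information sharing described in the abstract can be illustrated with a plain importance-weighted estimator: rewards collected while following one expert are reweighted by the ratio of action probabilities to estimate the mean reward of another expert. The sketch below is only that basic idea, not the D-UCB estimator from the paper; the toy problem (two contexts, two actions, the experts `expert_a` and `expert_b`, and the table `mean_reward`) and all function names are assumptions made for illustration.

```python
import random

def sample_categorical(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    u, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if u < acc:
            return i
    return len(probs) - 1

def importance_weighted_mean(samples, target, behavior):
    """Estimate the target expert's mean reward from samples drawn under the behavior expert.

    samples: list of (context, action, reward) tuples.
    target, behavior: expert[context][action] = probability of the action given the context.
    """
    total = 0.0
    for x, v, y in samples:
        total += y * target[x][v] / behavior[x][v]
    return total / len(samples)

# Hypothetical toy problem: 2 contexts, 2 actions, 2 experts.
context_probs = [0.5, 0.5]
expert_a = [[0.7, 0.3], [0.4, 0.6]]
expert_b = [[0.2, 0.8], [0.5, 0.5]]
mean_reward = [[0.9, 0.1], [0.3, 0.6]]  # E[reward | context, action]

def true_value(expert):
    """Exact mean reward of an expert in the toy problem."""
    return sum(
        context_probs[x] * expert[x][v] * mean_reward[x][v]
        for x in range(2)
        for v in range(2)
    )

# Collect Bernoulli rewards while following expert_a only.
rng = random.Random(0)
samples = []
for _ in range(20000):
    x = sample_categorical(context_probs, rng)
    v = sample_categorical(expert_a[x], rng)
    y = 1.0 if rng.random() < mean_reward[x][v] else 0.0
    samples.append((x, v, y))

# Estimate expert_b's mean reward without ever playing expert_b.
estimate_b = importance_weighted_mean(samples, expert_b, expert_a)
print(estimate_b, true_value(expert_b))
```

The estimate is unbiased whenever the behavior expert puts positive probability on every action the target expert can take, but its variance grows with the mismatch between the two experts.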
Full Text Available
Auto-Tuning for Cellular Scheduling Through Bandit-Learning and Low-Dimensional Clustering

https://doi.org/10.1109/TNET.2021.3077455

Tariq, Isfar; Sen, Rajat; Novlan, Thomas; Akoum, Salam; Majmundar, Milap; de Veciana, Gustavo; Shakkottai, Sanjay (October 2021, IEEE/ACM Transactions on Networking)

Full Text Available
Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions

Faw, Matthew; Sen, Rajat; Shanmugam, Karthikeyan; Caramanis, Constantine; Shakkottai, Sanjay (January 2020, 34th Conference on Neural Information Processing Systems (NeurIPS 2020))
Full Text Available
Online Channel-state Clustering And Multiuser Capacity Learning For Wireless Scheduling

https://doi.org/10.1109/INFOCOM.2019.8737425

Tariq, Isfar; Sen, Rajat; de Veciana, Gustavo; Shakkottai, Sanjay (April 2019, IEEE INFOCOM 2019 - IEEE Conference on Computer Communications)

Full Text Available
Blocking Bandits

Basu, Soumya; Sen, Rajat; Sanghavi, Sujay; Shakkottai, Sanjay (January 2019, 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.)

Full Text Available
Model-Powered Conditional Independence Test

Sen, Rajat; Suresh, Ananda Theertha; Shanmugam, Karthikeyan; Dimakis, Alexandros; Shakkottai, Sanjay (January 2017, Advances in neural information processing systems)

We consider the problem of non-parametric Conditional Independence testing (CI testing) for continuous random variables. Given i.i.d. samples from the joint distribution f(x, y, z) of continuous random vectors X, Y and Z, we determine whether X is independent of Y given Z. We approach this by converting the conditional independence test into a classification problem. This allows us to harness very powerful classifiers like gradient-boosted trees and deep neural networks. These models can handle complex probability distributions and allow us to perform significantly better than the prior state of the art for high-dimensional CI testing. The main technical challenge in the classification problem is the need for samples from the conditional product distribution f^{CI}(x, y, z) = f(x|z)f(y|z)f(z), which equals the joint distribution if and only if X is independent of Y given Z, when given access only to i.i.d. samples from the true joint distribution f(x, y, z). To tackle this problem we propose a novel nearest-neighbor bootstrap procedure and theoretically show that our generated samples are indeed close to f^{CI} in terms of total variation distance. We then develop theoretical results regarding the generalization bounds for classification for our problem, which translate into error bounds for CI testing. We provide a novel analysis of Rademacher-type classification bounds in the presence of non-i.i.d. near-independent samples. We empirically validate the performance of our algorithm on simulated and real datasets and show performance gains over previous methods.
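A rough sketch of the nearest-neighbor bootstrap idea mentioned in the abstract: to mimic a draw from f(x|z)f(y|z)f(z), keep x and z from one sample and borrow y from a different sample whose z is closest. The two-way split, the scalar z, the brute-force neighbor search, the toy data, and the name `nn_bootstrap` are all assumptions made for illustration; the paper's exact procedure may differ.

```python
import random

def nn_bootstrap(samples):
    """Nearest-neighbor bootstrap sketch for scalar z.

    samples: list of (x, y, z) tuples drawn i.i.d. from the joint distribution.
    Returns triples (x, y', z), where y' is borrowed from the sample in the
    second half whose z is closest, approximating draws from f(x|z) f(y|z) f(z).
    """
    half = len(samples) // 2
    first, second = samples[:half], samples[half:]
    out = []
    for x, _, z in first:
        # Brute-force nearest neighbor in z among the held-out half.
        _, y_nn, _ = min(second, key=lambda s: abs(s[2] - z))
        out.append((x, y_nn, z))
    return out

# Hypothetical toy data: x and y both depend on z plus independent noise.
rng = random.Random(1)
samples = []
for _ in range(400):
    z = rng.gauss(0.0, 1.0)
    x = z + rng.gauss(0.0, 0.5)
    y = z + rng.gauss(0.0, 0.5)
    samples.append((x, y, z))

bootstrapped = nn_bootstrap(samples)
print(len(bootstrapped), bootstrapped[0])
```

In the classification view described above, original triples and bootstrapped triples would then be given different labels, and a classifier that separates them well is evidence against conditional independence.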
Full Text Available