NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Scheduling Flexible Nonpreemptive Loads in Smart-Grid Networks

https://doi.org/10.1109/TCNS.2022.3141017

Dahlin, Nathan; Jain, Rahul (March 2022, IEEE Transactions on Control of Network Systems)

Full Text Available
A Sample-Efficient Algorithm for Episodic Finite-Horizon MDP with Constraints

Krishna C. Kalagarla, Rahul Jain (January 2021, Proceedings of the AAAI Conference on Artificial Intelligence)

Full Text Available
Finite Time Guarantees for Continuous State MDPs with Generative Model

https://doi.org/10.1109/CDC42340.2020.9303840

Sharma, Hiteshi; Jain, Rahul (December 2020, 2020 59th IEEE Conference on Decision and Control (CDC))
null (Ed.)
Full Text Available
Empirical Q-Value Iteration

https://doi.org/10.1287/stsy.2019.0062

Kalathil, Dileep; Borkar, Vivek S.; Jain, Rahul (October 2020, Stochastic Systems)
null (Ed.)
We propose a new simple and natural algorithm for learning the optimal Q-value function of a discounted-cost Markov decision process (MDP) when the transition kernels are unknown. Unlike the classical learning algorithms for MDPs, such as Q-learning and actor-critic algorithms, this algorithm does not depend on a stochastic approximation-based method. We show that our algorithm, which we call the empirical Q-value iteration algorithm, converges to the optimal Q-value function. We also give a rate of convergence or a nonasymptotic sample complexity bound and show that an asynchronous (or online) version of the algorithm will also work. Preliminary experimental results suggest a faster rate of convergence to a ballpark estimate for our algorithm compared with stochastic approximation-based algorithms.
more » « less
Full Text Available
Posterior Sampling-Based Reinforcement Learning for Control of Unknown Linear Systems

https://doi.org/10.1109/TAC.2019.2950156

Ouyang, Yi; Gagrani, Mukul; Jain, Rahul (August 2020, IEEE Transactions on Automatic Control)

Full Text Available
Non-indexability of the stochastic appointment scheduling problem

https://doi.org/10.1016/j.automatica.2020.109016

Jafarnia-Jahromi, Mehdi; Jain, Rahul (August 2020, Automatica)

Full Text Available
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes”, ICML (Int'l Conf. on Machine Learning) 2020.

Chen-Yu Wei, Mehdi Jafarnia-Jahromi (January 2020, Proceedings of Machine Learning Research)

Model-free reinforcement learning is known to be memory and computation efficient and more amendable to large scale problems. In this paper, two model-free algorithms are introduced for learning infinite-horizon average-reward Markov Decision Processes (MDPs). The first algorithm reduces the problem to the discounted-reward version and achieves O(𝑇^2/3) regret after 𝑇 steps, under the minimal assumption of weakly communicating MDPs. To our knowledge, this is the first model-free algorithm for general MDPs in this setting. The second algorithm makes use of recent advances in adaptive algorithms for adversarial multi-armed bandits and improves the regret to O(T^{1/2}), albeit with a stronger ergodic assumption. This result significantly improves over the O(T^{3/4}) regret achieved by the only existing model-free algorithm by Abbasi-Yadkori et al. (2019) for ergodic MDPs in the infinite-horizon average-reward setting.
more » « less
Full Text Available
Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation

Chen-Yu Wei, Mehdi Jafarnia (January 2020, Proceedings of Machine Learning Research)

Full Text Available
Empirical Algorithms for General Stochastic Systems with Continuous States and Actions

https://doi.org/10.1109/CDC40024.2019.9029308

Sharma, Hiteshi; Jain, Rahul; Haskell, William (December 2019, Proc. IEEE Control and Decision Conference)

Full Text Available
Approximate Relative Value Learning for Average-reward Continuous State MDPs

Sharma, Hiteshi; Jafarnia-Jahromi, Mehdi; Jain, Rahul (July 2019, Proceedings UAI)

In this paper, we propose an approximate rela- tive value learning (ARVL) algorithm for non- parametric MDPs with continuous state space and finite actions and average reward criterion. It is a sampling based algorithm combined with kernel density estimation and function approx- imation via nearest neighbors. The theoreti- cal analysis is done via a random contraction operator framework and stochastic dominance argument. This is the first such algorithm for continuous state space MDPs with average re- ward criteria with these provable properties which does not require any discretization of state space as far as we know. We then eval- uate the proposed algorithm on a benchmark problem numerically.
more » « less
Full Text Available

« Prev Next »

Search for: All records