Search for: All records

Creators/Authors contains: "Shakkottai, Srinivas"

OpenGridGym: An Open-Source AI-Friendly Toolkit for Distribution Market Simulation

https://doi.org/10.1109/TSG.2022.3213240

El Helou, Rayan ; Lee, Kiyeob ; Wu, Dongqi ; Xie, Le ; Shakkottai, Srinivas ; Subramanian, Vijay (March 2023, IEEE Transactions on Smart Grid)

Full Text Available
Multi-Agent Learning via Markov Potential Games in Marketplaces for Distributed Energy Resources

https://doi.org/10.1109/CDC51059.2022.9992762

Narasimha, Dheeraj ; Lee, Kiyeob ; Kalathil, Dileep ; Shakkottai, Srinivas (December 2022, 2022 IEEE 61st Conference on Decision and Control (CDC))

Full Text Available
Enhanced Meta Reinforcement Learning via Demonstrations in Sparse Reward Environments

Rengarajan, Desik ; Chaudhary, Sapana ; Kim, Jaewon ; Kalathil, Dileep ; Shakkottai, Srinivas (December 2022, Advances in Neural Information Processing Systems)

Full Text Available
Enhanced Meta Reinforcement Learning using Demonstrations in Sparse Reward Environments

Rengarajan, Desik ; Chaudhary, Sapana ; Kim, Jaewon ; Kalathil, Dileep ; Shakkottai, Srinivas (November 2022, Advances in Neural Information Processing Systems)

Meta reinforcement learning (Meta-RL) is an approach wherein the experience gained from solving a variety of tasks is distilled into a meta-policy. The meta-policy, when adapted over only a small number of steps (or even a single step), is able to perform near-optimally on a new, related task. However, a major challenge to adopting this approach to solve real-world problems is that they are often associated with sparse reward functions that only indicate whether a task is completed partially or fully. We consider the situation where some data, possibly generated by a suboptimal agent, is available for each task. We then develop a class of algorithms entitled Enhanced Meta-RL using Demonstrations (EMRLD) that exploit this information, even if it is suboptimal, to obtain guidance during training. We show how EMRLD jointly utilizes RL and supervised learning over the offline data to generate a meta-policy that demonstrates monotone performance improvements. We also develop a warm-started variant called EMRLD-WS that is particularly efficient for suboptimal demonstration data. Finally, we show that our EMRLD algorithms significantly outperform existing approaches in a variety of sparse reward environments, including that of a mobile robot.
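One way to picture the joint use of RL and supervised learning mentioned in the abstract above is a weighted sum of an RL loss and a behavior-cloning loss computed on the demonstration data. The sketch below illustrates only that idea. The function names, the `bc_weight` parameter, and the weighted-sum form are assumptions made for illustration and are not taken from the EMRLD paper.

```python
import math

def behavior_cloning_loss(policy_probs, demo_actions):
    """Mean negative log-likelihood of the demonstrated actions.

    policy_probs[i] is the policy's action distribution in the i-th
    demonstration state; demo_actions[i] is the action the demonstrator took.
    """
    total = sum(math.log(probs[action])
                for probs, action in zip(policy_probs, demo_actions))
    return -total / len(demo_actions)

def combined_loss(rl_loss, policy_probs, demo_actions, bc_weight=0.5):
    """Hypothetical joint objective: RL loss plus weighted supervised loss."""
    return rl_loss + bc_weight * behavior_cloning_loss(policy_probs, demo_actions)

# Toy usage: two demonstration states, two actions each.
probs = [[0.5, 0.5], [0.25, 0.75]]
actions = [0, 1]
print(combined_loss(1.0, probs, actions))
```

A larger `bc_weight` pulls the policy toward the demonstrations, which may be undesirable when the demonstrator is far from optimal; how EMRLD actually balances the two terms is described in the paper itself.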
Full Text Available
DOPE: Doubly Optimistic and Pessimistic Exploration for Safe Reinforcement Learning

Bura, Archana ; HasanzadeZonuzy, Aria ; Kalathil, Dileep ; Shakkottai, Srinivas ; Chamberland, Jean-Francois (December 2022, Advances in Neural Information Processing Systems)

Full Text Available
DOPE: Doubly Optimistic and Pessimistic Exploration for Safe Reinforcement Learning

Bura, Archana ; HasanzadeZonuzy, Aria ; Kalathil, Dileep ; Shakkottai, Srinivas ; Chamberland, Jean-Francois (November 2022, Advances in Neural Information Processing Systems)

Safe reinforcement learning is extremely challenging: not only must the agent explore an unknown environment, it must do so while ensuring no safety constraint violations. We formulate this safe reinforcement learning (RL) problem using the framework of a finite-horizon Constrained Markov Decision Process (CMDP) with an unknown transition probability function, where we model the safety requirements as constraints on the expected cumulative costs that must be satisfied during all episodes of learning. We propose a model-based safe RL algorithm that we call Doubly Optimistic and Pessimistic Exploration (DOPE), and show that it achieves an objective regret of $\tilde{O}(|\mathcal{S}|\sqrt{|\mathcal{A}| K})$ without violating the safety constraints during learning, where $|\mathcal{S}|$ is the number of states, $|\mathcal{A}|$ is the number of actions, and $K$ is the number of learning episodes. Our key idea is to combine a reward bonus for exploration (optimism) with a conservative constraint (pessimism), in addition to the standard optimistic model-based exploration. DOPE not only improves the objective regret bound, but also shows a significant empirical performance improvement compared to earlier optimism-pessimism approaches.
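The optimism/pessimism idea in the abstract above can be illustrated with a count-based confidence radius: estimated rewards are inflated to encourage exploration, and estimated costs are inflated so that the safety constraint is checked conservatively. The sketch below is a minimal illustration under those assumptions. The `1/sqrt(visits)` radius, the `scale` parameter, and all function names are hypothetical and are not the bonus terms used by DOPE.

```python
import math

def confidence_radius(visits: int, scale: float = 1.0) -> float:
    # Assumed Hoeffding-style radius: shrinks as 1/sqrt(visit count).
    return scale / math.sqrt(max(visits, 1))

def optimistic_reward(mean_reward: float, visits: int, scale: float = 1.0) -> float:
    # Optimism: add a bonus to the estimated reward of rarely visited pairs.
    return mean_reward + confidence_radius(visits, scale)

def pessimistic_cost(mean_cost: float, visits: int, scale: float = 1.0) -> float:
    # Pessimism: inflate the estimated cost, which tightens the constraint.
    return mean_cost + confidence_radius(visits, scale)

def within_budget(mean_cost: float, visits: int, budget: float,
                  scale: float = 1.0) -> bool:
    # Conservative safety check against a cost budget.
    return pessimistic_cost(mean_cost, visits, scale) <= budget

# A well-explored pair passes the check; a barely explored one does not.
print(within_budget(0.2, visits=100, budget=0.5))
print(within_budget(0.2, visits=1, budget=0.5))
```

As visit counts grow, both adjustments vanish, so the conservative check approaches a check on the estimated cost alone.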
Full Text Available
Energy system digitization in the era of AI: A three-layered approach toward carbon neutrality

https://doi.org/10.1016/j.patter.2022.100640

Xie, Le ; Huang, Tong ; Zheng, Xiangtian ; Liu, Yan ; Wang, Mengdi ; Vittal, Vijay ; Kumar, P.R. ; Shakkottai, Srinivas ; Cui, Yi (December 2022, Patterns)

Full Text Available
Targeted demand response for mitigating price volatility and enhancing grid reliability in synthetic Texas electricity markets

https://doi.org/10.1016/j.isci.2021.103723

Lee, Kiyeob ; Geng, Xinbo ; Sivaranjani, S. ; Xia, Bainan ; Ming, Hao ; Shakkottai, Srinivas ; Xie, Le (February 2022, iScience)

Full Text Available
Model-Based Reinforcement Learning for Infinite-Horizon Discounted Constrained Markov Decision Processes

https://doi.org/10.24963/ijcai.2021/347

HasanzadeZonuzy, Aria ; Kalathil, Dileep ; Shakkottai, Srinivas (August 2021, IJCAI 2021)

In many real-world reinforcement learning (RL) problems, in addition to maximizing the objective, the learning agent has to maintain some necessary safety constraints. We formulate the problem of learning a safe policy as an infinite-horizon discounted Constrained Markov Decision Process (CMDP) with an unknown transition probability matrix, where the safety requirements are modeled as constraints on expected cumulative costs. We propose two model-based constrained reinforcement learning (CRL) algorithms for learning a safe policy, namely, (i) the GM-CRL algorithm, where the algorithm has access to a generative model, and (ii) the UC-CRL algorithm, where the algorithm learns the model using an upper-confidence-style online exploration method. We characterize the sample complexity of these algorithms, i.e., the number of samples needed to ensure a desired level of accuracy with high probability, both with respect to objective maximization and constraint satisfaction.
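For orientation, an infinite-horizon discounted CMDP of the kind the abstract above describes is commonly written as the constrained optimization below. The symbols (policy $\pi$, discount factor $\gamma$, reward $r$, cost functions $c_i$, thresholds $\bar{C}_i$, and the number of constraints $N$) and the $\le$ direction of the constraints are notational assumptions made here, not taken from the paper.

```latex
\begin{aligned}
\max_{\pi} \quad & \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right] \\
\text{s.t.} \quad & \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t)\right] \le \bar{C}_i,
\qquad i = 1, \dots, N .
\end{aligned}
```

Under this reading, the sample complexity question is how many samples are needed so that a learned policy is near-optimal in the objective and near-feasible in every constraint with high probability.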

Full Text Available
Model-Based Reinforcement Learning for Infinite-Horizon Discounted Constrained Markov Decision Processes

HasanzadeZonuzy, Aria ; Kalathil, Dileep ; Shakkottai, Srinivas (August 2021, International Joint Conference on Artificial Intelligence (IJCAI))

In many real-world reinforcement learning (RL) problems, in addition to maximizing the objective, the learning agent has to maintain some necessary safety constraints. We formulate the problem of learning a safe policy as an infinite-horizon discounted Constrained Markov Decision Process (CMDP) with an unknown transition probability matrix, where the safety requirements are modeled as constraints on expected cumulative costs. We propose two model-based constrained reinforcement learning (CRL) algorithms for learning a safe policy, namely, (i) the GM-CRL algorithm, where the algorithm has access to a generative model, and (ii) the UC-CRL algorithm, where the algorithm learns the model using an upper-confidence-style online exploration method. We characterize the sample complexity of these algorithms, i.e., the number of samples needed to ensure a desired level of accuracy with high probability, both with respect to objective maximization and constraint satisfaction.
Full Text Available