Restless multi-armed bandits (RMAB) have been widely used to model sequential decision-making problems with constraints. The decision maker (DM) aims to maximize the expected total reward over an infinite horizon under an “instantaneous activation constraint” that at most B arms can be activated at any decision epoch, where the state of each arm evolves stochastically according to a Markov decision process (MDP). However, this basic model fails to provide any fairness guarantee among arms. In this paper, we introduce RMAB-F, a new RMAB model with “long-term fairness constraints”, where the objective is to maximize the long-term reward while satisfying a minimum long-term activation fraction for each arm. For the online RMAB-F setting (i.e., when the underlying MDPs associated with the arms are unknown to the DM), we develop a novel reinforcement learning (RL) algorithm named Fair-UCRL. We prove that Fair-UCRL ensures probabilistic sublinear bounds on both the reward regret and the fairness violation regret. Compared with off-the-shelf RL methods, Fair-UCRL is much more computationally efficient since its exploitation step leverages a low-complexity index policy for making decisions. Experimental results further demonstrate the effectiveness of Fair-UCRL.
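The long-term fairness constraint above can be pictured with a small sketch: at each epoch the DM activates at most B arms, but arms whose empirical activation fraction has fallen below their required minimum are prioritized. The Python snippet below is only an illustration of such a fairness-aware selection step; the function name, the `boost` heuristic, and all parameters are our assumptions rather than Fair-UCRL itself, which additionally maintains UCRL-style confidence sets over the unknown MDPs.

```python
import numpy as np

def fair_select(indices, activation_counts, t, B, eta, boost=1e6):
    """Pick at most B arms to activate at decision epoch t (illustrative sketch only).

    indices           : per-arm priority scores, e.g. Whittle-style indices
    activation_counts : how many times each arm has been activated so far
    eta               : minimum long-term activation fraction required for every arm
    boost             : large bonus given to arms that are falling behind their quota
    """
    indices = np.asarray(indices, dtype=float)
    frac = np.asarray(activation_counts, dtype=float) / max(t, 1)  # empirical activation fractions
    behind = frac < eta                          # arms currently below their fairness quota
    scores = indices + boost * behind            # prioritize arms that are behind
    return np.argsort(-scores)[:B]               # activate the B highest-scoring arms

# toy usage: 5 arms, budget B = 2, each arm must be active at least 20% of the time
print(fair_select([0.8, 0.1, 0.5, 0.3, 0.05], [4, 0, 3, 2, 1], t=10, B=2, eta=0.2))
```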
Contextual Restless Multi-Armed Bandits with Application to Demand Response Decision-Making
This paper introduces a novel multi-armed bandit framework, termed Contextual Restless Bandits (CRB), for complex online decision-making. The CRB framework incorporates the core features of contextual bandits and restless bandits, so that it can model both the internal state transitions of each arm and the influence of external global environmental contexts. Using the dual decomposition method, we develop a scalable index policy algorithm for solving the CRB problem and theoretically analyze the asymptotic optimality of this algorithm. When the arm models are unknown, we further propose a model-based online learning algorithm based on the index policy to learn the arm models and make decisions simultaneously. Furthermore, we apply the proposed CRB framework and the index policy algorithm to the demand response decision-making problem in smart grids. Numerical simulations demonstrate the performance and efficiency of our proposed CRB approaches.
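As a rough illustration of how an index policy acts in the CRB setting, the sketch below scores every arm with an index that depends on both the observed global context and the arm's internal state, then activates the B highest-scoring arms. The `index_fn` argument stands in for the index derived from the paper's dual decomposition, which we treat as a black box; everything here is a hypothetical sketch, not the authors' implementation.

```python
import numpy as np

def crb_index_policy(context, arm_states, index_fn, B):
    """One decision epoch of a contextual restless-bandit index policy (illustrative sketch only).

    context    : the observed global environmental context for this epoch
    arm_states : list of per-arm internal states
    index_fn   : maps (context, state) -> scalar index; in the paper this role is played by
                 the index obtained from the dual decomposition, treated here as a black box
    B          : activation budget per epoch
    """
    indices = np.array([index_fn(context, s) for s in arm_states])
    return np.argsort(-indices)[:B]   # activate the B arms with the largest indices

# toy usage: the index grows with a price-like context and with the arm's state
print(crb_index_policy(0.7, [0, 2, 1, 3], lambda c, s: c * s, B=2))
```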
- PAR ID: 10600089
- Publisher / Repository: IEEE
- Date Published:
- ISSN: 2576-2370
- ISBN: 979-8-3503-1633-9
- Page Range / eLocation ID: 2652–2657
- Format(s): Medium: X
- Location: Milan, Italy
- Sponsoring Org: National Science Foundation
More Like this
-
Restless multi-armed bandits (RMAB) have been widely used to model constrained sequential decision-making problems, where the state of each restless arm evolves according to a Markov chain and each state transition generates a scalar reward. However, the success of RMAB crucially relies on the availability and quality of reward signals. Unfortunately, specifying an exact reward function in practice can be challenging and even infeasible. In this paper, we introduce Pref-RMAB, a new RMAB model in the presence of preference signals, where the decision maker only observes pairwise preference feedback rather than scalar rewards from the activated arms at each decision epoch. Preference feedback, however, arguably contains less information than the scalar reward, which makes Pref-RMAB seemingly more difficult. To address this challenge, we present a direct online preference learning (DOPL) algorithm for Pref-RMAB to efficiently explore the unknown environments, adaptively collect preference data in an online manner, and directly leverage the preference feedback for decision-making. We prove that DOPL yields a sublinear regret. To the best of our knowledge, this is the first algorithm to ensure $$\tilde{\mathcal{O}}(\sqrt{T\ln T})$$ regret for RMAB with preference feedback. Experimental results further demonstrate the effectiveness of DOPL.
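One way to picture learning from pairwise preference feedback is to keep win/comparison counts for every pair of arms and form optimistic win-rate scores. The sketch below does exactly that; it is a rough UCB-style analogue for intuition only, and the counts, the bonus term, and the Borda-like averaging are our assumptions rather than the actual DOPL construction, which converts preferences directly into reward-difference estimates.

```python
import numpy as np

def optimistic_preference_scores(wins, comparisons, t, c=1.0):
    """Optimistic Borda-like scores from pairwise preference counts (illustrative sketch only).

    wins[i, j]        : times arm i was preferred over arm j so far
    comparisons[i, j] : times the pair (i, j) has been compared
    Returns one score per arm: its average empirical win rate against the other arms,
    plus a UCB-style bonus for rarely compared pairs.
    """
    n = np.maximum(comparisons, 1)
    p_hat = wins / n                                   # empirical win probabilities
    bonus = c * np.sqrt(np.log(max(t, 2)) / n)         # optimism for rarely compared pairs
    optimistic = p_hat + bonus
    np.fill_diagonal(optimistic, 0.0)                  # ignore self-comparisons
    return optimistic.sum(axis=1) / (optimistic.shape[0] - 1)

wins = np.array([[0, 3, 5], [1, 0, 2], [0, 1, 0]], dtype=float)
comps = np.array([[0, 4, 6], [4, 0, 3], [6, 3, 0]], dtype=float)
print(optimistic_preference_scores(wins, comps, t=13))
```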
-
Koyejo, S.; Mohamed, S.; Agarwal, A.; Belgrave, D.; Cho, K.; Oh, A. (Eds.) In the stochastic contextual bandit setting, regret-minimizing algorithms have been extensively researched, but their instance-minimizing best-arm identification counterparts remain seldom studied. In this work, we focus on the stochastic bandit problem in the (ε, δ)-PAC setting: given a policy class Π, the goal of the learner is to return a policy π ∈ Π whose expected reward is within ε of the optimal policy with probability greater than 1 − δ. We characterize the first instance-dependent PAC sample complexity of contextual bandits through a quantity $$\rho_\Pi$$, and provide matching upper and lower bounds in terms of $$\rho_\Pi$$ for the agnostic and linear contextual best-arm identification settings. We show that no algorithm can be simultaneously minimax-optimal for regret minimization and instance-dependent PAC for best-arm identification. Our main result is a new instance-optimal and computationally efficient algorithm that relies on a polynomial number of calls to an argmax oracle.
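Written out, the (ε, δ)-PAC requirement described above is the following guarantee on the returned policy; the value notation $$V(\pi)$$ for a policy's expected reward is ours, not the paper's:

```latex
% (\epsilon, \delta)-PAC criterion over a policy class \Pi:
% the learner must return a policy \hat{\pi} \in \Pi such that
\Pr\left[ V(\hat{\pi}) \ge \max_{\pi \in \Pi} V(\pi) - \epsilon \right] \ge 1 - \delta ,
% where V(\pi) denotes the expected reward of policy \pi.
```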
-
We study adaptive video streaming for multiple users in wireless access edge networks with unreliable channels. The key challenge is to jointly optimize the video bitrate adaptation and resource allocation such that the users' cumulative quality of experience is maximized. This problem is a finite-horizon restless multi-armed multi-action bandit problem and is provably hard to solve. To overcome this challenge, we propose a computationally appealing index policy, entitled the Quality Index Policy, which is well-defined without the Whittle indexability condition and is provably asymptotically optimal without the global attractor condition. These two conditions are widely needed in the design of most existing index policies but are difficult to establish in general. Since the wireless access edge network environment is highly dynamic, with system parameters unknown and time-varying, we further develop an index-aware reinforcement learning (RL) algorithm dubbed QA-UCB. We show that QA-UCB achieves sub-linear regret with low complexity since it fully exploits the structure of the Quality Index Policy for making decisions. Extensive simulations using real-world traces demonstrate significant gains of the proposed policies over conventional approaches. We note that the proposed framework for designing the index policy and the index-aware RL algorithm is of independent interest and could be useful for other large-scale multi-user problems.
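A simplified picture of an index-aware learning step: take the index values computed under the current model estimate, add a UCB-style bonus for under-explored users, and schedule the top-B users. The sketch below is only that simplified picture; the function name, the bonus form, and the inputs are hypothetical, and the real QA-UCB instead builds confidence sets over the unknown dynamics and re-solves the Quality Index Policy within them.

```python
import numpy as np

def qa_ucb_select(est_quality_index, visit_counts, t, B, c=1.0):
    """Index-aware optimistic selection for one scheduling slot (illustrative sketch only).

    est_quality_index : per-user index values computed under the current model estimate
    visit_counts      : how often each user's current state-action pair has been observed
    B                 : number of users that can be scheduled in this slot
    """
    n = np.maximum(np.asarray(visit_counts, dtype=float), 1.0)
    optimistic = np.asarray(est_quality_index, dtype=float) + c * np.sqrt(np.log(max(t, 2)) / n)
    return np.argsort(-optimistic)[:B]   # schedule the B users with the largest optimistic indices

# toy usage: 4 users, 2 can be scheduled per slot
print(qa_ucb_select([0.4, 0.9, 0.2, 0.6], [10, 2, 1, 7], t=20, B=2))
```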
-
We study the deadline scheduling problem for multiple deferrable jobs that arrive in a random manner and are to be processed before individual deadlines. The processing of the jobs is subject to a time-varying limit on the total processing rate at each stage. We formulate the scheduling problem as a restless multi-armed bandit (RMAB) problem. Relaxing the scheduling problem into multiple independent single-arm scheduling problems, we define the Lagrangian priority value as the greatest tax under which it is optimal to activate the arm, and establish the asymptotic optimality of the proposed Lagrangian priority policy for large systems. Numerical results show that the proposed Lagrangian priority policy achieves 22%–49% higher average reward than the classical Whittle index policy (which does not take into account the processing rate limits).
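The Lagrangian priority value described above (the greatest tax under which activating an arm remains optimal) can be found numerically whenever the activation advantage is non-increasing in the tax. The bisection sketch below illustrates this; `activation_gain` is a placeholder for the solution of the single-arm scheduling problem and is not part of the paper.

```python
def lagrangian_priority(activation_gain, state, lo=0.0, hi=100.0, tol=1e-6):
    """Greatest tax under which activating the arm is still optimal (illustrative sketch only).

    activation_gain(state, tax) should return the advantage of activating over idling when
    every activation is charged `tax`; in the paper this comes from solving a single-arm
    deadline-scheduling problem, which is treated here as a black box. Assuming the gain is
    non-increasing in the tax, bisection recovers the Lagrangian priority value.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if activation_gain(state, mid) >= 0:
            lo = mid        # still worth activating: the priority value is at least mid
        else:
            hi = mid        # too expensive: the priority value is below mid
        if hi - lo < tol:
            break
    return lo

# toy one-step example: remaining work w earns reward 2 per unit, so the gain is 2*w - tax
print(lagrangian_priority(lambda w, tax: 2 * w - tax, state=3))   # approximately 6.0
```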