Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
This paper introduces a novel multi-armed bandits framework, termed Contextual Restless Bandits (CRB), for complex online decision-making. This CRB framework incorporates the core features of contextual bandits and restless bandits, so that it can model both the internal state transitions of each arm and the influence of external global environmental contexts. Using the dual decomposition method, we develop a scalable index policy algorithm for solving the CRB problem, and theoretically analyze the asymptotical optimality of this algorithm. In the case when the arm models are unknown, we further propose a model-based online learning algorithm based on the index policy to learn the arm models and make decisions simultaneously. Furthermore, we apply the proposed CRB framework and the index policy algorithm specifically to the demand response decision-making problem in smart grids. The numerical simulations demonstrate the performance and efficiency of our proposed CRB approaches.more » « lessFree, publicly-accessible full text available December 16, 2025
-
This paper introduces a new theoretical framework for optimizing second-order behaviors of wireless networks. Unlike existing techniques for network utility maximization, which only consider first-order statistics, this framework models every random process by its mean and temporal variance. The inclusion of temporal variance makes this framework well-suited for modeling Markovian fading wireless channels and emerging network performance metrics such as age-of-information (AoI) and timely-throughput. Using this framework, we sharply characterize the second-order capacity region of wireless access networks. We also propose a simple scheduling policy and prove that it can achieve every interior point in the second-order capacity region. To demonstrate the utility of this framework, we apply it to an unsolved network optimization problem where some clients wish to minimize AoI while others wish to maximize timely-throughput. We show that this framework accurately characterizes AoI and timely-throughput. Moreover, it leads to a tractable scheduling policy that outperforms other existing work.more » « lessFree, publicly-accessible full text available December 1, 2025
-
This paper introduces a general framework to analyze and optimize age-of-information (AoI) in CSMA protocols for distributed uplink transmissions. The proposed framework combines two theoretical approaches. First, it employs second-order analysis that characterizes all random processes by their respective means and temporal variances and approximates AoI as a function of the mean and temporal variance of the packet delivery process. Second, it employs mean-field approximation to derive the mean and temporal variance of the packet delivery process for one node in the presence of interference from others. To demonstrate the utility of this framework, this paper applies it to the age-threshold ALOHA policy and identifies parameter settings that outperform those previously suggested as optimal in the original work that introduced this policy. Simulation results demonstrate that our framework provides precise AoI approximations and achieves significantly better performance, even in networks with a small number of users.more » « less
-
This paper studies multi-stage systems with end-to-end bandit feedback. In such systems, each job needs to go through multiple stages, each managed by a different agent, before generating an outcome. Each agent can only control its own action and learn the final outcome of the job. It has neither knowledge nor control on actions taken by agents in the next stage. The goal of this paper is to develop distributed online learning algorithms that achieve sublinear regret in adversarial environments. The setting of this paper significantly expands the traditional multi-armed bandit problem, which considers only one agent and one stage. In addition to the exploration-exploitation dilemma in the traditional multi-armed bandit problem, we show that the consideration of multiple stages introduces a third component, education, where an agent needs to choose its actions to facilitate the learning of agents in the next stage. To solve this newly introduced exploration-exploitation-education trilemma, we propose a simple distributed online learning algorithm, ϵ-EXP3. We theoretically prove that the ϵ-EXP3 algorithm is a no-regret policy that achieves sublinear regret. Simulation results show that the ϵ-EXP3 algorithm significantly outperforms existing no-regret online learning algorithms for the traditional multi-armed bandit problem.more » « less
-
Scheduling in multi-channel wireless communication system presents formidable challenges in effectively allocating resources. To address these challenges, we investigate a multi-resource restless matching bandit (MR-RMB) model for heterogeneous resource systems with an objective of maximizing long-term discounted total rewards while respecting resource constraints. We have also generalized to applications beyond multi-channel wireless. We discuss the Max-Weight Index Matching algorithm, which optimizes resource allocation based on learned partial indexes. We have derived the policy gradient theorem for index learning. Our main contribution is the introduction of a new Deep Index Policy (DIP), an online learning algorithm tailored for MR-RMB. DIP learns the partial index by leveraging the policy gradient theorem for restless arms with convoluted and unknown transition kernels of heterogeneous resources. We demonstrate the utility of DIP by evaluating its performance for three different MR-RMB problems. Our simulation results show that DIP indeed learns the partial indexes efficiently.more » « less
An official website of the United States government
