NSF PAR Search | NSF Public Access Repository


Contextual Restless Multi-Armed Bandits with Application to Demand Response Decision-Making

https://doi.org/10.1109/CDC56724.2024.10886713

Chen, Xin; Hou, I-Hong (December 2024, IEEE)

This paper introduces a novel multi-armed bandit framework, termed Contextual Restless Bandits (CRB), for complex online decision-making. The CRB framework combines the core features of contextual bandits and restless bandits, so it can model both the internal state transitions of each arm and the influence of external global environmental contexts. Using the dual decomposition method, we develop a scalable index policy algorithm for solving the CRB problem and theoretically analyze its asymptotic optimality. When the arm models are unknown, we further propose a model-based online learning algorithm, built on the index policy, that learns the arm models and makes decisions simultaneously. Furthermore, we apply the proposed CRB framework and the index policy algorithm to the demand response decision-making problem in smart grids. Numerical simulations demonstrate the performance and efficiency of the proposed CRB approaches.
Full Text Available
AoI, Timely-Throughput, and Beyond: A Theory of Second-Order Wireless Network Optimization

https://doi.org/10.1109/TNET.2024.3432655

Guo, Daojing; Nakhleh, Khaled; Hou, I-Hong; Kompella, Sastry; Kam, Clement (December 2024, IEEE/ACM Transactions on Networking)

This paper introduces a new theoretical framework for optimizing second-order behaviors of wireless networks. Unlike existing techniques for network utility maximization, which only consider first-order statistics, this framework models every random process by its mean and temporal variance. The inclusion of temporal variance makes this framework well-suited for modeling Markovian fading wireless channels and emerging network performance metrics such as age-of-information (AoI) and timely-throughput. Using this framework, we sharply characterize the second-order capacity region of wireless access networks. We also propose a simple scheduling policy and prove that it can achieve every interior point in the second-order capacity region. To demonstrate the utility of this framework, we apply it to an unsolved network optimization problem where some clients wish to minimize AoI while others wish to maximize timely-throughput. We show that this framework accurately characterizes AoI and timely-throughput. Moreover, it leads to a tractable scheduling policy that outperforms other existing work.
Full Text Available
Second-Order Analysis of CSMA Protocols for Age-of-Information Minimization

https://doi.org/10.1109/IEEECONF60004.2024.10943050

Fan, Siqi; Hou, I-Hong (October 2024, IEEE)

This paper introduces a general framework to analyze and optimize age-of-information (AoI) in CSMA protocols for distributed uplink transmissions. The proposed framework combines two theoretical approaches. First, it employs second-order analysis that characterizes all random processes by their respective means and temporal variances and approximates AoI as a function of the mean and temporal variance of the packet delivery process. Second, it employs mean-field approximation to derive the mean and temporal variance of the packet delivery process for one node in the presence of interference from others. To demonstrate the utility of this framework, this paper applies it to the age-threshold ALOHA policy and identifies parameter settings that outperform those previously suggested as optimal in the original work that introduced this policy. Simulation results demonstrate that our framework provides precise AoI approximations and achieves significantly better performance, even in networks with a small number of users.
Full Text Available
Distributed No-Regret Learning for Multi-Stage Systems with End-to-End Bandit Feedback

https://doi.org/10.1145/3641512.3686369

Hou, I-Hong (October 2024, ACM)

This paper studies multi-stage systems with end-to-end bandit feedback. In such systems, each job needs to go through multiple stages, each managed by a different agent, before generating an outcome. Each agent can only control its own action and learn the final outcome of the job. It has neither knowledge of nor control over the actions taken by agents in the next stage. The goal of this paper is to develop distributed online learning algorithms that achieve sublinear regret in adversarial environments. The setting of this paper significantly expands the traditional multi-armed bandit problem, which considers only one agent and one stage. In addition to the exploration-exploitation dilemma in the traditional multi-armed bandit problem, we show that the consideration of multiple stages introduces a third component, education, where an agent needs to choose its actions to facilitate the learning of agents in the next stage. To solve this newly introduced exploration-exploitation-education trilemma, we propose a simple distributed online learning algorithm, ϵ-EXP3. We theoretically prove that the ϵ-EXP3 algorithm is a no-regret policy that achieves sublinear regret. Simulation results show that the ϵ-EXP3 algorithm significantly outperforms existing no-regret online learning algorithms for the traditional multi-armed bandit problem.
Full Text Available
Deep Index Policy for Multi-Resource Restless Matching Bandit and Its Application in Multi-Channel Scheduling

https://doi.org/10.1145/3641512.3686381

Zamir, Nida; Hou, I-Hong (October 2024, ACM)

Scheduling in multi-channel wireless communication systems presents formidable challenges in effectively allocating resources. To address these challenges, we investigate a multi-resource restless matching bandit (MR-RMB) model for heterogeneous resource systems, with the objective of maximizing long-term discounted total rewards while respecting resource constraints. We also generalize the model to applications beyond multi-channel wireless systems. We discuss the Max-Weight Index Matching algorithm, which optimizes resource allocation based on learned partial indexes, and we derive the policy gradient theorem for index learning. Our main contribution is the introduction of a new Deep Index Policy (DIP), an online learning algorithm tailored for MR-RMB. DIP learns the partial index by leveraging the policy gradient theorem for restless arms with convoluted and unknown transition kernels of heterogeneous resources. We demonstrate the utility of DIP by evaluating its performance on three different MR-RMB problems. Our simulation results show that DIP learns the partial indexes efficiently.
Full Text Available
An mm-Wave CMOS/Si-Photonics Reconfigurable Hybrid-Integrated Heterodyning Software-Defined Radio Receiver

https://doi.org/10.1109/TMTT.2024.3371914

Rady, Ramy; Luo, Yu-Lun; Madsen, Christi; Palermo, Samuel; Entesari, Kamran (May 2024, IEEE Transactions on Microwave Theory and Techniques)

Full Text Available
A 16-32GHz RF Silicon Photonic Receiver with 22nm FD-SOI CMOS Driver

https://doi.org/10.1109/IPC57732.2023.10360718

Luo, Yu-Lun; Paladugu, Dharma; Rady, Ramy; Entesari, Kamran; Palermo, Samuel (November 2023, IEEE)

Full Text Available
A mm-wave CMOS/Si-Photonics Hybrid-Integrated Software-Defined Radio Receiver Achieving >80-dB Blocker Rejection of <−10 dBm In-Band Blockers

https://doi.org/10.1109/RFIC54547.2023.10186186

Rady, Ramy; Luo, Yu-Lun; Madsen, Christi; Palermo, Samuel; Entesari, Kamran (June 2023, IEEE)

Full Text Available
