skip to main content

Search for: All records

Creators/Authors contains: "Shakkottai, S"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Current wireless networks employ sophisticated multi-user transmission techniques to fully utilize the physical layer resources for data transmission. At the MAC layer, these techniques rely on a semi-static map that translates the channel quality of users to the potential transmission rate (more precisely, a map from the Channel Quality Index to the Modulation and Coding Scheme) for user selection and scheduling decisions. However, such a static map does not adapt to the actual deployment scenario and can lead to large performance losses. Furthermore, adaptively learning this map can be inefficient, particularly when there are a large number of users. In this work, we make this learning efficient by clustering users. Specifically, we develop an online learning approach that jointly clusters users and channel-states, and learns the associated rate regions of each cluster. This approach generates a scenario-specific map that replaces the static map that is currently used in practice. Furthermore, we show that our learning algorithm achieves sub- linear regret when compared to an omniscient genie. Next, we develop a user selection algorithm for multi-user scheduling using the learned user-clusters and associated rate regions. Our algorithms are validated on the WiNGS simulator from AT&T Labs, that implements the PHY/MAC stack and simulates the channel. We show that our algorithm can efficiently learn user clusters and the rate regions associated with the user sets for any observed channel state. Moreover, our simulations show that a deployment-scenario-specific map significantly outperforms the current static map approach for resource allocation at the MAC layer. 
    more » « less
  2. A major challenge in real-world reinforcement learning (RL) is the sparsity of reward feedback. Often, what is available is an intuitive but sparse reward function that only indicates whether the task is completed partially or fully. However, the lack of carefully designed, fine grain feedback implies that most existing RL algorithms fail to learn an acceptable policy in a reasonable time frame. This is because of the large number of exploration actions that the policy has to perform before it gets any useful feedback that it can learn from. In this work, we address this challenging problem by developing an algorithm that exploits the offline demonstration data generated by a sub-optimal behavior policy for faster and efficient online RL in such sparse reward settings. The proposed algorithm, which we call the Learning Online with Guidance Offline (LOGO) algorithm, merges a policy improvement step with an additional policy guidance step by using the offline demonstration data. The key idea is that by obtaining guidance from - not imitating - the offline data, LOGO orients its policy in the manner of the sub-optimal policy, while yet being able to learn beyond and approach optimality. We provide a theoretical analysis of our algorithm, and provide a lower bound on the performance improvement in each learning episode. We also extend our algorithm to the even more challenging incomplete observation setting, where the demonstration data contains only a censored version of the true state observation. We demonstrate the superior performance of our algorithm over state-of-the-art approaches on a number of benchmark environments with sparse rewards and censored state. Further, we demonstrate the value of our approach via implementing LOGO on a mobile robot for trajectory tracking and obstacle avoidance, where it shows excellent performance. 
    more » « less
  3. null (Ed.)
  4. null (Ed.)
  5. null (Ed.)