skip to main content

Search for: All records

Creators/Authors contains: "Sastry, S. Shankar"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. How can a social planner adaptively incentivize selfish agents who are learning in a strategic environment to induce a socially optimal outcome in the long run? We propose a two-timescale learning dynamics to answer this question in games. In our learning dynamics, players adopt a class of learning rules to update their strategies at a faster timescale, while a social planner updates the incentive mechanism at a slower timescale. In particular, the update of the incentive mechanism is based on each player’s externality, which is evaluated as the difference between the player’s marginal cost and the society’s marginal cost in each time step. We show that any fixed point of our learning dynamics corresponds to the optimal incentive mechanism such that the corresponding Nash equilibrium also achieves social optimality. We also provide sufficient conditions for the learning dynamics to converge to a fixed point so that the adaptive incentive mechanism eventually induces a socially optimal outcome. Finally, as an example, we demonstrate that the sufficient conditions for convergence are satisfied in Cournot competition with finite players.
    Free, publicly-accessible full text available December 6, 2023
  2. Decentralized planning for multi-agent systems, such as fleets of robots in a search-and-rescue operation, is often constrained by limitations on how agents can communicate with each other. One such limitation is the case when agents can communicate with each other only when they are in line-of-sight (LOS). Developing decentralized planning methods that guarantee safety is difficult in this case, as agents that are occluded from each other might not be able to communicate until it’s too late to avoid a safety violation. In this paper, we develop a decentralized planning method that explicitly avoids situations where lack of visibility of other agents would lead to an unsafe situation. Building on top of an existing Rapidly exploring Random Tree (RRT)-based approach, our method guarantees safety at each iteration. Simulation studies show the effectiveness of our method and compare the degradation in performance with respect to a clairvoyant decentralized planning algorithm where agents can communicate despite not being in LOS of each other.
  3. The main drawbacks of input-output linearizing controllers are the need for precise dynamics models and not being able to account for input constraints. Model uncertainty is common in almost every robotic application and input saturation is present in every real world system. In this paper, we address both challenges for the specific case of bipedal robot control by the use of reinforcement learning techniques. Taking the structure of a standard input-output linearizing controller, we use an additive learned term that compensates for model uncertainty. Moreover, by adding constraints to the learning problem we manage to boost the performance of the final controller when input limits are present. We demonstrate the effectiveness of the designed framework for different levels of uncertainty on the five-link planar walking robot RABBIT.