
Title: Hierarchical game theoretical distributed adaptive control for large scale multi‐group multi‐agent system

This paper introduces a distributed adaptive formation control scheme for large‐scale multi‐agent systems (LS‐MAS) that addresses the heavy computational complexity and communication traffic that arise when conventional distributed control is extended directly from small scale to large scale. Specifically, a novel hierarchical game‐theoretic algorithm is developed to provide a feasible theoretical foundation for solving the LS‐MAS distributed optimal formation problem by integrating the mean‐field game (MFG), the Stackelberg game, and the cooperative game. In particular, the LS‐MAS is divided geographically into multiple groups, each with one group leader and a large number of followers. A cooperative game is then used among the group leaders to formulate distributed inter‐group formation control. Meanwhile, an MFG is adopted for the large number of intra‐group followers to achieve collective intra‐group formation, while a Stackelberg game connects the followers with their corresponding leader within the same group to achieve the overall LS‐MAS multi‐group formation behavior. Moreover, a hybrid actor–critic‐based reinforcement learning algorithm is constructed to learn the solution of the hierarchical game‐based optimal distributed formation control. Finally, numerical simulations and a Lyapunov analysis are performed to show the effectiveness of the presented schemes.
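The three-tier information structure described in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the paper's algorithm: the learned actor–critic policies and the mean‐field game solution are replaced by simple proportional update rules, the leaders are assumed to communicate all-to-all, and every name, gain, and formation offset below is hypothetical.

```python
import cmath
import random


def simulate(num_groups=3, followers_per_group=50, steps=400,
             gain=0.1, mean_field_gain=0.05, seed=0):
    """Toy three-tier formation update (illustrative assumptions only).

    Positions are complex numbers x + iy so that 2-D arithmetic stays short.
    """
    rng = random.Random(seed)

    def random_position():
        return complex(rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))

    # Hypothetical target formation: leaders 10 units apart on a line,
    # followers on a unit circle around their own leader.
    leader_offsets = [complex(10.0 * g, 0.0) for g in range(num_groups)]
    follower_offsets = [
        cmath.rect(1.0, 2.0 * cmath.pi * k / followers_per_group)
        for k in range(followers_per_group)
    ]

    leaders = [random_position() for _ in range(num_groups)]
    followers = [[random_position() for _ in range(followers_per_group)]
                 for _ in range(num_groups)]

    for _ in range(steps):
        # Inter-group tier (stand-in for the cooperative game): leaders run
        # a consensus step on their formation errors.
        leader_errors = [p - d for p, d in zip(leaders, leader_offsets)]
        mean_leader_error = sum(leader_errors) / num_groups
        leaders = [p + gain * (mean_leader_error - e)
                   for p, e in zip(leaders, leader_errors)]

        # Intra-group tier: followers move after their leader (the
        # leader-follower ordering of a Stackelberg game) and see only the
        # leader's position plus the group's mean error (a stand-in for the
        # mean-field coupling), never the other followers individually.
        for g in range(num_groups):
            errors = [z - leaders[g] - d
                      for z, d in zip(followers[g], follower_offsets)]
            mean_error = sum(errors) / followers_per_group
            followers[g] = [
                z - gain * e - mean_field_gain * (e - mean_error)
                for z, e in zip(followers[g], errors)
            ]

    return leaders, followers, leader_offsets, follower_offsets


def formation_error(leaders, followers, leader_offsets, follower_offsets):
    """Largest deviation from the target formation, leaders and followers."""
    leader_err = max(abs((p - leaders[0]) - (d - leader_offsets[0]))
                     for p, d in zip(leaders, leader_offsets))
    follower_err = max(abs(z - leaders[g] - d)
                       for g, group in enumerate(followers)
                       for z, d in zip(group, follower_offsets))
    return max(leader_err, follower_err)
```

Calling `formation_error(*simulate())` returns the largest remaining deviation from the target formation. The point of the sketch is the communication pattern: each follower needs two quantities per step regardless of group size, and only the leaders exchange information across groups.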

DOI PREFIX: 10.1049
Journal Name: IET Control Theory & Applications
Medium: X Size: p. 2332-2352
Sponsoring Org: National Science Foundation
More Like this
  1. In this paper, a distributed swarm control problem is studied for large-scale multi-agent systems (LS-MASs). Unlike classical multi-agent systems, an LS-MAS brings new challenges to control design due to its large number of agents, which can make it more difficult to develop appropriate control for complicated missions such as collective swarming. To address these challenges, a novel mixed game theory is developed with a hierarchical learning algorithm. In the mixed game, the LS-MAS is represented as a multi-group, large-scale leader–follower system. Then, a cooperative game is used to formulate the distributed swarm control for multi-group leaders, and a Stackelberg game is utilized to couple the leaders and their large-scale followers effectively. Using the interaction between leaders and followers, the mean field game is used to continue the collective swarm behavior from leaders to followers smoothly without raising the computational complexity or communication traffic. Moreover, a hierarchical learning algorithm is designed to learn the intelligent optimal distributed swarm control for multi-group leader–follower systems. Specifically, a multi-agent actor–critic algorithm is first developed to obtain the distributed optimal swarm control for multi-group leaders. Furthermore, an actor–critic–mass method is designed to find the decentralized swarm control for large-scale followers. Finally, a series of numerical simulations and a Lyapunov stability proof of the closed-loop system are presented to demonstrate the performance of the developed scheme.
  2. Emerging on-demand service platforms (OSPs) have recently embraced teamwork as a strategy for stimulating workers’ productivity and mediating temporal supply and demand imbalances. This research investigates the team contest scheme design problem considering work schedules. Introducing teams on OSPs creates a hierarchical single-leader multi-follower game. The leader (platform) establishes rewards and intrateam revenue-sharing rules for distributing workers’ payoffs. Each follower (team) competes with others by coordinating the schedules of its team members to maximize the total expected utility. The concurrence of interteam competition and intrateam coordination causes dual effects, which are captured by an equilibrium analysis of the followers’ game. To align the platform’s interest with workers’ heterogeneous working-time preferences, we propose a profit-maximizing contest scheme consisting of a winner’s reward and time-varying payments. A novel algorithm that combines Bayesian optimization, duality, and a penalty method solves for the optimal scheme in the nonconvex equilibrium-constrained problem. Our results indicate that teamwork is a useful strategy with limitations. Under the proposed scheme, the team contest always benefits workers. Intrateam coordination helps teams strategically mitigate the negative externalities caused by overcompetition among workers. For the platform, the optimal scheme can direct teams’ schedules toward more profitable market equilibria when workers have inaccurate perceptions of the market. History: This paper has been accepted for the Service Science Special Issue on Innovation in Transportation-Enabled Urban Services. Funding: This work was supported by the National Science Foundation [Grant FW-HTF-P 2222806]. Supplemental Material: The online appendices are available at .
  3. Some human groups are organized hierarchically and some are distributed. Both types of groups occur in economic, political, and military domains, but it is unclear why hierarchical organizations are favored in certain contexts and distributed organizations are favored in others. I propose that these different organizational structures can be explained by human groups having different constraints on their ability to foster cooperation within the group. Human within-group cooperation is often maintained by monitoring and punishment. In hierarchical groups, monitoring and punishment are organized into tree-like command-and-control structures with supervisors responsible for monitoring the cooperation of their subordinates and punishing non-cooperators. By contrast, in distributed groups, monitoring is diffuse and punishment is collective. I propose that the organization of cooperative human groups is constrained by the costs of monitoring and punishment. I formalize this hypothesis with a model where individuals in a group cooperate to produce public goods while embedded in a network of monitoring and punishment responsibilities. I show that, when punishment costs are high and monitoring costs are low, socially-optimal monitoring and punishment networks are distributed. The size of these distributed networks is constrained by monitoring costs. However, when punishment costs are low, socially-optimal networks are hierarchical. Monitoring costs do not constrain the size of hierarchical networks but determine how many levels of supervision are required to foster cooperation in the hierarchical group. These results may explain the increasingly large and hierarchical groups throughout much of human history. They also suggest that the recent emergence of large-scale distributed organizations has been possible because new technologies, like the internet, have made monitoring costs extremely low.
  4. Large‐area, long‐duration power outages are increasingly common in the United States, and cost the economy billions of dollars each year. Building a strategy to enhance grid resilience requires an understanding of the optimal mix of preventive and corrective actions, the inefficiencies that arise when self‐interested parties make resilience investment decisions, and the conditions under which regulators may facilitate the realization of efficient market outcomes. We develop a bi‐level model to examine the mix of preventive and corrective measures that enhances grid resilience to a severe storm. The model represents a Stackelberg game between a regulated utility (leader) that may harden distribution feeders before a long‐duration outage and/or deploy restoration crews after the disruption, and utility customers with varying preferences for reliable power (followers) who may invest in backup generators. We show that the regulator's denial of cost recovery for the utility's preventive expenditures, coupled with the misalignment between private objectives and social welfare maximization, yields significant inefficiencies in the resilience investment mix. Allowing cost recovery for a higher share of the utility's capital expenditures in preventive measures, extending the time horizon associated with damage cost recovery, and adopting a storm restoration compensation mechanism shift the realized market outcome toward the efficient solution. If about one‐fifth of preventive resilience investments is approved by regulators, requiring utilities to pay a compensation of $365 per customer for a 3‐day outage (about seven times the level of compensation currently offered by US utilities) provides significant incentives toward more efficient preventive resilience investments. 
  5. Electricity markets are cleared by a two-stage, sequential process consisting of a forward (day-ahead) market and a spot (real-time) market. While their design goal is to achieve efficiency, the lack of sufficient competition introduces many opportunities for price manipulation. To discourage this phenomenon, some Independent System Operators (ISOs) mandate generators to submit (approximately) truthful bids in the day-ahead market. However, without fully accounting for all participants' incentives (generators and loads), the application of such a mandate may lead to unintended consequences. In this paper, we model and study the interactions of generators and inelastic loads in a two-stage settlement where generators are required to bid truthfully in the day-ahead market. We show that such a mandate, when accounting for generator and load incentives, leads to a generalized Stackelberg-Nash game where load decisions (leaders) are made in the day-ahead market and generator decisions (followers) are relegated to the real-time market. Furthermore, the use of conventional supply function bidding for generators in real time does not guarantee the existence of a Nash equilibrium. This motivates the use of intercept bidding as an alternative bidding mechanism for generators in the real-time market. An equilibrium analysis in this setting leads to a closed-form solution that unveils several insights. In particular, it shows that, unlike in standard two-stage markets, loads are the winners of the competition in the sense that their aggregate payments are less than those of the competitive equilibrium. Moreover, heterogeneity in generators' costs has the unintended effect of mitigating loads' market power. Numerical studies validate and further illustrate these insights.