In this paper, a distributed swarm control problem is studied for large-scale multi-agent systems (LS-MASs). Unlike classical multi-agent systems, an LS-MAS brings new challenges to control design because of its large number of agents, which makes it more difficult to develop appropriate control for complicated missions such as collective swarming. To address these challenges, a novel mixed game theory is developed together with a hierarchical learning algorithm. In the mixed game, the LS-MAS is represented as a multi-group, large-scale leader–follower system. A cooperative game is then used to formulate the distributed swarm control for the multi-group leaders, and a Stackelberg game is used to couple the leaders with their large-scale followers. Using the interaction between leaders and followers, a mean field game extends the collective swarm behavior from leaders to followers smoothly without raising the computational complexity or communication traffic. Moreover, a hierarchical learning algorithm is designed to learn the optimal distributed swarm control for multi-group leader–follower systems. Specifically, a multi-agent actor–critic algorithm is first developed to obtain the distributed optimal swarm control for the multi-group leaders. An actor–critic–mass method is then designed to find the decentralized swarm control for the large-scale followers. Finally, a series of numerical simulations and a Lyapunov stability proof of the closed-loop system are presented to demonstrate the performance of the developed scheme.
                            Hierarchical game theoretical distributed adaptive control for large scale multi‐group multi‐agent system
                        
                    
    
            Abstract This paper introduces a distributed adaptive formation control for large‐scale multi‐agent systems (LS‐MAS) that addresses the heavy computational complexity and communication traffic that arise when conventional distributed control is extended directly from small scale to large scale. Specifically, a novel hierarchical game theoretic algorithm is developed to provide a feasible theoretical foundation for solving the LS‐MAS distributed optimal formation problem by integrating the mean‐field game (MFG), the Stackelberg game, and the cooperative game. In particular, the LS‐MAS is divided geographically into multiple groups, each having one group leader and a large number of followers. A cooperative game is then used among the multi‐group leaders to formulate distributed inter‐group formation control for the leaders. Meanwhile, an MFG is adopted for the large number of intra‐group followers to achieve collective intra‐group formation, while a Stackelberg game connects the followers with their corresponding leader within the same group to achieve the overall LS‐MAS multi‐group formation behavior. Moreover, a hybrid actor–critic‐based reinforcement learning algorithm is constructed to learn the solution of the hierarchical game‐based optimal distributed formation control. Finally, to show the effectiveness of the presented schemes, numerical simulations and Lyapunov analysis are performed.
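For readers unfamiliar with the mean‐field game mentioned in the abstract, the block below gives the generic textbook form of an MFG system: a backward Hamilton–Jacobi–Bellman equation coupled with a forward Fokker–Planck–Kolmogorov equation. It is only an illustrative sketch and not the formulation used in the paper. The symbols (value function $V$, follower density $m$, Hamiltonian $H$, running cost $f$, terminal cost $g$, diffusion coefficient $\nu$, initial density $m_0$, horizon $T$) are all assumptions introduced here.

```latex
% Generic mean-field game system (textbook form; all symbols are assumptions,
% not taken from the paper).
\begin{align}
  % Backward HJB equation: optimal cost-to-go of a representative follower
  -\partial_t V(t,x) - \nu\,\Delta V(t,x) + H\bigl(x, \nabla V(t,x)\bigr)
      &= f\bigl(x, m(t,\cdot)\bigr), \\
  % Forward FPK equation: evolution of the follower population density
  \partial_t m(t,x) - \nu\,\Delta m(t,x)
      - \operatorname{div}\!\bigl(m(t,x)\,\nabla_p H(x, \nabla V(t,x))\bigr)
      &= 0, \\
  % Boundary conditions: initial density and terminal cost
  m(0,x) = m_0(x), \qquad V(T,x) &= g\bigl(x, m(T,\cdot)\bigr).
\end{align}
```

In this generic form each follower interacts with the others only through the density $m$ and not through individual neighbors, which is the usual reason an MFG scales to a very large population.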
- Award ID(s):
- 2144646
- PAR ID:
- 10450733
- Publisher / Repository:
- DOI PREFIX: 10.1049
- Date Published:
- Journal Name:
- IET Control Theory & Applications
- Volume:
- 17
- Issue:
- 17
- ISSN:
- 1751-8644
- Format(s):
- Medium: X Size: p. 2332-2352
- Size(s):
- p. 2332-2352
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            The multiple user terminals in a satellite transponder’s communication channel compete for limited radio resources to meet their own data rate needs. Because inter-user interference limits the satellite transponder’s performance, the transponder’s power-control system needs to coordinate all its users to reduce interference and maximize the overall performance of the channel. This paper studies Stackelberg competition among the asymmetrical users in a transponder’s channel, where some users, called leaders, have priority in choosing their power control strategy, while the other users, called followers, have to optimize their power control strategy given the leaders’ controls. A Stackelberg Differential Game (SDG) is set up to model the Stackelberg competition in a transponder’s communication channel. Each user’s utility function is a trade-off between transmission data rate and power consumption. The system dynamics describe the change in channel gain. The optimality condition for the Stackelberg equilibrium of leaders and followers is a set of Differential Algebraic Equations (DAE) with embedded control strategies from the counterpart. To solve for the Stackelberg equilibrium, an algorithm based on iteratively optimizing the leaders’ and followers’ Hamiltonians is developed. The numerical solution of the SDG model provides the transponder’s power control system with each user’s power-control strategy at the Stackelberg equilibrium.
- 
            We study the cooperative asynchronous multi-agent multi-armed bandits problem, where each agent's active (arm pulling) decision rounds are asynchronous. That is, in each round, only a subset of agents is active to pull arms, and this subset is unknown and time-varying. We consider two models of multi-agent cooperation, fully distributed and leader-coordinated, and propose algorithms for both models that attain near-optimal regret and communication bounds, both of which are almost as good as their synchronous counterparts. The fully distributed algorithm relies on a novel communication policy consisting of accuracy adaptive and on-demand components, and successive arm elimination for decision-making. For leader-coordinated algorithms, a single leader explores arms and recommends them to other agents (followers) to exploit. As agents' active rounds are unknown, a competent leader must be chosen dynamically. We propose a variant of the Tsallis-INF algorithm with low switches to choose such a leader sequence. Lastly, we report numerical simulations comparing our new asynchronous algorithms with other known baselines.
- 
            We study the problem of online learning in a two-player decentralized cooperative Stackelberg game. In each round, the leader first takes an action, followed by the follower who takes their action after observing the leader’s move. The goal of the leader is to learn to minimize the cumulative regret based on the history of interactions. Differing from the traditional formulation of repeated Stackelberg games, we assume the follower is omniscient, with full knowledge of the true reward, and that they always best-respond to the leader’s actions. We analyze the sample complexity of regret minimization in this repeated Stackelberg game. We show that depending on the reward structure, the existence of the omniscient follower may change the sample complexity drastically, from constant to exponential, even for linear cooperative Stackelberg games.
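The leader-then-follower ordering described in the last abstract can be illustrated with a small one-shot example. The sketch below is not the method of that paper: it assumes a tiny, fully known shared reward table (`REWARD`, a hypothetical name with made-up values), whereas the paper studies a leader who must learn from interaction. It only shows what "the follower best-responds to the leader's action" means in a cooperative game.

```python
from typing import List, Tuple

# Hypothetical shared reward table: REWARD[a][b] is the common reward when the
# leader plays action a and the follower plays action b. Values are made up.
REWARD: List[List[float]] = [
    [1.0, 3.0, 0.5],
    [2.0, 0.0, 4.0],
]


def follower_best_response(reward: List[List[float]], leader_action: int) -> int:
    """Return the follower action that maximizes reward given the leader's move."""
    row = reward[leader_action]
    return max(range(len(row)), key=lambda b: row[b])


def stackelberg_solution(reward: List[List[float]]) -> Tuple[int, int, float]:
    """Enumerate leader actions, assuming the follower always best-responds.

    Returns (leader_action, follower_action, value) for the best leader choice.
    """
    best: Tuple[int, int, float] = (-1, -1, float("-inf"))
    for a in range(len(reward)):
        b = follower_best_response(reward, a)
        value = reward[a][b]
        if value > best[2]:
            best = (a, b, value)
    return best


if __name__ == "__main__":
    leader_action, follower_action, value = stackelberg_solution(REWARD)
    print(leader_action, follower_action, value)
```

With a known table the leader can simply enumerate its actions. The learning problem in the abstract arises because the leader does not know the reward and observes only the outcomes of its own choices.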
 An official website of the United States government
