

Title: Efficient Resource Distribution by Adaptive Inter-agent Spacing in Multi-agent Systems
In multi-agent systems, limited resources must be shared among individuals during missions to maximize the group utility of the system in the field. In this paper, we present a generalized adaptive self-organization process for multi-agent systems that enables fast and efficient distribution of a consumable, refillable on-board resource throughout the group. We propose an adaptive inter-agent spacing (AIS) controller, driven by individual resource levels, that spreads high-resource agents throughout the group, including at the group boundary extrema, and lets low-resource agents adaptively occupy the spaces in between, receiving resource from the high-resource agents without over-crowding. Experimental results for cases with and without the proposed AIS controller show that individual resource levels converge to the group mean faster under the proposed AIS controller. The generality of the self-organizing process allows the proposed AIS controller to be adapted to a variety of multi-agent applications.
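To make the idea concrete, below is a minimal 1-D sketch of a resource-driven spacing-and-sharing loop in the spirit of the abstract. The agent count, gains, spacing rule, and transfer rule are illustrative assumptions made for this page, not the controller from the paper.

```python
# Minimal 1-D sketch of an adaptive inter-agent spacing (AIS) style loop.
# All names, gains, and update rules are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

N = 8
pos = np.sort(rng.uniform(0.0, 10.0, N))    # 1-D agent positions
res = rng.uniform(0.2, 1.0, N)              # on-board resource levels

d_base = 1.0        # nominal inter-agent gap (assumed)
k_space = 0.5       # how strongly resource stretches the desired gap (assumed)
k_move = 0.1        # motion gain (assumed)
k_xfer = 0.05       # resource transfer rate between nearby agents (assumed)
xfer_radius = 2.0   # distance within which resource can flow (assumed)

for _ in range(300):
    # 1) Spacing: the desired gap to a neighbor grows with the pair's resource,
    #    so high-resource agents spread out (toward the boundary extrema) and
    #    low-resource agents settle into the gaps without over-crowding.
    #    Neighbors are taken in the initial left-to-right order for simplicity.
    force = np.zeros(N)
    for i in range(N):
        for j in (i - 1, i + 1):
            if 0 <= j < N:
                d_des = d_base * (1.0 + k_space * 0.5 * (res[i] + res[j]))
                gap = pos[j] - pos[i]
                force[i] += np.sign(gap) * (abs(gap) - d_des)
    pos += k_move * force

    # 2) Sharing: resource flows from richer to poorer agents that are close
    #    enough, conserving the group total, so levels drift toward the mean.
    for i in range(N):
        for j in range(i + 1, N):
            if abs(pos[i] - pos[j]) < xfer_radius:
                flow = k_xfer * (res[i] - res[j])
                res[i] -= flow
                res[j] += flow

print("spread of resource levels about the group mean:", float(np.ptp(res)))
```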
Award ID(s): 1846221
PAR ID: 10132129
Journal Name: 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC)
Page Range / eLocation ID: 4381 - 4386
Sponsoring Org: National Science Foundation
More Like this
  1. We consider the problem of enhancing the security of multi-robot systems to prevent cyber-attackers from taking control of one or more robots in the group. We build upon a recently proposed solution that utilizes the physical measurement capabilities of the robots to perform introspection, i.e., to detect the malicious actions of compromised agents using other members of the group. In particular, that solution finds multi-agent paths on discrete spaces combined with a set of mutual observations at specific locations to detect robots that deviate significantly from their preordained routes. In this paper, we develop a planner that works on continuous configuration spaces while also taking into account similar spatio-temporal constraints. In addition, the planner allows more general tasks, formulated as arbitrary smooth cost functions, to be specified. The combination of constraints and objectives considered in this paper is not easily handled by popular path-planning algorithms (e.g., sampling-based methods), so we propose a method based on the Alternating Direction Method of Multipliers (ADMM). ADMM can find locally optimal solutions to problems involving diverse objectives and non-convex temporal and spatial constraints, and it allows infeasible initialization. We benchmark the proposed method on multi-agent map exploration with a minimum-uncertainty cost function, obstacles, and observation-schedule constraints. (A toy consensus-ADMM sketch appears after this list.)
  2. With the development of sensing and communication technologies in networked cyber-physical systems (CPSs), multi-agent reinforcement learning (MARL)-based methodologies are being integrated into the control process of physical systems and demonstrate prominent performance in a wide array of CPS domains, such as connected autonomous vehicles (CAVs). However, it remains challenging to mathematically characterize how communication and cooperation capability improve the performance of CAVs. Because each individual autonomous vehicle is originally self-interested, we cannot assume that all agents will cooperate naturally during the training process. In this work, we propose to reallocate the system's total reward efficiently to motivate stable cooperation among autonomous vehicles. We formally define and quantify how to reallocate the system's total reward to each agent under the proposed transferable-utility game, such that communication-based cooperation among the agents increases the system's total reward. We prove that the Shapley value-based reward reallocation of MARL lies in the core if the transferable-utility game is convex; hence the cooperation is stable and efficient, and the agents should stay in the coalition, or cooperating group. We then propose a cooperative policy-learning algorithm with Shapley value reward reallocation. In experiments, compared with several algorithms from the literature, we show an improvement in the mean episode system reward of CAV systems using the proposed algorithm. (An illustrative Shapley value computation appears after this list.)
  3. In cooperative multi-agent reinforcement learning (Co-MARL), a team of agents must jointly optimize the team's long-term rewards to learn a designated task. Optimizing rewards as a team often requires inter-agent communication and data sharing, leading to potential privacy implications. We assume privacy considerations prohibit the agents from sharing their environment interaction data. Accordingly, we propose Privacy-Engineered Value Decomposition Networks (PE-VDN), a Co-MARL algorithm that models multi-agent coordination while provably safeguarding the confidentiality of the agents' environment interaction data. We integrate three privacy-engineering techniques to redesign the data flows of the VDN algorithm, an existing Co-MARL algorithm that consolidates the agents' environment interaction data to train a central controller that models multi-agent coordination, and develop PE-VDN. In the first technique, we design a distributed computation scheme that eliminates Vanilla VDN's dependency on sharing environment interaction data. Then, we utilize a privacy-preserving multi-party computation protocol to guarantee that the data flows of the distributed computation scheme do not pose new privacy risks. Finally, we enforce differential privacy to preempt inference threats against the agents' training data, i.e., past environment interactions, when they take actions based on their neural network predictions. We implement PE-VDN in the StarCraft Multi-Agent Challenge (SMAC) and show that it achieves 80% of Vanilla VDN's win rate while maintaining differential privacy levels that provide meaningful privacy guarantees. The results demonstrate that PE-VDN can safeguard the confidentiality of agents' environment interaction data without sacrificing multi-agent coordination. (A generic Laplace-mechanism sketch appears after this list.)
  4. This paper studies the distributed feedback optimization problem for linear multi-agent systems without precise knowledge of local costs and agent dynamics. The proposed solution is based on a hierarchical approach that uses upper-level coordinators to adjust reference signals toward the global optimum and lower-level controllers to regulate agents' outputs toward the reference signals. In the absence of precise information on local gradients and agent dynamics, an extremum-seeking mechanism is used to enforce a gradient-descent optimization strategy, and an adaptive dynamic programming approach is taken to synthesize an internal-model-based optimal tracking controller. The whole procedure relies only on measurements of local costs and on input-state data along agents' trajectories. Moreover, under appropriate conditions, the closed-loop signals are bounded and the outputs of the agents converge exponentially to a small neighborhood of the desired extremum. A numerical example validates the efficacy of the proposed method. (A toy extremum-seeking sketch appears after this list.)
  5. In this paper, we present a compositional condition for ensuring safety of a collection of interacting systems modeled by inter-triggering hybrid automata (ITHA). ITHA is a modeling formalism for representing multi-agent systems in which each agent is governed by individual dynamics but can also interact with other agents through triggering actions. These triggering actions result in a jump/reset in the state of other agents according to a global resolution function. A sufficient condition for safety of the collection, inspired by responsibility-sensitive safety, is developed in two parts: self-safety, relating to the individual dynamics, and responsibility, relating to the triggering actions. The condition relies on having an over-approximation method for the resolution function. We further show how such over-approximations can be obtained and improved via communication. We use two examples, a job-scheduling task on parallel processors and a highway driving scenario, throughout the paper to illustrate the concepts. Finally, we provide a comprehensive evaluation of how the proposed condition can be leveraged in several multi-agent control and supervision examples.
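Illustrative sketches for the related works above. These are hypothetical snippets written for this page only; none of the code, parameter values, or cost functions is taken from the cited papers.

For related work 1, a toy consensus-ADMM loop on a one-dimensional quadratic problem. It shows only the ADMM machinery (local step, consensus step, dual step), not that paper's trajectory planner; every number in it is an assumption.

```python
# Toy consensus ADMM: each "agent" holds a local quadratic cost (x - a_i)^2
# and all agents must agree on a single x. Illustration only.
import numpy as np

a = np.array([1.0, 4.0, 7.0, 10.0])   # per-agent cost parameters (assumed)
rho = 1.0                              # ADMM penalty parameter (assumed)

x = np.zeros_like(a)                   # local copies of the decision variable
z = 0.0                                # consensus variable
u = np.zeros_like(a)                   # scaled dual variables

for _ in range(100):
    # Local step: argmin_x (x - a_i)^2 + (rho/2)(x - z + u_i)^2, closed form.
    x = (2.0 * a + rho * (z - u)) / (2.0 + rho)
    # Consensus step: average of the locals plus duals.
    z = float(np.mean(x + u))
    # Dual update penalizing disagreement with the consensus value.
    u = u + (x - z)

print("consensus value:", round(z, 3), "| minimizer of summed costs:", float(np.mean(a)))
```

For related work 2, an exact Shapley value computation for a tiny transferable-utility game. The characteristic function is a made-up stand-in for "system reward of coalition S", not a CAV reward model.

```python
# Exact Shapley values by averaging marginal contributions over join orders.
import math
from itertools import permutations

agents = (0, 1, 2)

def coalition_reward(S):
    """Hypothetical super-additive reward: solo rewards plus a per-pair bonus."""
    solo = {0: 1.0, 1: 2.0, 2: 1.5}
    pairs = len(S) * (len(S) - 1) // 2
    return sum(solo[a] for a in S) + 0.5 * pairs

def shapley_values(agents, v):
    phi = {a: 0.0 for a in agents}
    for order in permutations(agents):
        members = set()
        for a in order:
            phi[a] += v(members | {a}) - v(members)   # marginal contribution
            members.add(a)
    n_orders = math.factorial(len(agents))
    return {a: phi[a] / n_orders for a in agents}

phi = shapley_values(agents, coalition_reward)
print(phi)                                               # per-agent share
print(sum(phi.values()), coalition_reward(set(agents)))  # shares sum to the total
```

For related work 3, a generic Laplace-mechanism release, standing in for the differential-privacy ingredient of PE-VDN; the sensitivity and epsilon values are assumptions.

```python
# Laplace mechanism: add noise calibrated to sensitivity/epsilon before release.
import numpy as np

rng = np.random.default_rng(1)

def laplace_release(true_value, sensitivity, epsilon):
    scale = sensitivity / epsilon
    return true_value + rng.laplace(loc=0.0, scale=scale)

# e.g., privatizing a per-agent scalar statistic before it leaves the agent
print(laplace_release(true_value=12.7, sensitivity=1.0, epsilon=0.5))
```

For related work 4, a scalar extremum-seeking loop on an assumed quadratic cost: the cost is probed with a sinusoidal dither, the response is demodulated to estimate the gradient, and the reference descends it. Gains, dither size, and the cost itself are assumptions.

```python
# Toy scalar extremum seeking with a washout filter and sinusoidal dither.
import math

def local_cost(r):                 # treated as a black box by the controller
    return (r - 3.0) ** 2 + 1.0

r_hat = 0.0                        # reference estimate
a, omega, k, dt, tau = 0.2, 5.0, 0.5, 0.01, 0.5
y_avg = local_cost(r_hat)          # slow average acting as a crude washout

for step in range(30000):
    t = step * dt
    dither = a * math.sin(omega * t)
    y = local_cost(r_hat + dither)             # only measurements are available
    y_avg += (dt / tau) * (y - y_avg)          # remove the slowly varying offset
    grad_est = (y - y_avg) * math.sin(omega * t) * (2.0 / a)
    r_hat -= k * grad_est * dt                 # gradient descent on the estimate

print("reference settled near", round(r_hat, 1), "(true minimizer is 3.0)")
```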