Title: Efficient Resource Distribution by Adaptive Inter-agent Spacing in Multi-agent Systems
In multi-agent systems, limited resources must be shared by individuals during missions to maximize the group utility of the system in the field. In this paper, we present a generalized adaptive self-organization process for multi-agent systems featuring fast and efficient distribution of a consumable and refillable on-board resource throughout the group. An adaptive inter-agent spacing (AIS) controller based on individual resource levels is proposed that spaces out high resource bearing agents throughout the group including the group boundary extrema, and allows low resource bearing agents to adaptively occupy the in-between spaces receiving resource from the high resource bearing agents without over-crowding. Experimental results for cases with and without the proposed AIS controller validate faster convergence of individual resource levels to the group mean resource level using the proposed AIS controller. The generalized approach of the self-organizing process allows flexibility in adapting the proposed AIS controller for various multi-agent applications.
Award ID(s):
1846221
NSF-PAR ID:
10132129
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
2019 IEEE International Conference on Systems, Man and Cybernetics (SMC)
Page Range / eLocation ID:
4381 - 4386
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. With the development of sensing and communication technologies in networked cyber-physical systems (CPSs), multi-agent reinforcement learning (MARL)-based methodologies are integrated into the control process of physical systems and demonstrate prominent performance in a wide array of CPS domains, such as connected autonomous vehicles (CAVs). However, it remains challenging to mathematically characterize the improvement of the performance of CAVs with communication and cooperation capability. When each individual autonomous vehicle is originally self-interested, we cannot assume that all agents would cooperate naturally during the training process. In this work, we propose to reallocate the system's total reward efficiently to motivate stable cooperation among autonomous vehicles. We formally define and quantify how to reallocate the system's total reward to each agent under the proposed transferable utility game, such that communication-based cooperation among multiple agents increases the system's total reward. We prove that Shapley value-based reward reallocation of MARL lies in the core if the transferable utility game is a convex game. Hence, the cooperation is stable and efficient, and the agents should stay in the coalition or the cooperating group. We then propose a cooperative policy learning algorithm with Shapley value reward reallocation. In experiments, compared with several algorithms from the literature, we show the improvement of the mean episode system reward of CAV systems using our proposed algorithm.
  2. We consider the problem of enhanced security of multi-robot systems to prevent cyber-attackers from taking control of one or more robots in the group. We build upon a recently proposed solution that utilizes the physical measurement capabilities of the robots to perform introspection, i.e., detect the malicious actions of compromised agents using other members of the group. In particular, the proposed solution finds multi-agent paths on discrete spaces combined with a set of mutual observations at specific locations to detect robots with significant deviations from the preordained routes. In this paper, we develop a planner that works on continuous configuration spaces while also taking into account similar spatio-temporal constraints. In addition, the planner allows more general tasks to be specified, formulated as arbitrary smooth cost functions. The combination of constraints and objectives considered in this paper is not easily handled by popular path planning algorithms (e.g., sampling-based methods), so we propose a method based on the Alternating Direction Method of Multipliers (ADMM). ADMM is capable of finding locally optimal solutions to problems involving different kinds of objectives and non-convex temporal and spatial constraints, and allows for infeasible initialization. We benchmark our proposed method on multi-agent map exploration with a minimum-uncertainty cost function, obstacles, and observation schedule constraints.
  3. In cooperative multi-agent reinforcement learning (Co-MARL), a team of agents must jointly optimize the team's long-term rewards to learn a designated task. Optimizing rewards as a team often requires inter-agent communication and data sharing, leading to potential privacy implications. We assume privacy considerations prohibit the agents from sharing their environment interaction data. Accordingly, we propose Privacy-Engineered Value Decomposition Networks (PE-VDN), a Co-MARL algorithm that models multi-agent coordination while provably safeguarding the confidentiality of the agents' environment interaction data. We integrate three privacy-engineering techniques to redesign the data flows of the VDN algorithm (an existing Co-MARL algorithm that consolidates the agents' environment interaction data to train a central controller that models multi-agent coordination) and develop PE-VDN. In the first technique, we design a distributed computation scheme that eliminates Vanilla VDN's dependency on sharing environment interaction data. Then, we utilize a privacy-preserving multi-party computation protocol to guarantee that the data flows of the distributed computation scheme do not pose new privacy risks. Finally, we enforce differential privacy to preempt inference threats against the agents' training data (past environment interactions) when they take actions based on their neural network predictions. We implement PE-VDN in StarCraft Multi-Agent Competition (SMAC) and show that it achieves 80% of Vanilla VDN's win rate while maintaining differential privacy levels that provide meaningful privacy guarantees. The results demonstrate that PE-VDN can safeguard the confidentiality of agents' environment interaction data without sacrificing multi-agent coordination.
  4. The Mumbai Suburban Railways, or locals, are key transit infrastructure for the city and are crucial for resuming normal economic activity. Due to high density during transit, the potential risk of disease transmission is high, and the government has taken a wait-and-see approach to resuming normal operations. To reduce disease transmission, policymakers can enforce reduced crowding and mandate wearing of masks. Cohorting, forming groups of travelers that always travel together, is an additional policy to reduce disease transmission on locals without severe restrictions. Cohorting allows us to: (i) form traveler bubbles, thereby decreasing the number of distinct interactions over time; (ii) potentially quarantine an entire cohort if a single case is detected, making contact tracing more efficient; and (iii) target cohorts for testing and early detection of symptomatic as well as asymptomatic cases. Studying the impact of cohorts using compartmental models is challenging because of the ensuing representational complexity. Agent-based models provide a natural way to represent cohorts along with the representation of the cohort members with the larger social network. This paper describes a novel multi-scale agent-based model to study the impact of cohorting strategies on COVID-19 dynamics in Mumbai. We achieve this by modeling the Mumbai urban region using a detailed agent-based model comprising 12.4 million agents. Individual cohorts and their inter-cohort interactions as they travel on locals are modeled using local mean field approximations. The resulting multi-scale model, in conjunction with a detailed disease transmission and intervention simulator, is used to assess various cohorting strategies. The results provide a quantitative trade-off between cohort size and its impact on disease dynamics and well-being. The results show that cohorts can provide significant benefit in terms of reduced transmission without significantly impacting ridership and/or economic and social activity.
  5. In this paper, we present a compositional condition for ensuring safety of a collection of interacting systems modeled by inter-triggering hybrid automata (ITHA). ITHA is a modeling formalism for representing multi-agent systems in which each agent is governed by individual dynamics but can also interact with other agents through triggering actions. These triggering actions result in a jump/reset in the state of other agents according to a global resolution function. A sufficient condition for safety of the collection, inspired by responsibility-sensitive safety, is developed in two parts: self-safety relating to the individual dynamics, and responsibility relating to the triggering actions. The condition relies on having an over-approximation method for the resolution function. We further show how such over-approximations can be obtained and improved via communication. We use two examples, a job scheduling task on parallel processors and a highway driving example, throughout the paper to illustrate the concepts. Finally, we provide a comprehensive evaluation of how the proposed condition can be leveraged for several multi-agent control and supervision examples.