Title: A Cost-Aware Multi-Agent System for Black-Box Design Space Exploration
Effective coordination of design teams must account for the influence of costs incurred while searching for the best design solutions. This article introduces a cost-aware multi-agent system (MAS), a theoretical model to (1) explain how individuals in a team should search, assuming that they are all rational utility-maximizing decision-makers, and (2) study the impact of cost on the search performance of both individual agents and the system. First, we develop a new multi-agent Bayesian optimization framework that accounts for information exchange among agents to support their decisions on where to sample during the search. Second, we employ a reinforcement learning approach based on the multi-agent deep deterministic policy gradient to train the MAS to identify where agents cannot sample due to design constraints. Third, we propose a new cost-aware stopping criterion that lets each agent determine when costs outweigh the potential gains of further search. Our results indicate that cost has a more significant impact on MAS communication in complex design problems than in simple ones. For example, when searching in complex design spaces, some agents could initially have low performance gains and thus stop prematurely due to negative payoffs, even if those agents could perform better in a later stage of the search. Therefore, global-local communication becomes more critical in such situations for the entire system to converge. The proposed model can serve as a benchmark for empirical studies to quantitatively gauge how humans would rationally make design decisions in a team.
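The abstract does not state the stopping rule itself, so the following is only a minimal sketch of one way a cost-aware stopping check could look. It assumes that each agent holds a Gaussian posterior (mean and standard deviation) at a set of candidate designs, that the potential gain is measured by expected improvement for a maximization problem, and that sampling has a fixed cost in the same units as the objective. The names `expected_improvement`, `should_stop`, and `cost_per_sample` are hypothetical and do not come from the article.

```python
import math


def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` for a Gaussian posterior N(mu, sigma^2).

    Assumes a maximization problem; this acquisition function is an
    illustrative choice, not necessarily the one used in the article.
    """
    if sigma <= 0.0:
        # No uncertainty left: the gain is the deterministic improvement.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


def should_stop(candidates, best, cost_per_sample):
    """Return True when no candidate's expected gain exceeds the sampling cost.

    candidates: list of (mu, sigma) posterior summaries at candidate designs.
    best: best objective value observed so far.
    cost_per_sample: hypothetical fixed cost of evaluating one more design.
    """
    best_gain = max(expected_improvement(m, s, best) for m, s in candidates)
    return best_gain <= cost_per_sample


if __name__ == "__main__":
    posterior = [(0.0, 1.0), (-1.0, 0.5)]
    print(should_stop(posterior, best=0.0, cost_per_sample=0.5))
    print(should_stop(posterior, best=0.0, cost_per_sample=0.1))
```

Under these assumptions, an agent whose early expected gains are small relative to the cost stops immediately, which mirrors the premature stopping the abstract describes for complex design spaces.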
Award ID(s):
2321463; 2419423
PAR ID:
10538707
Publisher / Repository:
American Society of Mechanical Engineers
Journal Name:
Journal of Mechanical Design
Volume:
147
Issue:
1
ISSN:
1050-0472
Subject(s) / Keyword(s):
design space exploration; multi-agent Bayesian optimization (MABO); multi-agent reinforcement learning (MARL); black-box optimization; collaborative design
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In multi-agent Bayesian optimization for Design Space Exploration (DSE), identifying a communication network among agents to share useful design information for enhanced cooperation and performance, considering the trade-off between connectivity and cost, poses significant challenges. To address this challenge, we develop a distributed multi-agent Bayesian optimization (DMABO) framework and study how communication network structures/connectivity and the resulting cost would impact the performance of a team of agents when finding the global optimum. Specifically, we utilize Lloyd’s algorithm to partition the design space to assign distinct regions to individual agents for exploration in the distributed multi-agent system (MAS). Based on this partitioning, we generate communication networks among agents using two models: 1) a range-limited model of communication constrained by neighborhood information; and 2) a range-free model without neighborhood constraints. We introduce network density as a metric to quantify communication costs. Then, we generate communication networks by gradually increasing the network density to assess the impact of communication costs on the performance of MAS in DSE. The experimental results show that the communication network based on the range-limited model can significantly improve performance without incurring high communication costs. This indicates that increasing the density of a communication network does not necessarily improve MAS performance in DSE. Furthermore, the results indicate that communication is only beneficial for team performance if it occurs between specific agents whose search regions are critically relevant to the location of the global optimum. The proposed DMABO framework and the insights obtained can help identify the best trade-off between communication structure and cost for MAS in unknown design space exploration. 
  2. We consider the problem of detecting norm violations in open multi-agent systems (MAS). We show how, using ideas from scrip systems, we can design mechanisms where the agents comprising the MAS are incentivised to monitor the actions of other agents for norm violations. The cost of providing the incentives is not borne by the MAS and does not come from fines charged for norm violations (fines may be impossible to levy in a system where agents are free to leave and rejoin again under a different identity). Instead, monitoring incentives come from (scrip) fees for accessing the services provided by the MAS. In some cases, perfect monitoring (and hence enforcement) can be achieved: no norms will be violated in equilibrium. In other cases, we show that, while it is impossible to achieve perfect enforcement, we can get arbitrarily close; we can make the probability of a norm violation in equilibrium arbitrarily small. We show using simulations that our theoretical results, which apply to systems with a large number of agents, hold for multi-agent systems with as few as 1000 agents: the system rapidly converges to the steady-state distribution of scrip tokens necessary to ensure monitoring and then remains close to the steady state.
  3. We consider the problem of detecting norm violations in open multi-agent systems (MAS). We show how, using ideas from scrip systems, we can design mechanisms where the agents comprising the MAS are incentivised to monitor the actions of other agents for norm violations. The cost of providing the incentives is not borne by the MAS and does not come from fines charged for norm violations (fines may be impossible to levy in a system where agents are free to leave and rejoin again under a different identity). Instead, monitoring incentives come from (scrip) fees for accessing the services provided by the MAS. In some cases, perfect monitoring (and hence enforcement) can be achieved: no norms will be violated in equilibrium. In other cases, we show that, while it is impossible to achieve perfect enforcement, we can get arbitrarily close; we can make the probability of a norm violation in equilibrium arbitrarily small. We show using simulations that our theoretical results, which apply to systems with a large number of agents, hold for multi-agent systems with as few as 1000 agents: the system rapidly converges to the steady-state distribution of scrip tokens necessary to ensure monitoring and then remains close to the steady state.
  4. Several recent works have found the emergence of grounded compositional language in the communication protocols developed by mostly cooperative multi-agent systems when learned end-to-end to maximize performance on a downstream task. However, human populations learn to solve complex tasks involving communicative behaviors not only in fully cooperative settings but also in scenarios where competition acts as an additional external pressure for improvement. In this work, we investigate whether competition for performance from an external, similar agent team could act as a social influence that encourages multi-agent populations to develop better communication protocols for improved performance, compositionality, and convergence speed. We start from Task & Talk, a previously proposed referential game between two cooperative agents, as our testbed and extend it into Task, Talk & Compete, a game involving two competitive teams each consisting of two aforementioned cooperative agents. Using this new setting, we provide an empirical study demonstrating the impact of competitive influence on multi-agent teams. Our results show that an external competitive influence leads to improved accuracy and generalization, as well as faster emergence of communicative languages that are more informative and compositional.
  5. We introduce a sequential Bayesian binary hypothesis testing problem under social learning, termed selfish learning, where agents work to maximize their individual rewards. In particular, each agent receives a private signal and is aware of decisions made by earlier-acting agents. Besides inferring the underlying hypothesis, agents also decide whether to stop and declare, or pass the inference to the next agent. The employer rewards only correct responses, and the reward per worker decreases with the number of employees used for decision making. We characterize the decision regions of agents in the infinite and finite horizon. In particular, we show that the decision boundaries in the infinite horizon are the solutions to a Markov Decision Process with discounted costs and can be computed using value iteration. In the finite horizon, we show that team performance is enhanced upon appropriate incentivization when compared to sequential social learning.
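The first related abstract above uses network density to quantify communication cost and contrasts a range-limited communication model with a range-free one. The sketch below illustrates those two ideas under stated assumptions: density is taken as the standard undirected-graph ratio of existing links to possible links, and the range-limited model is approximated by linking agents whose region centroids lie within a fixed radius. The functions `range_limited_edges` and `network_density`, and the `radius` parameter, are hypothetical and are not taken from that work.

```python
import math
from itertools import combinations


def range_limited_edges(centroids, radius):
    """Link every pair of agents whose region centroids are within `radius`.

    centroids: list of coordinate tuples, one per agent (assumed to be the
    centers of the agents' assigned search regions).
    """
    edges = []
    for i, j in combinations(range(len(centroids)), 2):
        if math.dist(centroids[i], centroids[j]) <= radius:
            edges.append((i, j))
    return edges


def network_density(n_agents, edges):
    """Fraction of possible undirected links that are present (0.0 to 1.0)."""
    if n_agents < 2:
        return 0.0
    return 2.0 * len(edges) / (n_agents * (n_agents - 1))


if __name__ == "__main__":
    centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
    links = range_limited_edges(centers, radius=1.5)
    print(links, network_density(len(centers), links))
```

Raising `radius` adds links and therefore density, which gives a simple way to sweep communication cost in an experiment of the kind that abstract describes.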