This content will become publicly available on August 21, 2025

Title: A Cost-Aware Multi-Agent System for Black-Box Design Space Exploration
Effective coordination of design teams must account for the influence of costs incurred while searching for the best design solutions. This article introduces a cost-aware multi-agent system (MAS), a theoretical model to (1) explain how individuals in a team should search, assuming that they are all rational utility-maximizing decision-makers and (2) study the impact of cost on the search performance of both individual agents and the system. First, we develop a new multi-agent Bayesian optimization framework accounting for information exchange among agents to support their decisions on where to sample in search. Second, we employ a reinforcement learning approach based on the multi-agent deep deterministic policy gradient for training MAS to identify where agents cannot sample due to design constraints. Third, we propose a new cost-aware stopping criterion that lets each agent determine when the costs of further search outweigh the potential gains. Our results indicate that cost has a more significant impact on MAS communication in complex design problems than in simple ones. For example, when searching in complex design spaces, some agents could initially have low performance gains and thus stop prematurely due to negative payoffs, even if those agents could perform better in the later stage of the search. Therefore, global-local communication becomes more critical in such situations for the entire system to converge. The proposed model can serve as a benchmark for empirical studies to quantitatively gauge how humans would rationally make design decisions in a team.
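The idea of a cost-aware stopping rule can be illustrated with a minimal sketch. The abstract does not give the criterion's formula, so the code below is an assumption, not the authors' method: it substitutes the standard expected-improvement acquisition from Bayesian optimization for the "potential gain" and stops an agent once no candidate's expected improvement exceeds the per-sample cost. The function names `expected_improvement` and `should_stop`, and the representation of candidates as `(mean, std)` pairs, are hypothetical.

```python
import math


def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` (maximization) for a Gaussian
    posterior N(mu, sigma^2). Standard closed form; used here only as a
    stand-in for an agent's potential gain (an assumption)."""
    if sigma <= 0.0:
        # No posterior uncertainty: the gain is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


def should_stop(candidates, best, cost):
    """Hypothetical stopping rule: stop when the largest expected gain
    among candidate design points does not exceed the sampling cost.

    candidates: iterable of (posterior mean, posterior std) pairs.
    """
    max_gain = max(expected_improvement(mu, s, best) for mu, s in candidates)
    return max_gain <= cost


if __name__ == "__main__":
    # Two candidate design points with posterior (mean, std).
    candidates = [(0.0, 1.0), (-0.5, 0.2)]
    print(should_stop(candidates, best=0.0, cost=0.5))  # expensive samples
    print(should_stop(candidates, best=0.0, cost=0.1))  # cheap samples
```

Under this sketch, an agent whose posterior shows little promise early in the search would stop as soon as sampling becomes costly, which mirrors the premature-stopping behavior the abstract describes for complex design spaces.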
Award ID(s):
2321463 2419423
PAR ID:
10538707
Author(s) / Creator(s):
; ;
Publisher / Repository:
American Society of Mechanical Engineers
Date Published:
Journal Name:
Journal of Mechanical Design
Volume:
147
Issue:
1
ISSN:
1050-0472
Subject(s) / Keyword(s):
design space exploration multi-agent Bayesian optimization (MABO) multi-agent reinforcement learning (MARL) black-box optimization collaborative design
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider the problem of detecting norm violations in open multi-agent systems (MAS). We show how, using ideas from scrip systems, we can design mechanisms where the agents comprising the MAS are incentivised to monitor the actions of other agents for norm violations. The cost of providing the incentives is not borne by the MAS and does not come from fines charged for norm violations (fines may be impossible to levy in a system where agents are free to leave and rejoin again under a different identity). Instead, monitoring incentives come from (scrip) fees for accessing the services provided by the MAS. In some cases, perfect monitoring (and hence enforcement) can be achieved: no norms will be violated in equilibrium. In other cases, we show that, while it is impossible to achieve perfect enforcement, we can get arbitrarily close; we can make the probability of a norm violation in equilibrium arbitrarily small. We show using simulations that our theoretical results, which apply to systems with a large number of agents, hold for multi-agent systems with as few as 1000 agents: the system rapidly converges to the steady-state distribution of scrip tokens necessary to ensure monitoring and then remains close to the steady state.
  3. Several recent works have found the emergence of grounded compositional language in the communication protocols developed by mostly cooperative multi-agent systems when learned end-to-end to maximize performance on a downstream task. However, human populations learn to solve complex tasks involving communicative behaviors not only in fully cooperative settings but also in scenarios where competition acts as an additional external pressure for improvement. In this work, we investigate whether competition for performance from an external, similar agent team could act as a social influence that encourages multi-agent populations to develop better communication protocols for improved performance, compositionality, and convergence speed. We start from Task & Talk, a previously proposed referential game between two cooperative agents as our testbed and extend it into Task, Talk & Compete, a game involving two competitive teams each consisting of two aforementioned cooperative agents. Using this new setting, we provide an empirical study demonstrating the impact of competitive influence on multi-agent teams. Our results show that an external competitive influence leads to improved accuracy and generalization, as well as faster emergence of communicative languages that are more informative and compositional.
  4. We introduce a sequential Bayesian binary hypothesis testing problem under social learning, termed selfish learning, where agents work to maximize their individual rewards. In particular, each agent receives a private signal and is aware of decisions made by earlier-acting agents. Besides inferring the underlying hypothesis, agents also decide whether to stop and declare, or pass the inference to the next agent. The employer rewards only correct responses and the reward per worker decreases with the number of employees used for decision making. We characterize decision regions of agents in the infinite and finite horizon. In particular, we show that the decision boundaries in the infinite horizon are the solutions to a Markov Decision Process with discounted costs, and can be solved using value iteration. In the finite horizon, we show that team performance is enhanced upon appropriate incentivization when compared to sequential social learning.
  5. Diversity in behaviors is instrumental for robust team performance in many multiagent tasks which require agents to coordinate. Unfortunately, exhaustive search through the agents’ behavior spaces is often intractable. This paper introduces Behavior Exploration for Heterogeneous Teams (BEHT), a multi-level learning framework that enables agents to progressively explore regions of the behavior space that promote team coordination on diverse goals. By combining diversity search to maximize agent-specific rewards and evolutionary optimization to maximize the team-based fitness, our method effectively filters regions of the behavior space that are conducive to agent coordination. We demonstrate the diverse behaviors and synergies that our method allows agents to learn on a multiagent exploration problem.