There are many settings in which a principal performs a task by delegating it to an agent, who searches over possible solutions and proposes one to the principal. This describes many aspects of the workflow within organizations, as well as many of the activities undertaken by regulatory bodies, who often obtain relevant information from the parties being regulated through a process of delegation. A fundamental tension underlying delegation is the fact that the agent's interests will typically differ -- potentially significantly -- from the interests of the principal, and as a result the agent may propose solutions based on their own incentives that are inefficient for the principal. A basic problem, therefore, is to design mechanisms by which the principal can constrain the set of proposals they are willing to accept from the agent, to ensure a certain level of quality for the principal from the proposed solution. In this work, we investigate how much the principal loses -- quantitatively, in terms of the objective they are trying to optimize -- when they delegate to an agent. We develop a methodology for bounding this loss of efficiency, and show that in a very general model of delegation, there is a family of mechanisms achieving a universal bound on the ratio between the quality of the solution obtained through delegation and the quality the principal could obtain in an idealized benchmark where they searched for a solution themself. Moreover, it is possible to achieve such bounds through mechanisms with a natural threshold structure, which are thus structurally simpler than the optimal mechanisms typically considered in the literature on delegation. At the heart of our framework is an unexpected connection between delegation and the analysis of prophet inequalities, which we leverage to provide bounds on the behavior of our delegation mechanisms.
more »
« less
ReDel: A Toolkit for LLM-Powered Recursive Multi-Agent Systems
Recently, there has been increasing interest in using Large Language Models (LLMs) to construct complex multi-agent systems to perform tasks such as compiling literature reviews, drafting consumer reports, and planning vacations. Many tools and libraries exist for helping create such systems, however none support *recursive* multi-agent systems—where the models themselves flexibly decide when to delegate tasks and how to organize their delegation structure. In this work, we introduce ReDel: a toolkit for recursive multi-agent systems that supports custom tool-use, delegation schemes, event-based logging, and interactive replay in an easy-to-use web interface. We show that, using ReDel, we are able to achieve significant performance gains on agentic benchmarks and easily identify potential areas of improvements through the visualization and debugging tools. Our code, documentation, and PyPI package are open-source at https://github.com/zhudotexe/redel, and free to use under the MIT license.
more »
« less
- Award ID(s):
- 1928474
- PAR ID:
- 10563493
- Publisher / Repository:
- Association for Computational Linguistics
- Date Published:
- Page Range / eLocation ID:
- 162 to 171
- Subject(s) / Keyword(s):
- LLMs Agents Multi-Agent Systems
- Format(s):
- Medium: X
- Location:
- Miami, Florida, USA
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Large Language Models (LLMs) have achieved remarkable success across a wide array of tasks. Due to the impressive planning and reasoning abilities of LLMs, they have been used as autonomous agents to do many tasks automatically. Recently, based on the development of using one LLM as a single planning or decision-making agent, LLM-based multi-agent systems have achieved considerable progress in complex problem-solving and world simulation. To provide the community with an overview of this dynamic field, we present this survey to offer an in-depth discussion on the essential aspects of multi-agent systems based on LLMs, as well as the challenges. Our goal is for readers to gain substantial insights on the following questions: What domains and environments do LLM-based multi-agents simulate? How are these agents profiled and how do they communicate? What mechanisms contribute to the growth of agents' capacities? For those interested in delving into this field of study, we also summarize the commonly used datasets or benchmarks for them to have convenient access. To keep researchers updated on the latest studies, we maintain an open-source GitHub repository, dedicated to outlining the research on LLM-based multi-agent systems.more » « less
-
Fine-grained visual reasoning tasks in multi-agent environments such as event prediction, agent type identification, or missing data imputation are important for multiple applications (e.g., autonomous surveillance over sensor networks and subtasks for reinforcement learning (RL)). StarCraft II game replays encode intelligent (and adversarial) multi-agent behavior and could provide a testbed for these tasks; however, extracting simple and standardized representations for prototyping these tasks is laborious and hinders reproducibility. In contrast, MNIST and CIFAR10, despite their extreme simplicity, have enabled rapid prototyping and reproducibility of ML methods. Following the simplicity of these datasets, we construct a benchmark fine-grained multi-agent categorization dataset based on StarCraft II replays that exhibit complex multi-agent behaviors, while still being as easy to use as MNIST and CIFAR10. Specifically, we carefully summarize a window of 255 consecutive game states to create 3.6 million summary images from 60,000 replays, including all relevant metadata such as game outcome and player races. We develop three formats of decreasing complexity: Hyperspectral images that include one channel for every unit type (similar to multispectral geospatial images), RGB images that mimic CIFAR10, and grayscale images that mimic MNIST. We show how this dataset can be used for prototyping fine-grained multi-agent categorization methods. All datasets, code for extraction, and code for dataset loading can be found at https://starcraftdata.davidinouye.com/.</p>more » « less
-
Successful collaboration requires team members to stay aligned, especially in complex sequential tasks. Team members must dynamically coordinate which subtasks to perform and in what order. However, real-world constraints like partial observability and limited communication bandwidth often lead to suboptimal collaboration. Even among expert teams, the same task can be executed in multiple ways. To develop multi-agent systems and human-AI teams for such tasks, we are interested in data-driven learning of multimodal team behaviors. Multi-Agent Imitation Learning (MAIL) provides a promising framework for data-driven learning of team behavior from demonstrations, but existing methods struggle with heterogeneous demonstrations, as they assume that all demonstrations originate from a single team policy. Hence, in this work, we introduce DTIL: a hierarchical MAIL algorithm designed to learn multimodal team behaviors in complex sequential tasks. DTIL represents each team member with a hierarchical policy and learns these policies from heterogeneous team demonstrations in a factored manner. By employing a distribution-matching approach, DTIL mitigates compounding errors and scales effectively to long horizons and continuous state representations. Experimental results show that DTIL outperforms MAIL baselines and accurately models team behavior across a variety of collaborative scenarios.more » « less
-
The emergency of machine type and ultra-reliable low latency communication is imposing stringent constraints for service provisioning. Addressing such constraints is challenging for network and cloud service providers. As a trending paradigm, software-defined networking (SDN) plays a significant role in future networks and services. However, the classical implementation of the SDN controller has limitations in-terms-of latency and reliability since the controller is decoupled from the forwarding device. Several research works have tried to tackle these challenges by proposing solutions such as Devoflow, DIFANE, and hierarchical and distributed controller deployment. Nonetheless, these approaches are not fully addressing these challenges. This paper tries to address the problem of latency and reliability by proposing a dynamic controller role delegation architecture for forwarding devices. To align with the microservice or multi-agent-based service-based architecture, the role delegation function as a service is proposed. The dynamic role delegation enables to predict and (pre-)installed flow rules in the forwarding devices based on various considerations such as network state, packet type, and service's stringent requirements. The proposed architecture is implemented and evaluated for latency and resiliency performance in comparison to the centralized and distributed deployment of the SDN controller. We used ComNetsEmu, a softwarized network emulation tool, to emulate SDN and NFV (Network Function Virtualization). The result indicated a significant decrease in latency and improved resilience in case of failure, yielding better network performance.more » « less
An official website of the United States government

