skip to main content

Title: Adaptive Agent Architecture for Real-time Human-Agent Teaming
Teamwork is a set of interrelated reasoning, actions and behaviors of team members that facilitate common objectives. Teamwork theory and experiments have resulted in a set of states and processes for team effectiveness in both human-human and agent-agent teams. However, human-agent teaming is less well studied because it is so new and involves asymmetry in policy and intent not present in human teams. To optimize team performance in human-agent teaming, it is critical that agents infer human intent and adapt their polices for smooth coordination. Most literature in human-agent teaming builds agents referencing a learned human model. Though these agents are guaranteed to perform well with the learned model, they lay heavy assumptions on human policy such as optimality and consistency, which is unlikely in many real-world scenarios. In this paper, we propose a novel adaptive agent architecture in human-model-free setting on a two-player cooperative game, namely Team Space Fortress (TSF). Previous human-human team research have shown complementary policies in TSF game and diversity in human players’ skill, which encourages us to relax the assumptions on human policy. Therefore, we discard learning human models from human data, and instead use an adaptation strategy on a pre-trained library of exemplar policies composed of RL algorithms or rule-based methods with minimal assumptions of human behavior. The adaptation strategy relies on a novel similarity metric to infer human policy and then selects the most complementary policy in our library to maximize the team performance. The adaptive agent architecture can be deployed in real-time and generalize to any off-the-shelf static agents. We conducted human-agent experiments to evaluate the proposed adaptive agent framework, and demonstrated the suboptimality, diversity, and adaptability of human policies in human-agent teams.  more » « less
Award ID(s):
Author(s) / Creator(s):
Date Published:
Journal Name:
AAAI Workshop on Plan, Activity, and Intent Recognition
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Agmon, N ; An, B ; Ricci, A ; Yeoh, W. (Ed.)
    In multiagent systems that require coordination, agents must learn diverse policies that enable them to achieve their individual and team objectives. Multiagent Quality-Diversity methods partially address this problem by filtering the joint space of policies to smaller sub-spaces that make the diversification of agent policies tractable. However, in teams of asymmetric agents (agents with different objectives and capabilities), the search for diversity is primarily driven by the need to find policies that will allow agents to assume complementary roles required to work together in teams. This work introduces Asymmetric Island Model (AIM), a multiagent framework that enables populations of asymmetric agents to learn diverse complementary policies that foster teamwork via dynamic population size allocation on a wide variety of team tasks. The key insight of AIM is that the competitive pressure arising from the distribution of policies on different team-wide tasks drives the agents to explore regions of the policy space that yield specializations that generalize across tasks. Simulation results on multiple variations of a remote habitat problem highlight the strength of AIM in discovering robust synergies that allow agents to operate near-optimally in response to the changing team composition and policies of other agents. 
    more » « less
  2. null (Ed.)
    Cooperative Co-evolutionary Algorithms effectively train policies in multiagent systems with a single, statically defined team. However, many real-world problems, such as search and rescue, require agents to operate in multiple teams. When the structure of the team changes, these policies show reduced performance as they were trained to cooperate with only one team. In this work, we solve the cooperation problem by training agents to fill the needs of an arbitrary team, thereby gaining the ability to support a large variety of teams. We introduce Ad hoc Teaming Through Evolution (ATTE) which evolves a limited number of policy types using fitness aggregation across multiple teams. ATTE leverages agent types to reduce the dimensionality of the interaction search space, while fitness aggregation across teams selects for more adaptive policies. In a simulated multi-robot exploration task, ATTE is able to learn policies that are effective in a variety of teaming schemes, improving the performance of CCEA by a factor of up to five times. 
    more » « less
  3. Effective human-human and human-autonomy teamwork is critical but often challenging to perfect. The challenge is particularly relevant in time-critical domains, such as healthcare and disaster response, where the time pressures can make coordination increasingly difficult to achieve and the consequences of imperfect coordination can be severe. To improve teamwork in these and other domains, we present TIC: an automated intervention approach for improving coordination between team members. Using BTIL, a multi-agent imitation learning algorithm, our approach first learns a generative model of team behavior from past task execution data. Next, it utilizes the learned generative model and team's task objective (shared reward) to algorithmically generate execution-time interventions. We evaluate our approach in synthetic multi-agent teaming scenarios, where team members make decentralized decisions without full observability of the environment. The experiments demonstrate that the automated interventions can successfully improve team performance and shed light on the design of autonomous agents for improving teamwork. 
    more » « less
  4. Several recent works have found the emergence of grounded com-positional language in the communication protocols developed bymostly cooperative multi-agent systems when learned end-to-endto maximize performance on a downstream task. However, humanpopulations learn to solve complex tasks involving communicativebehaviors not only in fully cooperative settings but also in scenar-ios where competition acts as an additional external pressure forimprovement. In this work, we investigate whether competitionfor performance from an external, similar agent team could actas a social influence that encourages multi-agent populations todevelop better communication protocols for improved performance,compositionality, and convergence speed. We start fromTask &Talk, a previously proposed referential game between two coopera-tive agents as our testbed and extend it intoTask, Talk & Compete,a game involving two competitive teams each consisting of twoaforementioned cooperative agents. Using this new setting, we pro-vide an empirical study demonstrating the impact of competitiveinfluence on multi-agent teams. Our results show that an externalcompetitive influence leads to improved accuracy and generaliza-tion, as well as faster emergence of communicative languages thatare more informative and compositional. 
    more » « less
  5. Andreas Krause, Emma Brunskill (Ed.)
    Executing actions in a correlated manner is a common strategy for human coordination that often leads to better cooperation, which is also potentially beneficial for cooperative multi-agent reinforcement learning (MARL). However, the recent success of MARL relies heavily on the convenient paradigm of purely decentralized execution, where there is no action correlation among agents for scalability considerations. In this work, we introduce a Bayesian network to inaugurate correlations between agents’ action selections in their joint policy. Theoretically, we establish a theoretical justification for why action dependencies are beneficial by deriving the multi-agent policy gradient formula under such a Bayesian network joint policy and proving its global convergence to Nash equilibria under tabular softmax policy parameterization in cooperative Markov games. Further, by equipping existing MARL algorithms with a recent method of differentiable directed acyclic graphs (DAGs), we develop practical algorithms to learn the context-aware Bayesian network policies in scenarios with partial observability and various difficulty. We also dynamically decrease the sparsity of the learned DAG throughout the training process, which leads to weakly or even purely independent policies for decentralized execution. Empirical results on a range of MARL benchmarks show the benefits of our approach. 
    more » « less