Teamwork is a set of interrelated reasoning, actions, and behaviors of team members that facilitate common objectives. Teamwork theory and experiments have resulted in a set of states and processes for team effectiveness in both human-human and agent-agent teams. However, human-agent teaming is less well studied because it is so new and involves asymmetry in policy and intent not present in human teams. To optimize team performance in human-agent teaming, it is critical that agents infer human intent and adapt their policies for smooth coordination. Most literature in human-agent teaming builds agents that reference a learned human model. Though these agents are guaranteed to perform well with the learned model, they place strong assumptions on the human policy, such as optimality and consistency, which are unlikely to hold in many real-world scenarios. In this paper, we propose a novel adaptive agent architecture in a human-model-free setting for a two-player cooperative game, Team Space Fortress (TSF). Previous human-human team research has shown complementary policies in the TSF game and diversity in human players' skill, which encourages us to relax assumptions on the human policy. We therefore forgo learning a human model from human data and instead apply an adaptation strategy over a pre-trained library of exemplar policies, built from RL algorithms or rule-based methods, with minimal assumptions about human behavior. The adaptation strategy relies on a novel similarity metric to infer the human policy and then selects the most complementary policy in our library to maximize team performance. The adaptive agent architecture can be deployed in real time and generalizes to any off-the-shelf static agent. We conducted human-agent experiments to evaluate the proposed adaptive agent framework, and demonstrated the suboptimality, diversity, and adaptability of human policies in human-agent teams.
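The abstract leaves the similarity metric and the selection rule abstract. A minimal sketch of the adapt-by-selection loop, assuming the metric is a simple action-agreement score over recently observed states and that the most complementary partner for each exemplar was scored offline (`policy_similarity`, `select_partner`, and `complement_of` are hypothetical names, not the paper's API):

```python
import numpy as np

def policy_similarity(states, human_actions, policy):
    """Hypothetical stand-in for the paper's similarity metric: the
    fraction of observed states in which the library policy would have
    chosen the same action as the human."""
    matches = [policy(s) == a for s, a in zip(states, human_actions)]
    return float(np.mean(matches))

def select_partner(states, human_actions, library, complement_of):
    """Infer which exemplar the human most resembles, then return the
    partner policy pre-scored (offline) as most complementary to it."""
    scores = {name: policy_similarity(states, human_actions, pi)
              for name, pi in library.items()}
    inferred = max(scores, key=scores.get)  # most human-like exemplar
    return complement_of[inferred]          # best teammate for that exemplar
```

Here `complement_of` would be built offline, e.g. by pairing exemplars in self-play and keeping, for each exemplar, the partner with the best team score, in line with the pre-trained-library approach the abstract describes.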
Using Human-Agent Teams to Purposefully Design Multi-Agent Systems
With multi-agent teams becoming more of a reality every day, it is important to create a common design model for multi-agent teams. These teams need to be able to function in dynamic environments and still communicate with any humans that may need a problem solved. Existing human-agent research can be used to purposefully create multi-agent teams that are interdependent but can still interact with humans. The most effective way to cope with the dynamic nature of modern workloads is to create a dynamic team configuration rather than individual member-agents that change their roles. Multi-agent teams will require a variety of agents to be designed to cover the diverse subset of problems that need to be solved in the modern workforce. A model based on existing multi-agent teams that satisfies the needs of human-agent teams has been created to serve as a baseline for human-interactive multi-agent teams.
- Award ID(s): 1829008
- PAR ID: 10184348
- Date Published:
- Journal Name: Proceedings of the Human Factors and Ergonomics Society Annual Meeting
- Volume: 63
- Issue: 1
- ISSN: 2169-5067
- Page Range / eLocation ID: 1425 to 1429
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Multi-Agent Path Finding (MA-PF) computes a set of collision-free paths for multiple agents from their respective starting locations to destinations. This paper considers a generalization of MA-PF called Multi-Agent Teamwise Cooperative Path Finding (MA-TC-PF), where agents are grouped into multiple teams and each team has its own objective to be minimized. For example, an objective can be the sum or max of the individual arrival times of the agents. In general, there is more than one team, and MA-TC-PF is thus a multi-objective planning problem with the goal of finding the entire Pareto-optimal front that represents all possible trade-offs among the objectives of the teams. To solve MA-TC-PF, we propose two algorithms, TC-CBS and TC-M*, which leverage the existing CBS and M* for conventional MA-PF. We discuss the conditions under which the proposed algorithms are complete and are guaranteed to find the Pareto-optimal front. We present numerical results for several types of MA-TC-PF problems. (See the Pareto-front sketch after this list.)
- Trust has been identified as a central factor for effective human-robot teaming. Existing literature on trust modeling predominantly focuses on dyadic human-autonomy teams where one human agent interacts with one robot. There is little, if any, research on trust modeling in teams consisting of multiple human agents and multiple robotic agents. To fill this research gap, we present the trust inference and propagation (TIP) model for trust modeling in multi-human multi-robot teams. We assert that in a multi-human multi-robot team, there exist two types of experiences that any human agent has with any robot: direct and indirect experiences. The TIP model presents a novel mathematical framework that explicitly accounts for both types of experiences. To evaluate the model, we conducted a human-subject experiment with 15 pairs of participants (N=30). Each pair performed a search and detection task with two drones. Results show that our TIP model successfully captured the underlying trust dynamics and significantly outperformed a baseline model. To the best of our knowledge, the TIP model is the first mathematical framework for computational trust modeling in multi-human multi-robot teams. (See the trust-update sketch after this list.)
- Effective coordination of design teams must account for the influence of costs incurred while searching for the best design solutions. This article introduces a cost-aware multi-agent system (MAS), a theoretical model to (1) explain how individuals in a team should search, assuming that they are all rational utility-maximizing decision-makers, and (2) study the impact of cost on the search performance of both individual agents and the system. First, we develop a new multi-agent Bayesian optimization framework accounting for information exchange among agents to support their decisions on where to sample in search. Second, we employ a reinforcement learning approach based on the multi-agent deep deterministic policy gradient for training the MAS to identify where agents cannot sample due to design constraints. Third, we propose a new cost-aware stopping criterion for each agent to determine when costs outweigh potential gains in search as a criterion to stop. Our results indicate that cost has a more significant impact on MAS communication in complex design problems than in simple ones. For example, when searching in complex design spaces, some agents could initially have low performance gains, thus stopping prematurely due to negative payoffs, even if those agents could perform better in the later stage of the search. Therefore, global-local communication becomes more critical in such situations for the entire system to converge. The proposed model can serve as a benchmark for empirical studies to quantitatively gauge how humans would rationally make design decisions in a team. (See the cost-aware stopping sketch after this list.)
- As deep reinforcement learning (RL) is applied to more tasks, there is a need to visualize and understand the behavior of learned agents. Saliency maps explain agent behavior by highlighting the features of the input state that are most relevant for the agent in taking an action. Existing perturbation-based approaches to compute saliency often highlight regions of the input that are not relevant to the action taken by the agent. Our proposed approach, SARFA (Specific and Relevant Feature Attribution), generates more focused saliency maps by balancing two aspects (specificity and relevance) that capture different desiderata of saliency. The first captures the impact of perturbation on the relative expected reward of the action to be explained. The second down-weighs irrelevant features that alter the relative expected rewards of actions other than the action to be explained. We compare SARFA with existing approaches on agents trained to play board games (Chess and Go) and Atari games (Breakout, Pong, and Space Invaders). We show through illustrative examples (Chess, Atari, Go), human studies (Chess), and automated evaluation methods (Chess) that SARFA generates saliency maps that are more interpretable for humans than existing approaches. (See the SARFA sketch after this list.)