Title: The Sabre Narrative Planner: Multi-Agent Coordination with Intentions and Beliefs
Sabre is a narrative planner—a centralized, omniscient decision maker that solves a multi-agent storytelling problem. The planner has an author goal it must achieve, but every action taken by an agent must make sense according to that agent's individual intentions and limited, possibly wrong beliefs. This paper describes the implementation of Sabre, which supports a rich action syntax and imposes no arbitrary limit on the depth of theory of mind. We present a search procedure for generating plans that achieve the author goals while ensuring all agent actions are explained, and we report the system's performance on several narrative planning benchmark problems.
Award ID(s): 1911053
NSF-PAR ID: 10300907
Editor(s): Endriss, U.; Nowé, A.; Dignum, F.; Lomuscio, A.
Journal Name: AAMAS Conference proceedings
ISSN: 2523-5699
Page Range / eLocation ID: 1698-1700
Sponsoring Org: National Science Foundation
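
The abstract above describes a search that must both reach the author's goal and keep every character action explained. As a rough illustration of that idea (not Sabre's actual implementation; the Action type, the explains check, and the author_goal predicate below are all invented), here is a minimal forward search that prunes any action the acting agent has no reason to take:

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, FrozenSet

State = FrozenSet[str]  # a state as a set of true facts (a simplification)

@dataclass(frozen=True)
class Action:
    name: str
    agent: str
    precondition: Callable[[State], bool]
    effect: Callable[[State], State]

def plan(initial: State, author_goal, actions, explains, max_depth=10):
    """Breadth-first search for a plan that reaches author_goal while
    keeping every action explained for its acting agent."""
    frontier = deque([(initial, [])])
    while frontier:
        state, steps = frontier.popleft()
        if author_goal(state):
            return steps
        if len(steps) >= max_depth:
            continue
        for action in actions:
            if not action.precondition(state):
                continue
            # Prune actions the acting agent would not take: 'explains'
            # should check, from the agent's own believed state, whether
            # the action advances one of the agent's intended goals.
            if not explains(action.agent, action, state):
                continue
            frontier.append((action.effect(state), steps + [action]))
    return None  # no explained plan within the depth bound
```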
More Like this
  1. A valid and believable narrative plan must often meet at least two requirements: the author's goal must be satisfied by the end, and every action must make sense given the goals and beliefs of the character who takes it. Many narrative planners are based on progression, or forward search through the space of possible states. When reasoning about goals and beliefs, progression can be wasteful: the planner must either satisfy the author's goal first and then explain actions, backtracking when an explanation cannot be found, or explain actions as they are taken, which may waste effort explaining actions that are not relevant to the author's goal. We propose that regression, or backward search from goals, can address this problem. Regression ensures that every action sequence is intentional and reasons only about the agent beliefs needed for a plan to make sense.
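    To make the contrast concrete, here is a toy version of a single goal-regression step with an intentionality filter. It is only a sketch under invented names (the intends relation and the set-of-facts encoding are assumptions), not the algorithm from the paper:

```python
def regress(goal, action, intends):
    """Return the subgoal that must hold before `action` for `goal` to
    hold after it, or None if the action is unusable at this step."""
    if not (goal & action["adds"]):       # must contribute to the goal
        return None
    if goal & action["deletes"]:          # must not undo part of the goal
        return None
    # Intentionality filter: the acting agent must intend something
    # this action actually achieves.
    if not (intends[action["agent"]] & action["adds"]):
        return None
    return (goal - action["adds"]) | action["preconditions"]

# Toy usage (all facts and names invented):
goal = {"has(guard, sword)"}
confiscate = {"agent": "guard",
              "adds": {"has(guard, sword)"},
              "deletes": {"has(thief, sword)"},
              "preconditions": {"at(guard, vault)", "has(thief, sword)"}}
intends = {"guard": {"has(guard, sword)"}}
print(regress(goal, confiscate, intends))
# {'at(guard, vault)', 'has(thief, sword)'}  (set order may vary)
```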
  2. We pose and study the problem of planning in Markov decision processes (MDPs), subject to participation constraints as studied in mechanism design. In this problem, a planner must work with a self-interested agent on a given MDP. Each action in the MDP provides an immediate reward to the planner and a (possibly different) reward to the agent. The agent has no control in choosing the actions, but has the option to end the entire process at any time. The goal of the planner is to find a policy that maximizes her cumulative reward, taking into consideration the agent's ability to terminate. We give a fully polynomial-time approximation scheme for this problem. En route, we present polynomial-time algorithms for computing (exact) optimal policies for important special cases of this problem, including when the time horizon is constant, or when the MDP exhibits a "definitive decisions" property. We illustrate our algorithms with two different game-theoretic applications: the problem of assigning rides in ride-sharing and the problem of designing screening policies. Our results imply efficient algorithms for computing (approximately) optimal policies in both applications.
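    As a concrete illustration of how the participation constraint enters a dynamic program, here is a toy finite-horizon solver with deterministic transitions (an invented interface; the paper treats general MDPs). Note that this myopic backward pass is not optimal in general, since the planner can sometimes profit from a less greedy future policy that keeps the agent participating; that tension is what the paper's approximation scheme addresses.

```python
def solve(states, actions, T):
    """states: list of hashable states.
    actions[s]: list of (planner_reward, agent_reward, next_state) triples.
    Returns value functions and a policy that never gives the agent a
    negative continuation value (else the agent would terminate)."""
    V_p = {s: 0.0 for s in states}   # planner's value-to-go at the horizon
    V_a = {s: 0.0 for s in states}   # agent's value-to-go at the horizon
    policy = {}
    for t in range(T - 1, -1, -1):
        new_p, new_a = {}, {}
        for s in states:
            best = None  # (planner value, agent value, action index)
            for i, (rp, ra, s2) in enumerate(actions[s]):
                vp, va = rp + V_p[s2], ra + V_a[s2]
                if va < 0:           # agent would rather quit than continue
                    continue
                if best is None or vp > best[0]:
                    best = (vp, va, i)
            if best is None:         # no admissible action: process ends
                new_p[s], new_a[s] = 0.0, 0.0
            else:
                new_p[s], new_a[s] = best[0], best[1]
                policy[(t, s)] = best[2]
        V_p, V_a = new_p, new_a
    return V_p, V_a, policy
```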
  3. Pedagogical planners can provide adaptive support to students in narrative-centered learning environments by dynamically scaffolding student learning and tailoring problem scenarios. Reinforcement learning (RL) is frequently used for pedagogical planning in narrative-centered learning environments. However, RL-based pedagogical planning raises significant challenges due to the scarcity of data for training RL policies. Most prior work has relied on limited-size datasets and offline RL techniques for policy learning. Unfortunately, offline RL techniques do not support on-demand exploration and evaluation, which can adversely impact the quality of induced policies. To address the limitations of data scarcity and offline RL, we propose INSIGHT, an online RL framework for training data-driven pedagogical policies that optimize student learning in narrative-centered learning environments. The INSIGHT framework consists of three components: a narrative-centered learning environment simulator, a simulated student agent, and an RL-based pedagogical planner agent, which uses a reward metric associated with effective student learning processes. The framework enables the generation of synthetic data for on-demand exploration and evaluation of RL-based pedagogical planning. We have implemented INSIGHT with OpenAI Gym for a narrative-centered learning environment testbed with rule-based simulated student agents and a deep Q-learning-based pedagogical planner. Our results show that online deep RL algorithms can induce near-optimal pedagogical policies in the INSIGHT framework, while offline deep RL algorithms find only suboptimal policies even with large amounts of data.
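    The pattern the framework describes (a simulator, a rule-based simulated student, and an online RL planner that learns from on-demand interaction) can be sketched in a few lines. Everything below is invented for illustration, including the toy student dynamics, and it uses tabular Q-learning rather than the deep Q-learning the paper uses:

```python
import random
from collections import defaultdict

class SimulatedStudent:
    """Rule-based stand-in for a simulated student: the state is a coarse
    knowledge level 0..4; pedagogical actions are 0 = hint,
    1 = harder problem, 2 = review. All dynamics here are invented."""
    def reset(self):
        self.level = 0
        return self.level

    def step(self, action):
        # Crude rule: hints help beginners, harder problems help
        # mid-level students, review rarely helps on its own.
        gain = {0: self.level == 0, 1: 1 <= self.level <= 3, 2: False}[action]
        if gain or random.random() < 0.1:
            self.level = min(self.level + 1, 4)
        done = self.level == 4
        return self.level, float(gain), done

def q_learn(env, episodes=2000, alpha=0.1, gamma=0.95, eps=0.1):
    """Online tabular Q-learning: the planner explores the simulator
    on demand instead of learning from a fixed offline dataset."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            a = random.randrange(3) if random.random() < eps else \
                max(range(3), key=lambda a: Q[(s, a)])
            s2, r, done = env.step(a)
            target = r if done else r + gamma * max(Q[(s2, b)] for b in range(3))
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q

policy = q_learn(SimulatedStudent())
```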
  4. In open agent systems, the set of agents that are cooperating or competing changes over time and in ways that are nontrivial to predict. For example, if collaborative robots were tasked with fighting wildfires, they may run out of suppressants and be temporarily unavailable to assist their peers. We consider the problem of planning in these contexts with the additional challenges that the agents are unable to communicate with each other and that there are many of them. Because an agent's optimal action depends on the actions of others, each agent must not only predict the actions of its peers but, before that, reason about whether they are even present to perform an action. Addressing openness thus requires agents to model each other's presence, which becomes computationally intractable with high numbers of agents. We present a novel, principled, and scalable method in this context that enables an agent to reason about others' presence in its shared environment and about their actions. Our method extrapolates models of a few peers to the overall behavior of the many-agent system and combines this extrapolation with a generalization of Monte Carlo tree search to perform individual agent reasoning in many-agent open environments. Theoretical analyses establish the number of agents to model in order to achieve acceptable worst-case bounds on extrapolation error, as well as regret bounds on the agent's utility from modeling only some neighbors. Simulations of multiagent wildfire suppression problems demonstrate our approach's efficacy compared with alternative baselines.
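    A rough sketch of the extrapolation idea, with invented names and dynamics (the paper's method additionally integrates this with a Monte Carlo tree search generalization and provides error and regret bounds): model a small sample of peers explicitly and scale their estimated presence rate up to the full population.

```python
import random

def estimate_presence(sampled_peers, observe):
    """Fraction of a small sample of explicitly modeled peers that are
    observed to still be active (e.g., still carrying suppressant)."""
    active = sum(1 for peer in sampled_peers if observe(peer))
    return active / len(sampled_peers)

def expected_helpers(n_agents, sampled_peers, observe, act_prob):
    """Extrapolate: expected number of the n_agents peers that are both
    present and choosing to fight the fire, scaled up from the sample."""
    return n_agents * estimate_presence(sampled_peers, observe) * act_prob

# Toy usage: explicitly model 5 of 100 peers; assume each present peer
# fights the focal fire with probability 0.7 (both numbers invented).
sample = list(range(5))
observe = lambda peer: random.random() < 0.8   # stand-in for observations
print(expected_helpers(100, sample, observe, act_prob=0.7))
```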