
Title: Pragmatic-Pedagogic Value Alignment
As intelligent systems gain autonomy and capability, it becomes vital to ensure that their objectives match those of their human users; this is known as the value-alignment problem. In robotics, value alignment is key to the design of collaborative robots that can integrate into human workflows, successfully inferring and adapting to their users’ objectives as they go. We argue that a meaningful solution to value alignment must combine multi-agent decision theory with rich mathematical models of human cognition, enabling robots to tap into people’s natural collaborative capabilities. We present a solution to the cooperative inverse reinforcement learning (CIRL) dynamic game based on well-established cognitive models of decision making and theory of mind. The solution captures a key reciprocity relation: the human will not plan her actions in isolation, but rather reason pedagogically about how the robot might learn from them; the robot, in turn, can anticipate this and interpret the human’s actions pragmatically. To our knowledge, this work constitutes the first formal analysis of value alignment grounded in empirically validated cognitive models.
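To make the pedagogic-pragmatic reciprocity concrete, the following is a minimal illustrative sketch, not the paper's CIRL solution: a Boltzmann-rational ("literal") human model, a pedagogic human who selects actions according to how well they teach a robot that uses the literal model, and a pragmatic robot that inverts the pedagogic model. The action-value table Q, the inverse temperature beta, and the uniform prior over candidate objectives are assumptions made for illustration.

```python
import numpy as np

# Illustrative sketch of a pragmatic-pedagogic recursion over a discrete set of
# candidate human objectives (reward parameters) and a discrete action set.
# Q[a, k] is an assumed, pre-computed value of action a under objective k.

def literal_human(Q, beta=2.0):
    """Boltzmann-rational ("literal") human: P(a | theta) ∝ exp(beta * Q(a, theta))."""
    logits = beta * Q
    p = np.exp(logits - logits.max(axis=0, keepdims=True))
    return p / p.sum(axis=0, keepdims=True)          # shape: (n_actions, n_thetas)

def literal_robot(p_a_given_theta, prior):
    """Robot belief from a literal human model: P(theta | a) ∝ P(a | theta) P(theta)."""
    joint = p_a_given_theta * prior[None, :]
    return joint / joint.sum(axis=1, keepdims=True)  # each row: posterior over thetas

def pedagogic_human(Q, prior, beta=2.0):
    """Pedagogic human: picks actions in proportion to how well they teach theta
    to a robot that reasons with the literal model."""
    b_literal = literal_robot(literal_human(Q, beta), prior)
    p = b_literal ** beta                            # exponentiated informativeness
    return p / p.sum(axis=0, keepdims=True)

def pragmatic_robot(Q, prior, beta=2.0):
    """Pragmatic robot: inverts the pedagogic human model instead of the literal one."""
    return literal_robot(pedagogic_human(Q, prior, beta), prior)

# Toy example: 3 actions, 2 candidate objectives, uniform prior over objectives.
Q = np.array([[1.0, 0.2],
              [0.8, 0.9],
              [0.1, 1.0]])
prior = np.array([0.5, 0.5])
print(pragmatic_robot(Q, prior))   # robot's posterior over objectives after each action
```

The recursion above mirrors a single step of the pedagogic/pragmatic reasoning described in the abstract; in the full dynamic game, such beliefs would be carried forward and updated across time steps.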
Authors:
Award ID(s):
1734633
Publication Date:
NSF-PAR ID:
10063837
Journal Name:
International Symposium on Robotics Research (ISRR)
Sponsoring Org:
National Science Foundation
More Like this
  1. A prerequisite for social coordination is bidirectional communication between teammates, each playing two roles simultaneously: as receptive listeners and expressive speakers. For robots working with humans in complex situations with multiple goals that differ in importance, failure to fulfill the expectation of either role could undermine group performance due to misalignment of values between humans and robots. Specifically, a robot needs to serve as an effective listener to infer human users’ intents from instructions and feedback and as an expressive speaker to explain its decision processes to users. Here, we investigate how to foster effective bidirectional human-robot communication in the context of value alignment, in which collaborative robots and users form an aligned understanding of the importance of possible task goals. We propose an explainable artificial intelligence (XAI) system in which a group of robots predicts users’ values by taking in situ feedback into consideration while communicating their decision processes to users through explanations. To learn from human feedback, our XAI system integrates a cooperative communication model for inferring human values associated with multiple desirable goals (a minimal sketch of this inference step appears after this list). To be interpretable to humans, the system simulates human mental dynamics and predicts optimal explanations using graphical models. We conducted psychological experiments to examine the core components of the proposed computational framework. Our results show that real-time human-robot mutual understanding in complex cooperative tasks is achievable with a learning model based on bidirectional communication. We believe that this interaction framework can shed light on bidirectional value alignment in communicative XAI systems and, more broadly, in future human-machine teaming systems.
  2. Explainability has emerged as a critical AI research objective, but the breadth of proposed methods and application domains suggests that criteria for explanation vary greatly. In particular, what counts as a good explanation, and what kinds of explanation are computationally feasible, has become trickier in light of opaque “black box” systems such as deep neural networks. Explanation in such cases has drifted from what many philosophers stipulated as having to involve deductive and causal principles to mere “interpretation,” which approximates what happened in the target system to varying degrees. However, such post hoc constructed rationalizations are highly problematic for social robots that operate interactively in spaces shared with humans. In such social contexts, explanations of behavior, and in particular justifications for violations of expected behavior, should make reference to socially accepted principles and norms. In this article, we show how a social robot’s actions can face explanatory demands for how it came to act on its decision, what goals, tasks, or purposes its design intended those actions to pursue, and what norms or social constraints the system recognizes in the course of its action. As a result, we argue that explanations for social robots will need to be accurate representations of the system’s operation along causal, purposive, and justificatory lines. These explanations will need to generate appropriate references to principles and norms; explanations based on mere “interpretability” will ultimately fail to connect the robot’s behaviors to their appropriate determinants. We then lay out the foundations for a cognitive robotic architecture for HRI, together with particular component algorithms, for generating explanations and engaging in justificatory dialogues with human interactants. Such explanations track the robot’s actual decision-making and behavior, which themselves are determined by normative principles the robot can describe and use for justifications.
  3. Wagner, A.R. (Ed.)
    Collaborative robots that provide anticipatory assistance are able to help people complete tasks more quickly. As anticipatory assistance is provided before help is explicitly requested, there is a chance that this action itself will influence the person’s future decisions in the task. In this work, we investigate whether a robot’s anticipatory assistance can drive people to make choices different from those they would otherwise make. Such a study requires measuring intent, which itself could modify intent, resulting in an observer paradox. To combat this, we carefully designed the experiment to avoid this effect, considering several mitigations such as carefully choosing which human behavioral signals to use for measuring intent and designing unobtrusive ways to obtain these signals. We conducted a user study (N = 99) in which participants completed a collaborative object retrieval task: users selected an object and a robot arm retrieved it for them. The robot predicted the user’s object selection from eye gaze in advance of their explicit selection, and then provided either collaborative anticipation (moving toward the predicted object), adversarial anticipation (moving away from the predicted object), or no anticipation (no movement, control condition). We found trends and participant comments suggesting that people’s decision making changes in the presence of robot anticipatory motion and that this change differs depending on the robot’s anticipation strategy.
  4. Many real-life scenarios require humans to make difficult trade-offs: do we always follow all the traffic rules, or do we violate the speed limit in an emergency? In general, how should we account for and balance ethical values, safety recommendations, and societal norms when we are trying to achieve a certain objective? To enable effective AI-human collaboration, we must equip AI agents with a model of how humans make such trade-offs in environments where there is not only a goal to be reached, but also ethical constraints to be considered and possibly aligned with. These ethical constraints could be deontological rules on actions that should not be performed, or consequentialist policies that recommend avoiding certain states of the world. Our purpose is to build AI agents that can mimic human behavior in these ethically constrained decision environments, with the long-term research goal of using AI to help humans make better moral judgments and take better actions. To this end, we propose a computational approach where competing objectives and ethical constraints are orchestrated through a method that leverages a cognitive model of human decision making, called multi-alternative decision field theory (MDFT); a sketch of its preference-accumulation dynamics appears after this list. Using MDFT, we build an orchestrator, called MDFT-Orchestrator (MDFT-O), that is both general and flexible. We also show experimentally that MDFT-O not only generates better decisions than a heuristic that takes a weighted average of competing policies (WA-O), but also performs better at mimicking human decisions as collected through Amazon Mechanical Turk (AMT). Our methodology is therefore able to faithfully model human decisions in ethically constrained decision environments.
  5. Robots have begun operating and collaborating with humans in industrial and social settings. This collaboration introduces challenges: the robot must plan while taking the human’s actions into account. In prior work, the problem was posed as a 2-player deterministic game with a limited number of human moves. The limit on human moves is unintuitive, and in many settings determinism is undesirable. In this paper, we present a novel planning method for collaborative human-robot manipulation tasks via probabilistic synthesis. We introduce a probabilistic manipulation domain that captures the interaction by allowing for both robot and human actions, with states that represent the configurations of the objects in the workspace. The task is specified using Linear Temporal Logic over finite traces (LTLf). We then transform our manipulation domain into a Markov Decision Process (MDP) and synthesize an optimal policy to satisfy the specification on this MDP; a sketch of this synthesis step appears after this list. We present two novel contributions: a formalization of probabilistic manipulation domains that allows us to apply existing techniques, and a comparison of different encodings of these domains. Our framework is validated on a physical UR5 robot.
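Regarding item 1 above, the following is a minimal sketch of the value-inference step described there: a Bayesian update of a belief over candidate goal-importance weights from a single piece of in situ feedback. It is not the paper's cooperative communication model; the candidate weights, feature vector, feedback encoding, and rationality parameter beta are illustrative assumptions.

```python
import numpy as np

# Sketch: update a belief over the relative importance of task goals from in situ feedback.

def update_value_belief(belief, candidate_weights, goal_features, feedback, beta=3.0):
    """Bayesian update: P(w | feedback) ∝ P(feedback | w) P(w).

    belief            -- prior probability of each candidate weight vector
    candidate_weights -- (n_candidates, n_goals) importance weights
    goal_features     -- (n_goals,) how much the robot's last plan served each goal
    feedback          -- +1 if the user approved the plan, -1 if they objected
    """
    utilities = candidate_weights @ goal_features             # plan utility under each hypothesis
    likelihood = 1.0 / (1.0 + np.exp(-beta * feedback * utilities))
    posterior = belief * likelihood
    return posterior / posterior.sum()

# Example: three hypotheses about which of two goals matters more, uniform prior.
candidates = np.array([[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]])
belief = np.ones(3) / 3
belief = update_value_belief(belief, candidates,
                             goal_features=np.array([1.0, -0.5]), feedback=+1)
print(belief)   # mass shifts toward hypotheses that value the first goal more
```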
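Regarding item 4 above, the following is a hedged sketch of the MDFT preference-accumulation dynamics that the orchestrator builds on, P(t+1) = S P(t) + C M W(t), not the MDFT-Orchestrator itself. The value matrix M, feedback matrix S, attention probabilities, and decision threshold are illustrative assumptions.

```python
import numpy as np

# Sketch of MDFT deliberation: stochastic attention over attributes drives
# preference accumulation with contrast (C) and lateral inhibition / decay (S).

def mdft_choice(M, S, attention_probs, threshold=2.0, max_steps=1000, rng=None):
    """Accumulate preferences P(t+1) = S @ P(t) + C @ M @ W(t) until one
    alternative's preference crosses `threshold` (or deliberation times out)."""
    rng = rng or np.random.default_rng()
    n_alt, n_attr = M.shape
    C = np.eye(n_alt) - np.ones((n_alt, n_alt)) / n_alt   # contrast matrix
    P = np.zeros(n_alt)
    for _ in range(max_steps):
        W = np.zeros(n_attr)
        W[rng.choice(n_attr, p=attention_probs)] = 1.0    # attend to one attribute this step
        P = S @ P + C @ M @ W
        if P.max() >= threshold:
            break
    return int(np.argmax(P)), P

# Example: three alternatives evaluated on two attributes (e.g., goal progress
# vs. compliance with an ethical constraint), with mild lateral inhibition.
M = np.array([[0.9, 0.1],
              [0.5, 0.5],
              [0.2, 0.9]])
S = 0.95 * np.eye(3) - 0.02 * (np.ones((3, 3)) - np.eye(3))
choice, prefs = mdft_choice(M, S, attention_probs=[0.4, 0.6],
                            rng=np.random.default_rng(1))
print(choice, prefs)
```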
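Regarding item 5 above, the following is a minimal sketch of the final synthesis step: value iteration that maximizes the probability of reaching accepting states on a (product) MDP encoding the LTLf task. The LTLf-to-automaton translation and the product construction are assumed to have been performed already; the transition tensor and accepting set below are toy inputs.

```python
import numpy as np

# Sketch: maximum-reachability policy synthesis on a product MDP whose
# accepting states encode satisfaction of the LTLf specification.

def max_reachability_policy(T, accepting, iters=200):
    """T[a] is an (n_states, n_states) transition matrix for action a;
    `accepting` is a boolean mask over states. Returns (values, policy)."""
    V = accepting.astype(float)                  # prob. of satisfaction from each state
    for _ in range(iters):
        Q = T @ V                                # Q[a, s] = expected value of action a in state s
        V_new = np.where(accepting, 1.0, Q.max(axis=0))
        if np.allclose(V_new, V):
            break
        V = V_new
    policy = Q.argmax(axis=0)                    # greedy action per state
    return V, policy

# Toy 3-state example: state 2 is accepting; action 1 reaches it more reliably.
T = np.array([
    [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],   # action 0
    [[0.1, 0.3, 0.6], [0.0, 0.2, 0.8], [0.0, 0.0, 1.0]],   # action 1
])
accepting = np.array([False, False, True])
V, policy = max_reachability_policy(T, accepting)
print(V, policy)    # values ≈ probability of eventually satisfying the spec
```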