Towards Uniformly Superhuman Autonomy via Subdominance Minimization

Ziebart, Brian; Choudhury, Sanjiban; Yan, Xinyan; Vernaza, Paul

Citation Details

Prevalent imitation learning methods seek to produce behavior that matches or exceeds average human performance. This often prevents achieving expert-level or superhuman performance when identifying the better demonstrations to imitate is difficult. We instead assume demonstrations are of varying quality and seek to induce behavior that is unambiguously better (i.e., Pareto dominant or minimally subdominant) than all human demonstrations. Our minimum subdominance inverse optimal control training objective is primarily defined by high quality demonstrations; lower quality demonstrations, which are more easily dominated, are effectively ignored instead of degrading imitation. With increasing probability, our approach produces superhuman behavior incurring lower cost than demonstrations on the demonstrator’s unknown cost function{—}even if that cost function differs for each demonstration. We apply our approach on a computer cursor pointing task, producing behavior that is 78% superhuman, while minimizing demonstration suboptimality provides 50% superhuman behavior{—}and only 72% even after selective data cleaning. more »

Award ID(s):: 1652530 1939743 1838770

PAR ID:: 10355186

Author(s) / Creator(s):: Ziebart, Brian; Choudhury, Sanjiban; Yan, Xinyan; Vernaza, Paul

Date Published:: 2022-07-01

Journal Name:: International Conference on Machine Learning

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Conference Paper:
The DOI is not currently available.

More Like this