The difficulty in specifying rewards for many real-world problems has led to an increased focus on learning rewards from human feedback, such as demonstrations. However, there are often many different reward functions that explain the human feedback, leaving agents with uncertainty over what the true reward function is. While most policy optimization approaches handle this uncertainty by optimizing for expected performance, many applications demand risk-averse behavior. We derive a novel policy gradient-style robust optimization approach, PG-BROIL, that optimizes a soft-robust objective that balances expected performance and risk. To the best of our knowledge, PG-BROIL is the first policy optimization algorithm robust to a distribution of reward hypotheses which can scale to continuous MDPs. Results suggest that PG-BROIL can produce a family of behaviors ranging from risk-neutral to risk-averse and outperforms state-of-the-art imitation learning algorithms when learning from ambiguous demonstrations by hedging against uncertainty, rather than seeking to uniquely identify the demonstrator’s reward function.
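To make the soft-robust objective concrete, below is a minimal NumPy sketch of how a policy could be scored against a finite posterior over reward hypotheses: a convex blend of expected return and conditional value at risk (CVaR). The function name, the `lam` and `alpha` parameters, and the convention that `alpha` is the worst-case tail fraction are illustrative choices, not the paper's implementation.

```python
import numpy as np

def soft_robust_value(returns, probs, lam=0.5, alpha=0.05):
    """Soft-robust score of a policy under a posterior over reward hypotheses.

    returns: expected return of the policy under each reward hypothesis
    probs:   posterior probability of each hypothesis
    lam:     blend between expected performance (lam=1) and pure risk-aversion (lam=0)
    alpha:   worst-case tail fraction used for CVaR (here: average return over
             the worst alpha posterior mass)
    """
    expected = float(np.dot(probs, returns))

    # CVaR: probability-weighted average return over the worst alpha mass.
    order = np.argsort(returns)              # worst hypotheses first
    cvar, mass = 0.0, 0.0
    for i in order:
        take = min(probs[i], alpha - mass)
        cvar += take * returns[i]
        mass += take
        if mass >= alpha:
            break
    cvar /= alpha

    return lam * expected + (1.0 - lam) * cvar

# Example: three reward hypotheses with posterior mass [0.5, 0.3, 0.2].
# lam=1.0 recovers the risk-neutral expected return; lam=0.0 is fully risk-averse.
print(soft_robust_value(np.array([10.0, 2.0, -5.0]),
                        np.array([0.5, 0.3, 0.2]), lam=0.3, alpha=0.2))
```

Sweeping `lam` from 1 to 0 is one way to realize the family of behaviors, from risk-neutral to risk-averse, that the abstract describes.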
Leveraging Human Input to Enable Robust, Interactive, and Aligned AI Systems
Ensuring that AI systems do what we, as humans, actually want them to do is one of the biggest open research challenges in AI alignment and safety. My research seeks to directly address this challenge by enabling AI systems to interact with humans to learn aligned and robust behaviors. The way in which robots and other AI systems behave is often the result of optimizing a reward function. However, manually designing good reward functions is highly challenging and error-prone, even for domain experts. Consider trying to write down a reward function that describes good driving behavior or how you like your bed made in the morning. While reward functions for these tasks are difficult to specify manually, human feedback in the form of demonstrations or preferences is often much easier to obtain. However, human data is often difficult to interpret due to ambiguity and noise. Thus, it is critical that AI systems take into account epistemic uncertainty over the human's true intent. My talk will give an overview of my lab's progress along the following fundamental research areas: (1) efficiently maintaining uncertainty over human intent, (2) directly optimizing behavior to be robust to uncertainty over human intent, and (3) actively querying for additional human input to reduce uncertainty over human intent.
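As a concrete illustration of research area (1), one common way to maintain uncertainty over human intent is to keep a Bayesian posterior over a finite set of candidate reward functions and update it as preference feedback arrives. The sketch below assumes a Bradley-Terry preference likelihood; the function name, the `beta` rationality parameter, and the finite-hypothesis setup are illustrative assumptions, not a specific system from the talk.

```python
import numpy as np

def preference_posterior(returns, prefs, prior=None, beta=1.0):
    """Posterior over a finite set of reward hypotheses given pairwise preferences.

    returns: array of shape (H, T); returns[h, t] is the return of trajectory t
             under reward hypothesis h.
    prefs:   list of (i, j) pairs meaning trajectory i was preferred over j.
    beta:    rationality coefficient of the Bradley-Terry choice model (assumed).
    """
    H = returns.shape[0]
    log_post = np.log(np.full(H, 1.0 / H)) if prior is None else np.log(prior)
    for i, j in prefs:
        diff = beta * (returns[:, i] - returns[:, j])
        log_post += -np.logaddexp(0.0, -diff)   # log P(i > j | h) = log sigmoid(diff)
    post = np.exp(log_post - log_post.max())    # normalize, numerically stable
    return post / post.sum()

# Example: hypothesis 0 explains a preference for trajectory 0 better than
# hypothesis 1 does, so its posterior mass grows after the update.
post = preference_posterior(np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]),
                            prefs=[(0, 1)])
print(post)
```

A posterior of this form is exactly the kind of object that areas (2) and (3) consume: robust optimization hedges against its spread, and active querying picks the preference question that shrinks it fastest.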
- Award ID(s): 2416761
- PAR ID: 10608128
- Publisher / Repository: Association for the Advancement of Artificial Intelligence
- Date Published:
- Journal Name: Proceedings of the AAAI Conference on Artificial Intelligence
- Volume: 39
- Issue: 27
- ISSN: 2159-5399
- Page Range / eLocation ID: 28704 to 28704
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Reinforcement Learning from Human Feedback (RLHF) has emerged as a popular paradigm for capturing human intent, alleviating the challenges of hand-crafting reward values. Despite the increasing interest in RLHF, most works learn black-box reward functions that, while expressive, are difficult to interpret and often require running the entire costly RL process before we can even determine whether these frameworks are actually aligned with human preferences. We propose and evaluate a novel approach for learning expressive and interpretable reward functions from preferences using Differentiable Decision Trees (DDTs). Our experiments across several domains, including CartPole, Visual Gridworld environments, and Atari games, provide evidence that the tree structure of our learned reward function is useful in determining the extent to which the reward function is aligned with human preferences. Our experiments also show that reward DDTs can often achieve RL performance competitive with larger-capacity deep neural network reward functions, and they demonstrate the diagnostic utility of our framework in checking the alignment of learned reward functions. We also observe that the choice between soft and hard (argmax) output of a reward DDT reveals a tension between wanting highly shaped rewards to ensure good RL performance and wanting simpler, more interpretable rewards. Videos and code are available at: https://sites.google.com/view/ddt-rlhf
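To give a flavor of what a reward DDT might look like, here is a minimal PyTorch sketch of a soft decision tree mapping state features to a scalar reward: each internal node is a sigmoid split, each leaf holds a learnable reward value, and a `hard` flag switches to argmax routing. The class name, depth, and breadth-first node layout are illustrative choices, not the authors' code.

```python
import torch
import torch.nn as nn

class RewardDDT(nn.Module):
    """Soft decision tree reward: sigmoid splits route each input to leaves,
    and the output is the routing-probability-weighted sum of leaf rewards."""

    def __init__(self, in_dim, depth=2):
        super().__init__()
        self.depth = depth
        self.gates = nn.Linear(in_dim, 2 ** depth - 1)       # one split per internal node
        self.leaves = nn.Parameter(torch.randn(2 ** depth))  # scalar reward per leaf

    def forward(self, x, hard=False):
        p_right = torch.sigmoid(self.gates(x))               # (B, 2**depth - 1)
        if hard:
            p_right = (p_right > 0.5).float()                # interpretable argmax routing
        # Push routing mass down the tree one level at a time (breadth-first order).
        probs = torch.ones(x.shape[0], 1, device=x.device)
        idx = 0
        for level in range(self.depth):
            n = 2 ** level
            pr = p_right[:, idx:idx + n]
            probs = torch.stack([probs * (1 - pr), probs * pr], -1).reshape(-1, 2 * n)
            idx += n
        return probs @ self.leaves                           # (B,) reward per input

reward_fn = RewardDDT(in_dim=4, depth=2)
print(reward_fn(torch.randn(3, 4)))             # soft (shaped) rewards
print(reward_fn(torch.randn(3, 4), hard=True))  # hard (interpretable) rewards
```

The `hard` flag mirrors the soft-versus-argmax tension the abstract describes: soft routing gives smoother, more shaped rewards for RL, while hard routing yields a plain decision tree that is easier to inspect.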
Keathley, H.; Enos, J.; Parrish, M. (Eds.)
The role of human-machine teams in society is increasing as big data and computing power explode. One popular approach to AI is deep learning, which is useful for classification, feature identification, and predictive modeling. However, deep learning models often suffer from inadequate transparency and poor explainability. One aspect of human systems integration is the design of interfaces that support human decision-making. AI models have multiple types of uncertainty embedded in them, which may be difficult for users to understand. Humans who use these tools need to understand how much they should trust the AI. This study evaluates one simple approach for communicating uncertainty: a visual confidence bar ranging from 0 to 100%. We perform a human-subject online experiment using an existing image-recognition deep learning model to test the effect of (1) providing single vs. multiple recommendations from the AI and (2) including uncertainty information. For each image, participants described the subject in an open text box and rated their confidence in their answers. Performance was evaluated at four levels of accuracy, ranging from the same as the image label to the correct category of the image. The results suggest that AI recommendations increase accuracy, even if the human and AI have different definitions of accuracy. In addition, providing multiple ranked recommendations, with or without the confidence bar, increases operator confidence and reduces perceived task difficulty. More research is needed to determine how people approach uncertain information from an AI system and to develop effective visualizations for communicating uncertainty.
Recent advances in AI models have increased the integration of AI-based decision aids into the human decision-making process. To fully unlock the potential of AI-assisted decision making, researchers have computationally modeled how humans incorporate AI recommendations into their final decisions and have used these models to improve human-AI team performance. Meanwhile, due to the “black-box” nature of AI models, providing AI explanations to human decision makers to help them rely on AI recommendations more appropriately has become a common practice. In this paper, we explore whether we can quantitatively model how humans integrate both AI recommendations and explanations into their decision process, and whether this quantitative understanding of human behavior can be used to manipulate AI explanations, thereby nudging individuals toward targeted decisions. Our extensive human experiments across various tasks demonstrate that human behavior can easily be influenced by these manipulated explanations toward targeted outcomes, regardless of whether the intent is adversarial or benign. Furthermore, individuals often fail to detect any anomalies in these explanations, despite their decisions being affected by them.
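To illustrate the kind of quantitative behavior model this line of work fits, here is a toy logistic sketch in which the probability that a person adopts the AI's recommendation depends on their own confidence, the AI's stated confidence, and a scalar persuasiveness score for the explanation. The function, weights, and features are all hypothetical, chosen only to show why a fitted model of this form could be searched over to nudge decisions.

```python
import numpy as np

def p_follow_ai(own_conf, ai_conf, expl_score, w=(1.5, 2.0, 1.0), b=-1.0):
    """Toy model: probability the human adopts the AI recommendation.
    own_conf and ai_conf lie in [0, 1]; expl_score is a hypothetical
    persuasiveness feature of the explanation. Weights are illustrative,
    not fitted values from the paper."""
    z = b - w[0] * own_conf + w[1] * ai_conf + w[2] * expl_score
    return 1.0 / (1.0 + np.exp(-z))

# Once such a model is fitted, an adversary (or a benign designer) could pick
# the explanation that maximizes adoption, which is the risk the paper studies.
candidates = np.linspace(0.0, 1.0, 5)   # hypothetical explanation scores
best = max(candidates, key=lambda s: p_follow_ai(0.6, 0.8, s))
print(best, p_follow_ai(0.6, 0.8, best))
```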