Learning Fairness from Demonstrations via Inverse Reinforcement Learning

Blandin, Jack; Kash, Ian

Citation Details

Defining fairness in algorithmic contexts is challenging, particularly when adapting to new domains. Our research introduces a novel method for learning and applying group fairness preferences across different classification domains, without the need for manual fine-tuning. Utilizing concepts from inverse reinforcement learning (IRL), our approach enables the extraction and application of fairness preferences from human experts or established algorithms. We propose the first technique for using IRL to recover and adapt group fairness preferences to new domains, offering a low-touch solution for implementing fair classifiers in settings where expert-established fairness tradeoffs are not yet defined. more »

Award ID(s):: 1939743

PAR ID:: 10586335

Author(s) / Creator(s):: Blandin, Jack; Kash, Ian

Publisher / Repository:: The 2024 ACM Conference on Fairness, Accountability, and Transparency

Date Published:: 2024-06-03

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Conference Paper:
The DOI is not currently available.

More Like this