This content will become publicly available on January 6, 2026

Title: Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey
Preference tuning is a crucial process for aligning deep generative models with human preferences. This survey offers a thorough overview of recent advancements in preference tuning and the integration of human feedback. The paper is organized into three main sections: 1) introduction and preliminaries: reinforcement learning frameworks, preference tuning tasks, models, and datasets across the language, speech, and vision modalities, as well as different policy approaches; 2) in-depth exploration of each preference tuning approach: a detailed analysis of the methods used in preference tuning; and 3) applications, discussion, and future directions: an exploration of the applications of preference tuning in downstream tasks, including evaluation methods for different modalities, and an outlook on future research directions. Our objective is to present the latest methodologies in preference tuning and model alignment and to deepen researchers' and practitioners' understanding of this field. We hope to encourage further engagement and innovation in this area. Additionally, we maintain an accompanying GitHub repository at https://github.com/hanyang1999/Preference-Tuning-with-Human-Feedback.
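For orientation, two canonical objectives that surveys of preference tuning typically cover are the KL-regularized RLHF objective and the DPO loss. The standard forms from the literature (not excerpted from this paper) are shown below, where r_phi is a learned reward model, pi_ref a frozen reference policy, beta a regularization strength, and (y_w, y_l) a preferred/dispreferred response pair:

\[
\max_{\pi_\theta}\; \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot \mid x)}\big[r_\phi(x, y)\big] \;-\; \beta\, \mathbb{D}_{\mathrm{KL}}\!\big[\pi_\theta(y \mid x)\,\|\,\pi_{\mathrm{ref}}(y \mid x)\big]
\]

\[
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) \;=\; -\,\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} \;-\; \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
\]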
Award ID(s):
2206038
PAR ID:
10612825
Author(s) / Creator(s):
; ; ; ; ; ;
Publisher / Repository:
AI Access Foundation
Date Published:
Journal Name:
Journal of Artificial Intelligence Research
Volume:
82
ISSN:
1076-9757
Page Range / eLocation ID:
2595 to 2661
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Diffusion-based Text-to-Image (T2I) models have achieved impressive success in generating high-quality images from textual prompts. While large language models (LLMs) effectively leverage Direct Preference Optimization (DPO) for fine-tuning on human preference data without the need for reward models, diffusion models have not been extensively explored in this area. Current preference learning methods applied to T2I diffusion models directly adapt existing techniques from LLMs. However, this direct adaptation introduces a loss that must be estimated for T2I diffusion models, and our empirical results show that this estimation can lead to suboptimal performance. In this work, we propose Direct Score Preference Optimization (DSPO), a novel algorithm that aligns the pretraining and fine-tuning objectives of diffusion models by leveraging score matching, the same objective used during pretraining, and thereby introduces a new perspective on preference learning for diffusion models. Specifically, DSPO distills the score function of human-preferred image distributions into pretrained diffusion models, fine-tuning the model to generate outputs that align with human preferences. We theoretically show that DSPO shares the same optimization direction as reinforcement learning algorithms for diffusion models under certain conditions. Our experimental results demonstrate that DSPO outperforms preference learning baselines for T2I diffusion models in human preference evaluation tasks and enhances both the visual appeal and prompt alignment of generated images.
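For reference, here is a minimal PyTorch sketch of the standard DPO pairwise loss that such diffusion-model adaptations build on; DSPO itself replaces this with a score-matching objective whose exact form is not given in the abstract, and the function and argument names below are illustrative, not the authors' code.

```python
import torch
import torch.nn.functional as F

def dpo_pairwise_loss(logp_w: torch.Tensor,
                      logp_l: torch.Tensor,
                      ref_logp_w: torch.Tensor,
                      ref_logp_l: torch.Tensor,
                      beta: float = 0.1) -> torch.Tensor:
    """Standard DPO pairwise loss on precomputed sequence log-likelihoods.

    logp_w / logp_l: policy log-likelihoods of the preferred / dispreferred
    sample; ref_logp_w / ref_logp_l: the same under a frozen reference model.
    """
    # Implicit reward margin between the preferred and dispreferred samples.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # Maximize the probability that the preferred sample is ranked higher.
    return -F.logsigmoid(margin).mean()
```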
  2. Exploration and reward specification are fundamental and intertwined challenges for reinforcement learning. Solving sequential decision-making tasks that demand expansive exploration requires either carefully designed reward functions or novelty-seeking exploration bonuses. Human supervisors can provide effective in-the-loop guidance to direct the exploration process, but prior methods for leveraging this guidance require constant, synchronous, high-quality human feedback, which is expensive and impractical to obtain. In this work, we present a technique called Human Guided Exploration (HuGE), which uses low-quality feedback from non-expert users that may be sporadic, asynchronous, and noisy. HuGE guides exploration for reinforcement learning not only in simulation but also in the real world, all without meticulous reward specification. The key idea is to decouple human feedback from policy learning: human feedback steers exploration, while self-supervised learning on the exploration data yields unbiased policies. This procedure can leverage noisy, asynchronous human feedback to learn policies with no hand-crafted reward design or exploration bonuses. HuGE learns a variety of challenging multi-stage robotic navigation and manipulation tasks in simulation using crowdsourced feedback from non-expert users. Moreover, this paradigm scales to learning directly on real-world robots, using occasional, asynchronous feedback from human supervisors.
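A schematic sketch of the decoupling described above, reconstructed from the abstract only; all function names and signatures are hypothetical placeholders, not the authors' API.

```python
from typing import Callable, Iterable, List, Tuple

def huge_style_round(
    sample_goal: Callable[[], object],                              # goal chosen via the human-trained selector
    rollout: Callable[[object], List[object]],                      # roll the current policy toward that goal
    pending_comparisons: Iterable[Tuple[object, object, int]],      # asynchronous, possibly noisy pairwise labels
    update_goal_selector: Callable[[Iterable[Tuple[object, object, int]]], None],
    self_supervised_policy_update: Callable[[List[object]], None],  # e.g. hindsight-relabeled imitation
) -> None:
    """One round of the decoupled loop: human feedback steers exploration only."""
    goal = sample_goal()                            # 1) exploration guided by human preferences
    trajectory = rollout(goal)                      #    collect new experience toward that goal
    update_goal_selector(pending_comparisons)       # 2) feedback updates the goal selector, never the policy
    self_supervised_policy_update(trajectory)       # 3) the policy learns from the exploration data alone
```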
  3. Large Vision-Language Models (LVLMs) have made substantial progress by integrating pre-trained large language models (LLMs) and vision models through instruction tuning. Despite these advancements, LVLMs often exhibit the hallucination phenomenon, where generated text responses appear linguistically plausible but contradict the input image, indicating a misalignment between image and text pairs. This misalignment arises because the model tends to prioritize textual information over visual input, even when both the language model and the visual representations are of high quality. Existing methods leverage additional models or human annotations to curate preference data and enhance modality alignment through preference optimization. These approaches are resource-intensive and may not effectively reflect the target LVLM's preferences, making the curated preference data easily distinguishable. Our work addresses these challenges with the Calibrated Self-Rewarding (CSR) approach, which enables the model to self-improve by iteratively generating candidate responses, evaluating the reward for each response, and curating preference data for fine-tuning. In the reward modeling, we employ a step-wise strategy and incorporate visual constraints into the self-rewarding process to place greater emphasis on visual input. Empirical results demonstrate that CSR significantly enhances performance and reduces hallucinations across twelve benchmarks and tasks, improving over existing methods by 7.62%. These empirical results are further supported by a rigorous theoretical analysis that, under mild assumptions, verifies the effectiveness of introducing visual constraints into the self-rewarding paradigm. Additionally, CSR is compatible with different vision-language models and can incrementally improve performance through iterative fine-tuning.
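A schematic sketch of one calibrated self-rewarding round as described above; the names and signatures are hypothetical, and the step-wise reward with its visual constraint is abstracted into a single scoring callable.

```python
from typing import Callable, List, Tuple

def csr_style_iteration(
    generate_candidates: Callable[[object, str], List[str]],        # (image, prompt) -> candidate responses
    self_reward: Callable[[object, str, str], float],               # step-wise, visually constrained score
    preference_finetune: Callable[[List[Tuple[str, str]]], None],   # DPO-style update on (chosen, rejected) pairs
    batch: List[Tuple[object, str]],                                # (image, prompt) pairs
) -> None:
    """One self-rewarding round: generate, score, curate, fine-tune."""
    pairs: List[Tuple[str, str]] = []
    for image, prompt in batch:
        candidates = generate_candidates(image, prompt)
        ranked = sorted(candidates, key=lambda r: self_reward(image, prompt, r))
        pairs.append((ranked[-1], ranked[0]))       # highest- vs lowest-reward response
    preference_finetune(pairs)                      # curated preferences drive the next fine-tuning step
```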
  4. Conceptual design is the foundational stage of a design process, translating ill-defined design problems into low-fidelity design concepts and prototypes. While deep learning approaches are widely applied in later design stages for design automation, we see fewer attempts in conceptual design, for three reasons: 1) the data in this stage exhibit multiple modalities (natural language, sketches, and 3D shapes) that are challenging to represent in deep learning methods; 2) it requires knowledge from a larger source of inspiration rather than a focus on a single design task; and 3) it requires translating designers' intent and feedback, and hence needs more interaction with designers and/or users. With recent advances in deep learning of cross-modal tasks (DLCMT) and the availability of large cross-modal datasets, we see opportunities to apply these learning methods to the conceptual design of product shapes. In this paper, we review 30 recent journal articles and conference papers across the computer graphics, computer vision, and engineering design fields that involve DLCMT of three modalities: natural language, sketches, and 3D shapes. Based on the review, we identify the challenges and opportunities of utilizing DLCMT in 3D shape concept generation, from which we propose a list of research questions pointing to future research directions.
  5. This research paper delves into the evolving landscape of fine-tuning large language models (LLMs) to align with human users, extending beyond basic alignment to propose "personality alignment" for language models in organizational settings. Acknowledging that training methods shape otherwise undefined personality traits in AI models, the study draws parallels with the personality tests used to fit humans to organizational roles. Through an original case study, we demonstrate the necessity of personality fine-tuning for AIs and raise intriguing questions about applying human-designed tests to AIs, engineering specialized AI personality tests, and shaping AI personalities to suit organizational roles. The paper serves as a starting point for discussions and developments in the burgeoning field of AI personality alignment, offering a foundational anchor for future exploration in human-machine teaming and co-existence.