skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


This content will become publicly available on December 11, 2025

Title: Aligning LLM Agents by Learning Latent Preference from User Edits
We study interactive learning of LLM-based language agents based on user edits made to the agent's output. In a typical setting such as writing assistants, the user interacts with a language agent to generate a response given a context, and may optionally edit the agent response to personalize it based on their latent preference, in addition to improving the correctness. The edit feedback is naturally generated, making it a suitable candidate for improving the agent's alignment with the user's preference, and for reducing the cost of user edits over time. We propose a learning framework, PRELUDE that infers a description of the user's latent preference based on historic edit data. The inferred user preference descriptions are used to define prompts for generating responses in the future. This avoids fine-tuning the agent, which is costly, challenging to scale with the number of users, and may even degrade its performance on other tasks. Furthermore, learning descriptive preference improves interpretability, allowing the user to view and modify the learned preference. However, user preference can be complex, subtle, and vary based on context, making it challenging to learn. To address this, we propose a simple yet effective algorithm named CIPHER that leverages the LLM to infer the user preference for a given context based on user edits. In the future, CIPHER retrieves inferred preferences from the k-closest contexts in the history, and forms an aggregate preference for response generation. We introduce two interactive environments -- summarization and email writing, and use a GPT-4 simulated user for evaluation. On both tasks, CIPHER outperforms several baselines by achieving the lowest edit distance cost while only having a small overhead in LLM query cost. Our analysis reports that user preferences learned by CIPHER show significant similarity to the ground truth latent preferences.  more » « less
Award ID(s):
1901030
PAR ID:
10556480
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
NeurIPS
Date Published:
Format(s):
Medium: X
Location:
NeurIPS
Sponsoring Org:
National Science Foundation
More Like this
  1. In autonomous robot navigation, terrain cost assignment is typically performed using a semantics-based paradigm in which terrain is first labeled using a pre-trained semantic classifier and costs are then assigned according to a user-defined mapping between label and cost. While this approach is rapidly adaptable to changing user preferences, only preferences over the types of terrain that are already known by the semantic classifier can be expressed. In this paper, we hypothesize that a machine-learning-based alternative to the semantics-based paradigm above will allow for rapid cost assignment adaptation to preferences expressed over new terrains at deployment time without the need for additional training. To investigate this hypothesis, we introduce and study PACER, a novel approach to costmap generation that accepts as input a single birds-eye view (BEV) image of the surrounding area along with a user-specified preference context and generates a corresponding BEV costmap that aligns with the preference context. Using both real and synthetic data along with a combination of proposed training tasks, we find that PACER is able to adapt quickly to new user preferences while also exhibiting better generalization to novel terrains compared to both semantics-based and representation-learning approaches. 
    more » « less
  2. This paper investigates simultaneous preference and metric learning from a crowd of respondents. A set of items represented by d -dimensional feature vectors and paired comparisons of the form item i is preferable to item j '' made by each user is given. Our model jointly learns a distance metric that characterizes the crowd's general measure of item similarities along with a latent ideal point for each user reflecting their individual preferences. This model has the flexibility to capture individual preferences, while enjoying a metric learning sample cost that is amortized over the crowd. We first study this problem in a noiseless, continuous response setting (i.e., responses equal to differences of item distances) to understand the fundamental limits of learning. Next, we establish prediction error guarantees for noisy, binary measurements such as may be collected from human respondents, and show how the sample complexity improves when the underlying metric is low-rank. Finally, we establish recovery guarantees under assumptions on the response distribution. We demonstrate the performance of our model on both simulated data and on a dataset of color preference judgements across a large number of users. 
    more » « less
  3. Today, face editing is widely used to refine/alter photos in both professional and recreational settings. Yet it is also used to modify (and repost) existing online photos for cyberbullying. Our work considers an important open question: 'How can we support the collaborative use of face editing on social platforms while protecting against unacceptable edits and reposts by others?' This is challenging because, as our user study shows, users vary widely in their definition of what edits are (un)acceptable. Any global filter policy deployed by social platforms is unlikely to address the needs of all users, but hinders social interactions enabled by photo editing. Instead, we argue that face edit protection policies should be implemented by social platforms based on individual user preferences. When posting an original photo online, a user can choose to specify the types of face edits (dis)allowed on the photo. Social platforms use these per-photo edit policies to moderate future photo uploads, i.e., edited photos containing modifications that violate the original photo's policy are either blocked or shelved for user approval. Realizing this personalized protection, however, faces two immediate challenges: (1) how to accurately recognize specific modifications, if any, contained in a photo; and (2) how to associate an edited photo with its original photo (and thus the edit policy). We show that these challenges can be addressed by combining highly efficient hashing based image search and scalable semantic image comparison, and build a prototype protector (Alethia) covering nine edit types. Evaluations using IRB-approved user studies and data-driven experiments (on 839K face photos) show that Alethia accurately recognizes edited photos that violate user policies and induces a feeling of protection to study participants. This demonstrates the initial feasibility of personalized face edit protection. We also discuss current limitations and future directions to push the concept forward. 
    more » « less
  4. Explanations of AI Agents' actions are considered to be an important factor in improving users' trust in the decisions made by autonomous AI systems. However, as these autonomous systems evolve from reactive, i.e., acting on user input, to proactive, i.e., acting without requiring user intervention, there is a need to explore how the explanation for the actions of these agents should evolve. In this work, we explore the design of explanations through participatory design methods for a proactive auto-response messaging agent that can reduce perceived obligations and social pressure to respond quickly to incoming messages by providing unavailability-related context. We recruited 14 participants who worked in pairs during collaborative design sessions where they reasoned about the agent's design and actions. We qualitatively analyzed the data collected through these sessions and found that participants' reasoning about agent actions led them to speculate heavily on its design. These speculations significantly influenced participants' desire for explanations and the controls they sought to inform the agents' behavior. Our findings indicate a need to transform users' speculations into accurate mental models of agent design. Further, since the agent acts as a mediator in human-human communication, it is also necessary to account for social norms in its explanation design. Finally, user expertise in understanding their habits and behaviors allows the agent to learn from the user their preferences when justifying its actions. 
    more » « less
  5. null (Ed.)
    As the popularity of online travel platforms increases, users tend to make ad-hoc decisions on places to visit rather than preparing the detailed tour plans in advance. Under the situation of timeliness and uncertainty of users’ demand, how to integrate real-time context into a dynamic and personalized recommendations have become a key issue in travel recommender system. In this paper, by integrating the users’ historical preferences and real-time context, a location-aware recommender system called TRACE (Travel Reinforcement Recommendations Based on Location-Aware Context Extraction) is proposed. It captures users’ features based on location-aware context learning model, and makes dynamic recommendations based on reinforcement learning. Specifically, this research: (1) designs a travel reinforcing recommender system based on an Actor-Critic framework, which can dynamically track the user preference shifts and optimize the recommender system performance; (2) proposes a location-aware context learning model, which aims at extracting user context from real-time location and then calculating the impacts of nearby attractions on users’ preferences; and (3) conducts both offline and online experiments. Our proposed model achieves the best performance in both of the two experiments, which demonstrates that tracking the users’ preference shifts based on real-time location is valuable for improving the recommendation results. 
    more » « less