This content will become publicly available on July 13, 2026

Title: SPRI: Aligning Large Language Models with Context-Situated Principles
Aligning Large Language Models to integrate and reflect human values, especially for tasks that demand intricate human oversight, is arduous because relying on human expertise for context-specific guidance is resource-intensive and time-consuming. Prior work has used predefined sets of rules or principles to steer the behavior of models (Bai et al., 2022; Sun et al., 2023). However, these principles tend to be generic, making them difficult to adapt to each individual input query or context. In this work, we present Situated-PRInciples (SPRI), a framework that requires minimal or no human effort and is designed to automatically generate guiding principles in real time for each input query and use them to align each response. We evaluate SPRI on three tasks and show that 1) SPRI can derive principles in a complex domain-specific task that lead to performance on par with expert-crafted ones; 2) SPRI-generated principles lead to instance-specific rubrics that outperform prior LLM-as-a-judge frameworks; and 3) using SPRI to generate synthetic SFT data leads to substantial improvement in truthfulness.
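To make the two-stage flow concrete, here is a minimal sketch of how a SPRI-style pipeline could be wired up: first elicit principles tailored to the query, then generate a response conditioned on them and refine it against them. The `call_llm` helper and all prompt wording are hypothetical stand-ins for whatever model client and templates one uses; this is an illustration of the idea, not the authors' implementation.

```python
# Minimal sketch of a SPRI-style pipeline (illustrative, not the paper's code).
# `call_llm` is a hypothetical stand-in for any chat-completion client.

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., an OpenAI- or HF-style client)."""
    raise NotImplementedError

def generate_principles(query: str) -> str:
    # Stage 1: derive context-situated principles for this specific query.
    return call_llm(
        "Write a short list of guiding principles a careful assistant should "
        f"follow when answering the following request:\n\n{query}"
    )

def aligned_response(query: str) -> str:
    # Stage 2: condition the draft on the instance-specific principles,
    # then critique and refine the draft against those same principles.
    principles = generate_principles(query)
    draft = call_llm(f"Principles:\n{principles}\n\nRequest:\n{query}\n\nAnswer:")
    critique = call_llm(
        f"Principles:\n{principles}\n\nAnswer:\n{draft}\n\n"
        "Point out any way the answer violates the principles."
    )
    return call_llm(
        f"Principles:\n{principles}\n\nAnswer:\n{draft}\n\nCritique:\n{critique}\n\n"
        "Rewrite the answer so it fully satisfies the principles."
    )
```

The same per-query principles can also be read as an instance-specific rubric when scoring responses, which is one way to picture the LLM-as-a-judge setting evaluated above.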
Award ID(s):
2107524
PAR ID:
10635386
Author(s) / Creator(s):
Publisher / Repository:
Proceedings of the Forty-Second International Conference on Machine Learning (ICML 2025)
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The visualization community has seen a rise in the adoption of user studies. Empirical user studies systematically test the assumptions that we make about how visualizations can help or hinder viewers’ performance of tasks. Although the increase in user studies is encouraging, it is vital that research on human reasoning with visualizations be grounded in an understanding of how the mind functions. Previously, there were no sufficient models that illustrate the process of decision-making with visualizations. However, Padilla et al. [41] recently proposed an integrative model for decision-making with visualizations, which expands on modern theories of visualization cognition and decision-making. In this paper, we provide insights into how cognitive models can accelerate innovation, improve validity, and facilitate replication efforts, which have yet to be thoroughly discussed in the visualization community. To do this, we offer a compact overview of the cognitive science of decision-making with visualizations for the visualization community, using the Padilla et al. [41] cognitive model as a guiding framework. By detailing examples of visualization research that illustrate each component of the model, this paper offers novel insights into how visualization researchers can utilize a cognitive framework to guide their user studies. We provide practical examples of each component of the model from empirical studies of visualizations, along with visualization implications of each cognitive process, which have not been directly addressed in prior work. Finally, this work offers a case study in utilizing an understanding of human cognition to generate a novel solution to a visualization reasoning bias in the context of hurricane forecast track visualizations. 
  2. This work presents the ideation and preliminary results of using contextual information, together with information about the objects present in the scene, to query the social navigation rules applicable to the sensed context. Prior work in Socially-Aware Navigation (SAN) shows its importance in human-robot interaction, as it improves the quality of the interaction and the safety and comfort of the interacting partner. In this work, we are interested in the automatic detection of social rules in SAN, and we present the three major components of our method: a Convolutional Neural Network-based context classifier that can autonomously perceive contextual information from camera input; a YOLO-based object detector that localizes objects within a scene; and a knowledge base relating social rules to these concepts, which can be queried using both the contextual information and the objects detected in the scene. Our preliminary results suggest that our approach can observe an ongoing interaction, given an image input, and use that information to query the social navigation rules required in that particular context.
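As a rough illustration of the pipeline just described, the sketch below chains a scene-context classifier and an object detector into a lookup over a small rule table. The rule entries, class names, and the two stubbed models are made-up placeholders, not the trained components from this work.

```python
# Illustrative sketch: context classifier + object detector feed a query over
# a knowledge base of social navigation rules. All rules and labels are
# invented examples; the classifier/detector calls are stubs.

from typing import List

# Hypothetical knowledge base: (context, object) -> applicable rules.
SOCIAL_RULES = {
    ("hallway", "person"): ["keep to the right", "maintain personal space"],
    ("queue", "person"): ["join at the end of the line", "do not cut through"],
    ("office", "desk"): ["avoid passing between a person and their workspace"],
}

def classify_context(image) -> str:
    """Stub for the CNN-based context classifier (returns a context label)."""
    raise NotImplementedError

def detect_objects(image) -> List[str]:
    """Stub for the YOLO-based detector (returns object class names)."""
    raise NotImplementedError

def applicable_rules(image) -> List[str]:
    # Query the rule base with the sensed context and every detected object.
    context = classify_context(image)
    rules: List[str] = []
    for obj in detect_objects(image):
        rules.extend(SOCIAL_RULES.get((context, obj), []))
    return rules
```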
  3. Low-dimensional node embeddings play a key role in analyzing graph datasets. However, little work studies exactly what information is encoded by popular embedding methods, and how this information correlates with performance in downstream machine learning tasks. We tackle this question by studying whether embeddings can be inverted to (approximately) recover the graph used to generate them. Focusing on a variant of the popular DeepWalk method (Perozzi et al., 2014; Qiu et al., 2018), we present algorithms for accurate embedding inversion - i.e., from the low-dimensional embedding of a graph G, we can find a graph H with a very similar embedding. We perform numerous experiments on real-world networks, observing that significant information about G, such as specific edges and bulk properties like triangle density, is often lost in H. However, community structure is often preserved or even enhanced. Our findings are a step towards a more rigorous understanding of exactly what information embeddings encode about the input graph, and why this information is useful for learning tasks.
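A toy version of the inversion idea can be sketched as a search over graphs whose embedding matches a target embedding. Here the embedding is a simple truncated SVD of the adjacency matrix (a stand-in for the DeepWalk variant studied in the paper), and the greedy edge-toggling search is an illustrative assumption rather than the authors' algorithm.

```python
# Toy sketch of embedding inversion: search for an adjacency matrix whose
# embedding is close to a given target embedding. Not the paper's method.

import numpy as np

def embed(adj: np.ndarray, dim: int = 4) -> np.ndarray:
    # Rank-`dim` SVD factorization of the adjacency matrix as a toy embedding.
    u, s, _ = np.linalg.svd(adj)
    return u[:, :dim] * s[:dim]

def emb_distance(e1: np.ndarray, e2: np.ndarray) -> float:
    # Compare Gram matrices so the distance ignores rotations/sign flips
    # of the embedding coordinates.
    return float(np.linalg.norm(e1 @ e1.T - e2 @ e2.T))

def invert_embedding(target_emb: np.ndarray, steps: int = 2000, seed: int = 0) -> np.ndarray:
    n = target_emb.shape[0]
    rng = np.random.default_rng(seed)
    adj = np.zeros((n, n))
    best = emb_distance(embed(adj), target_emb)
    for _ in range(steps):
        i, j = rng.integers(0, n, size=2)
        if i == j:
            continue
        adj[i, j] = adj[j, i] = 1.0 - adj[i, j]      # toggle a candidate edge
        err = emb_distance(embed(adj), target_emb)
        if err < best:
            best = err                               # keep the toggle
        else:
            adj[i, j] = adj[j, i] = 1.0 - adj[i, j]  # revert
    return adj
```

Comparing the recovered adjacency matrix with the original graph's edges and triangle counts is one simple way to probe which properties survive the round trip, in the spirit of the experiments described above.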
  4. Automatic discourse processing is bottlenecked by data: current discourse formalisms pose highly demanding annotation tasks involving large taxonomies of discourse relations, making them inaccessible to lay annotators. This work instead adopts the linguistic framework of Questions Under Discussion (QUD) for discourse analysis and seeks to derive QUD structures automatically. QUD views each sentence as an answer to a question triggered in prior context; thus, we characterize relationships between sentences as free-form questions, in contrast to exhaustive fine-grained taxonomies. We develop the first-of-its-kind QUD parser that derives a dependency structure of questions over full documents, trained using a large, crowdsourced question-answering dataset DCQA (Ko et al., 2022). Human evaluation results show that QUD dependency parsing is possible for language models trained with this crowdsourced, generalizable annotation scheme. We illustrate how our QUD structure is distinct from RST trees, and demonstrate the utility of QUD analysis in the context of document simplification. Our findings show that QUD parsing is an appealing alternative for automatic discourse processing. 
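The QUD structure described here can be pictured as a set of edges, each pairing a sentence with an earlier anchor sentence via a free-form question. The sketch below shows that data structure together with a stubbed LLM call for proposing the questions; it simplifies by anchoring each question to the immediately preceding sentence, whereas the actual parser also predicts the anchor.

```python
# Minimal sketch of a QUD-style dependency structure (illustrative only).

from dataclasses import dataclass
from typing import List

@dataclass
class QUDEdge:
    answer_idx: int   # index of the sentence treated as the answer
    anchor_idx: int   # index of the earlier sentence that triggers the question
    question: str     # free-form question linking the two

def ask_llm(prompt: str) -> str:
    """Placeholder for any LLM/chat-completion client."""
    raise NotImplementedError

def parse_quds(sentences: List[str]) -> List[QUDEdge]:
    edges: List[QUDEdge] = []
    for i in range(1, len(sentences)):
        context = " ".join(sentences[:i])
        question = ask_llm(
            f"Context: {context}\n"
            f"Next sentence: {sentences[i]}\n"
            "What question, raised by the context, does the next sentence answer?"
        )
        # Simplification: anchor to the immediately preceding sentence; the
        # trained parser additionally predicts which earlier sentence anchors it.
        edges.append(QUDEdge(answer_idx=i, anchor_idx=i - 1, question=question))
    return edges
```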
  5. Large Language Models (LLMs) have demonstrated remarkable capabilities in performing complex tasks. Moreover, recent research has shown that incorporating human-annotated rationales (e.g., Chain-of-Thought prompting) during in-context learning can significantly enhance the performance of these models, particularly on tasks that require reasoning capabilities. However, incorporating such rationales poses challenges in terms of scalability, as this requires a high degree of human involvement. In this work, we present a novel framework, Amplifying Model Performance by Leveraging In-Context Learning with Post Hoc Explanations (AMPLIFY), which addresses the aforementioned challenges by automating the process of rationale generation. To this end, we leverage post hoc explanation methods, which output attribution scores (explanations) capturing the influence of each of the input features on model predictions. More specifically, we construct automated natural language rationales that embed insights from post hoc explanations to provide corrective signals to LLMs. Extensive experimentation with real-world datasets demonstrates that our framework, AMPLIFY, leads to prediction accuracy improvements of about 10-25% over a wide range of tasks, including those where prior approaches that rely on human-annotated rationales, such as Chain-of-Thought prompting, fall short. Our work makes one of the first attempts at highlighting the potential of post hoc explanations as valuable tools for enhancing the effectiveness of LLMs. Furthermore, we conduct additional empirical analyses and ablation studies to demonstrate the impact of each of the components of AMPLIFY, which, in turn, lead to critical insights for refining in-context learning.
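A simplified rendering of this style of recipe might look like the sketch below: compute post hoc attribution scores for a demonstration, turn the top-scoring features into a short natural-language rationale, and splice those rationales into the few-shot prompt. The attribution function is a stub (any per-token explainer could fill it in) and the rationale template is an assumption for illustration, not AMPLIFY's exact construction.

```python
# Sketch: attribution-derived rationales inserted into an in-context prompt.

from typing import Dict, List

def attribution_scores(text: str, label: str) -> Dict[str, float]:
    """Stub for a post hoc explainer (e.g., gradient- or perturbation-based)."""
    raise NotImplementedError

def rationale_from_attributions(text: str, label: str, top_k: int = 3) -> str:
    # Turn the top-k most influential words into a one-line rationale.
    scores = attribution_scores(text, label)
    top_words = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return (
        f"The key words {', '.join(repr(w) for w in top_words)} "
        f"suggest the label '{label}'."
    )

def build_prompt(demos: List[Dict[str, str]], query: str) -> str:
    # Each demonstration carries input, rationale, and label, mimicking
    # Chain-of-Thought-style exemplars without human-written rationales.
    parts = []
    for d in demos:
        parts.append(
            f"Input: {d['text']}\n"
            f"Rationale: {rationale_from_attributions(d['text'], d['label'])}\n"
            f"Label: {d['label']}\n"
        )
    parts.append(f"Input: {query}\nRationale:")
    return "\n".join(parts)
```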