Large language models (LLMs) struggle in social science domains, where critical thinking and human-level inference are crucial. In this work, we propose a multi-agent social reasoning framework that leverages the generative and reasoning capabilities of LLMs to generate and evaluate reasons from multiple perspectives grounded in social science theories, and construct a factor graph for inference. Experimental results on understanding power dynamics in conversations show that our method outperforms standard prompting baselines, demonstrating its potential for tackling hard Computational Social Science (CSS) tasks.
This content will become publicly available on January 1, 2027
CLASH: Evaluating Language Models on Judging High-Stakes Dilemmas from Multiple Perspectives
Navigating dilemmas involving conflicting values is challenging even for humans in high-stakes domains, let alone for AI, yet prior work has been limited to everyday scenarios. To close this gap, we introduce CLASH (Character perspective-based LLM Assessments in Situations with High-stakes), a meticulously curated dataset consisting of 345 high-impact dilemmas along with 3,795 individual perspectives of diverse values. CLASH enables the study of critical yet underexplored aspects of value-based decision-making processes, including understanding of decision ambivalence and psychological discomfort as well as capturing the temporal shifts of values in the perspectives of characters. By benchmarking 14 non-thinking and thinking models, we uncover several key findings. (1) Even strong proprietary models, such as GPT-5 and Claude-4-Sonnet, struggle with ambivalent decisions, achieving only 24.06 and 51.01 accuracy. (2) Although LLMs reasonably predict psychological discomfort, they do not adequately comprehend perspectives involving value shifts. (3) Cognitive behaviors that are effective in the math-solving and game strategy domains do not transfer to value reasoning. Instead, new failure patterns emerge, including early commitment and overcommitment. (4) The steerability of LLMs towards a given value is significantly correlated with their value preferences. (5) Finally, LLMs exhibit greater steerability when reasoning from a third-party perspective, although certain values (e.g., safety) benefit uniquely from first-person framing.
- Award ID(s): 2127747
- PAR ID: 10657151
- Publisher / Repository: arXiv
- Date Published:
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Process safety is at the heart of operations at many chemical processing companies. However, the Chemical Safety Board (CSB) has still documented over 800 investigations of process safety failures since the year 2000. While not all of these incidents were severe, some did lead to employee injuries or death and environmental harm. As a result, chemical engineering companies are increasingly dedicated to process safety through training programs and detailed vigilance as part of their operations practice. AIChE and OSHA also offer courses in process safety to help support the industry. These efforts illustrate how important it is that chemical engineering graduates have an appreciation and understanding of process safety as they transition from their degree program into industrial positions. Previous studies have shown that, despite difficulties due to course load constraints, process safety has been incorporated into chemical engineering curricula through the addition of new courses, incorporation of the content within existing classes, or a combination of the two methods. A review published in Process Safety Progress suggested that a key step for departments moving forward is to assess the process safety culture within their institution in order to determine how faculty and students view process safety. An issue with completing this task is the lack of assessment tools that can be used to determine how students are developing their understanding of process safety decision making. This observation led to the development of the Engineering Process Safety Research Instrument (EPSRI). This instrument is modeled after the Defining Issues Test version 2 (DIT2) and the Engineering Ethical Reasoning Instrument (EERI). Similar to these instruments, the EPSRI provides dilemmas, three decisions, and 12 additional considerations that individuals must rate based on their relative importance to their decision-making process.
The dilemmas developed in the EPSRI are based on case studies and investigations of process safety failures that have occurred in industry, to provide a realistic context for the decisions that engineers may face upon employment. The considerations provided after the scenario are designed to reflect pre-conventional, conventional, and post-conventional thinking as described by Kohlberg’s Moral Development Theory. Pre-conventional thinking focuses particularly on what is right/wrong or good/bad at the individual level, whereas post-conventional thinking seeks to determine what is correct from moral and value perspectives at the societal level. This WIP paper describes the content validity study conducted while developing the EPSRI. Dilemmas were examined by content experts, including professionals in the process industry, chemical engineering departments, and the learning sciences field. The experts reviewed the dilemmas and determined whether they represented accurate examples of process safety decision making that individuals may face in real-world engineering settings. The experts also reviewed the 12 considerations for each dilemma for their accuracy in capturing pre-conventional, conventional, and post-conventional thinking. This work represents the first step in the overall instrument validation that will take place over the next academic year.
