Language understanding involves processing text with both the grammatical and common-sense contexts of the text fragments. The text “I went to the grocery store and brought home a car” requires both the grammatical context (syntactic) and common-sense context (semantic) to capture the oddity in the sentence. Contextualized text representations learned by Language Models (LMs) are expected to capture a variety of syntactic and semantic contexts from large amounts of training data corpora. Recent work such as ERNIE has shown that infusing the knowledge contexts, where they are available in LMs, results in significant performance gains on General Language Understanding Evaluation (GLUE) benchmark tasks. However, to our knowledge, no knowledge-aware model has attempted to infuse knowledge through top-down semantics-driven syntactic processing (e.g., common-sense to grammatical) and directly operated on the attention mechanism that LMs leverage to learn the data context. We propose a learning framework, Top-Down Language Representation (TDLR), to infuse common-sense semantics into LMs. In our implementation, we build on BERT for its rich syntactic knowledge and use the knowledge graphs ConceptNet and WordNet to infuse semantic knowledge.
Views Are My Own, but Also Yours: Benchmarking Theory of Mind Using Common Ground
Evaluating the theory of mind (ToM) capabilities of language models (LMs) has recently received a great deal of attention. However, many existing benchmarks rely on synthetic data, which risks misaligning the resulting experiments with human behavior. We introduce the first ToM dataset based on naturally occurring spoken dialogs, Common-ToM, and show that LMs struggle to demonstrate ToM. We then show that integrating a simple, explicit representation of beliefs improves LM performance on Common-ToM.
- Award ID(s): 2125295
- PAR ID: 10537323
- Publisher / Repository: Association for Computational Linguistics
- Date Published:
- Page Range / eLocation ID: 14815 to 14823
- Format(s): Medium: X
- Location: Bangkok, Thailand and virtual meeting
- Sponsoring Org: National Science Foundation
More Like this
- When reading narratives, human readers rely on their Theory of Mind (ToM) to infer not only what the characters know from their utterances, but also whether characters are likely to share common ground. As in human conversation, such decisions are not infallible but probabilistic, based on the evidence available in the narrative. By responding on a scale (rather than Yes/No), humans can indicate commitment to their inferences about what characters know (ToM). We use two prompting approaches to explore (i) how well LLM judgments align with human judgments, and (ii) how well LLMs infer the author’s intent from utterances intended to project knowledge in narratives.
- Large Language Models (LLMs) have generated considerable interest and debate regarding the potential emergence of Theory of Mind (ToM). Several recent inquiries reveal a lack of robust ToM in these models and pose a pressing demand to develop new benchmarks, as current ones primarily focus on different aspects of ToM and are prone to shortcuts and data leakage. In this position paper, we seek to answer two road-blocking questions: (1) How can we taxonomize a holistic landscape of machine ToM? (2) What is a more effective evaluation protocol for machine ToM? Following psychological studies, we taxonomize machine ToM into 7 mental state categories and delineate existing benchmarks to identify under-explored aspects of ToM. We argue for a holistic and situated evaluation of ToM that breaks ToM into individual components and treats LLMs as agents that are physically situated in environments and socially situated in interactions with humans. Such situated evaluation provides a more comprehensive assessment of mental states and potentially mitigates the risk of shortcuts and data leakage. We further present a pilot study in a grid world setup as a proof of concept. We hope this position paper can facilitate future research to integrate ToM with LLMs and offer an intuitive means for researchers to better position their work in the landscape of ToM.
- “Theory of Mind” (ToM; people’s ability to infer and use information about others’ mental states) varies across cultures. In four studies (N = 881), including two preregistered replications, we show that social class predicts performance on ToM tasks. In Studies 1A and 1B, we provide new evidence for a relationship between social class and emotion perception: Higher-class individuals performed more poorly than their lower-class counterparts on the Reading the Mind in the Eyes Test, which has participants infer the emotional states of targets from images of their eyes. In Studies 2A and 2B, we provide the first evidence that social class predicts visual perspective taking: Higher-class individuals made more errors than lower-class individuals in the Director Task, which requires participants to assume the visual perspective of another person. Potential mechanisms linking social class to performance in different ToM domains, as well as implications for deficiency-centered perspectives on low social class, are discussed.
- Human-robot interaction has played an increasingly significant role in more recent research involving the Theory of Mind (ToM). As the use of robot facilitators increases, questions arise regarding the implications of their involvement in a research setting. This work addresses the effects of a humanoid robot facilitator in a ToM assessment. This paper analyzes subjects’ performances on tasks meant to test ToM as those tasks are delivered by human or robot facilitators. Various modalities of data were collected: performance on ToM tasks, subjects’ perceptions of the robot, results from a ToM survey, and response duration. This paper highlights the effects of human-robot interactions in ToM assessments, which ultimately leads to a discussion on the effectiveness of using robot facilitators in future human-subject research.

