Views Are My Own, but Also Yours: Benchmarking Theory of Mind Using Common Ground

Soubki, Adil; Murzaku, John; Yousefi_Jordehi, Arash; Zeng, Peter; Markowska, Magdalena; Mirroshandel, Seyed Abolghasem; Rambow, Owen

doi:10.18653/v1/2024.findings-acl.880

Citation Details

Evaluating the theory of mind (ToM) capabilities of language models (LMs) has recently received a great deal of attention. However, many existing benchmarks rely on synthetic data, which risks misaligning the resulting experiments with human behavior. We introduce the first ToM dataset based on naturally occurring spoken dialogs, Common-ToM, and show that LMs struggle to demonstrate ToM. We then show that integrating a simple, explicit representation of beliefs improves LM performance on Common-ToM.

Award ID(s): 2125295

PAR ID: 10537323

Author(s) / Creator(s): Soubki, Adil; Murzaku, John; Yousefi_Jordehi, Arash; Zeng, Peter; Markowska, Magdalena; Mirroshandel, Seyed Abolghasem; Rambow, Owen

Publisher / Repository: Association for Computational Linguistics

Date Published: 2024-01-01

Page Range / eLocation ID: 14815 to 14823

Format(s): Medium: X

Location: Bangkok, Thailand and virtual meeting

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.18653/v1/2024.findings-acl.880
