Title: Integrating emotional expressions with utterances in pragmatic inference
Human communication involves far more than words; speakers’ utterances are often accompanied by various kinds of emotional expressions. How do listeners represent and integrate these distinct sources of information to make communicative inferences? We first show that people, as listeners, integrate both verbal and emotional information when inferring true states of the world and others’ communicative goals, and then present computational models that formalize these inferences by considering different ways in which these signals might be generated. Results suggest that while listeners understand that utterances and emotional expressions are generated by a balance of speakers’ informational and social goals, they additionally consider the possibility that emotional expressions are noncommunicative signals that directly reflect the speaker’s internal states. These results are consistent with the predictions of a probabilistic model that integrates goal inferences with linguistic and emotional signals, moving us towards a more complete formal theory of human communicative reasoning.
Award ID(s):
1911790
PAR ID:
10279466
Journal Name:
Proceedings of the Annual Conference of the Cognitive Science Society
ISSN:
1069-7977
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The majority of research on infants’ and children’s understanding of emotional expressions has focused on their abilities to use emotional expressions to infer how other people feel. However, an emerging body of work suggests that emotional expressions support rich, powerful inferences not just about emotional states but also about other unobserved states, such as hidden events in the physical world and mental states of other people (e.g., beliefs and desires). Here we argue that infants and children harness others’ emotional expressions as a source of information for learning about the physical and social world broadly. This “emotion as information” framework integrates affective, developmental, and computational cognitive sciences, extending the scope of signals that count as “information” in early learning. 
  2. Children do not learn language from passively analyzing correlations between language and observations, but from interaction with caregivers or peers. The non-nativist approach claims that the main driver of language learning should be to achieve communicative goals. Imitation, on the other hand, is another natural desire that many argue influences language learning. However, there are still gaps in the research on what roles communicative goals and imitating linguistic input play in language acquisition, due to the difficulty of performing comprehensive experiments with human learners. In this paper, we propose a computational framework using simulated experiments that allows us to compare the roles of the two drivers. Specifically, we simulate a two-way communication game between a speaker, corresponding to a language learner, and a listener, corresponding to a caregiver or teacher. The speaker's communicative goals are modeled as rewards for successful completion of a referential game, and imitation is performed by mimicking feedback from the listener. The listener adaptively chooses to give feedback and makes choices based on the speaker's utterances. With empirical results on naturalistic visual and language data, we find that communicative goals play an important role in driving language learning, whereas imitation accelerates the learning process. We also find that (1) models trained with communicative goals tend to use minimal vocabulary and utterances and overextend them to concepts outside the original word meanings; (2) the strategy with which the listener provides feedback also influences the learning results and speed. Code and data for replicating the experiments are available (https://bit.ly/interactgym) to spur future research on models for computational studies of language learning. 
  3. We present a game-theoretic model of pragmatics that we call ReCo (for Regularized Conventions). This model formulates pragmatic communication as a game in which players are rewarded for communicating successfully and penalized for deviating from a shared, “default” semantics. As a result, players assign utterances context-dependent meanings that jointly optimize communicative success and naturalness with respect to speakers’ and listeners’ background knowledge of language. By using established game-theoretic tools to compute equilibrium strategies for this game, we obtain principled pragmatic language generation procedures with formal guarantees of communicative success. Across several datasets capturing real and idealized human judgments about pragmatic implicature, ReCo matches, or slightly improves upon, predictions made by Iterated Best Response and Rational Speech Acts models of language understanding. 
  4. Multimodal dialogue involving multiple participants presents complex computational challenges, primarily due to the rich interplay of diverse communicative modalities including speech, gesture, action, and gaze. These modalities interact in complex ways that traditional dialogue systems often struggle to accurately track and interpret. To address these challenges, we extend the textual enrichment strategy of Dense Paraphrasing (DP), by translating each nonverbal modality into linguistic expressions. By normalizing multimodal information into a language-based form, we hope to both simplify the representation for and enhance the computational understanding of situated dialogues. We show the effectiveness of the dense paraphrased language form by evaluating instruction-tuned Large Language Models (LLMs) against the Common Ground Tracking (CGT) problem using a publicly available collaborative problem-solving dialogue dataset. Instead of using multimodal LLMs, the dense paraphrasing technique represents the dialogue information from multiple modalities in a compact and structured machine-readable text format that can be directly processed by the language-only models. We leverage the capability of LLMs to transform machine-readable paraphrases into human-readable paraphrases, and show that this process can further improve the result on the CGT task. Overall, the results show that augmenting the context with dense paraphrasing effectively facilitates the LLMs' alignment of information from multiple modalities, and in turn largely improves the performance of common ground reasoning over the baselines. Our proposed pipeline with original utterances as input context already achieves comparable results to the baseline that utilized decontextualized utterances which contain rich coreference information. When also using the decontextualized input, our pipeline largely improves the performance of common ground reasoning over the baselines. 
     We discuss the potential of DP to create a robust model that can effectively interpret and integrate the subtleties of multimodal communication, thereby improving dialogue system performance in real-world settings.
  5. Emotion recognition in social situations is a complex task that requires integrating information from both facial expressions and the situational context. While traditional approaches to automatic emotion recognition have focused on decontextualized signals, recent research emphasizes the importance of context in shaping emotion perceptions. This paper contributes to the emerging field of context-based emotion recognition by leveraging psychological theories of human emotion perception to inform the design of automated methods. We propose an approach that combines emotion recognition methods with Bayesian Cue Integration (BCI) to integrate emotion inferences from decontextualized facial expressions and contextual knowledge inferred via large language models. We test this approach in the context of interpreting facial expressions during a social task, the prisoner’s dilemma. Our results provide clear support for BCI across a range of automatic emotion recognition methods. The best automated method achieved results comparable to human observers, suggesting the potential for this approach to advance the field of affective computing.
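
The cue-integration idea named in item 5 can be illustrated with a minimal sketch. It assumes one common formulation, in which the face and context cues are conditionally independent given the emotion, so that P(e | face, context) is proportional to P(e | face) * P(e | context) / P(e). The function name, emotion labels, and probabilities below are hypothetical and are not taken from the paper, which may define the method differently.

```python
def integrate_cues(p_face, p_context, prior):
    """Combine two posteriors over the same emotion labels.

    Assumes the cues are conditionally independent given the emotion:
    P(e | face, context) is proportional to P(e | face) * P(e | context) / P(e).
    All three arguments are dicts mapping emotion label -> probability.
    """
    if not (p_face.keys() == p_context.keys() == prior.keys()):
        raise ValueError("all distributions must share the same labels")
    if any(p <= 0 for p in prior.values()):
        raise ValueError("prior probabilities must be positive")

    # Multiply the per-cue posteriors and divide out the prior, which
    # would otherwise be counted twice.
    unnormalized = {e: p_face[e] * p_context[e] / prior[e] for e in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("cues are mutually incompatible (zero total mass)")

    # Renormalize so the result is a proper probability distribution.
    return {e: v / total for e, v in unnormalized.items()}


# Hypothetical numbers: the face alone slightly favors joy, while the
# context (for example, being defected against) strongly favors anger.
prior = {"joy": 0.5, "anger": 0.5}
p_face = {"joy": 0.6, "anger": 0.4}
p_context = {"joy": 0.2, "anger": 0.8}

combined = integrate_cues(p_face, p_context, prior)
print(combined)
```

With these made-up inputs the combined distribution favors anger, since the context evidence outweighs the weaker facial evidence. Dividing by the prior matters only when the prior is non-uniform; with a uniform prior the result reduces to a normalized product of the two cue posteriors.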