skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Quantifying Intimacy in Language
Intimacy is a fundamental aspect of how we relate to others in social settings. Language encodes the social information of intimacy through both topics and other more subtle cues (such as linguistic hedging and swearing). Here, we introduce a new computational framework for studying expressions of the intimacy in language with an accompanying dataset and deep learning model for accurately predicting the intimacy level of questions (Pearson r = 0.87). Through analyzing a dataset of 80.5M questions across social media, books, and films, we show that individuals employ interpersonal pragmatic moves in their language to align their intimacy with social settings. Then, in three studies, we further demonstrate how individuals modulate their intimacy to match social norms around gender, social distance, and audience, each validating key findings from studies in social psychology. Our work demonstrates that intimacy is a pervasive and impactful social dimension of language.  more » « less
Award ID(s):
1850221 2007251
PAR ID:
10226972
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Page Range / eLocation ID:
5307 to 5326
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Offering condolence is a natural reaction to hearing someone’s distress. Individuals frequently express distress in social media, where some communities can provide support. However, not all condolence is equal—trite responses offer little actual support despite their good intentions. Here, we develop computational tools to create a massive dataset of 11.4M expressions of distress and 2.8M corresponding offerings of condolence in order to examine the dynamics of condolence online. Our study reveals widespread disparity in what types of distress receive supportive condolence rather than just engagement. Building on studies from social psychology, we analyze the language of condolence and develop a new dataset for quantifying the empathy in a condolence using appraisal theory. Finally, we demonstrate that the features of condolence individuals find most helpful online differ substantially in their features from those seen in interpersonal settings. 
    more » « less
  2. Given that most cues exchanged during a social interaction are nonverbal (e.g., facial expressions, hand gestures, body language), individuals who are blind are at a social disadvantage compared to their sighted peers. Very little work has explored sensory augmentation in the context of social assistive aids for individuals who are blind. The purpose of this study is to explore the following questions related to visual-to-vibrotactile mapping of facial action units (the building blocks of facial expressions): (1) How well can individuals who are blind recognize tactile facial action units compared to those who are sighted? (2) How well can individuals who are blind recognize emotions from tactile facial action units compared to those who are sighted? These questions are explored in a preliminary pilot test using absolute identification tasks in which participants learn and recognize vibrotactile stimulations presented through the Haptic Chair, a custom vibrotactile display embedded on the back of a chair. Study results show that individuals who are blind are able to recognize tactile facial action units as well as those who are sighted. These results hint at the potential for tactile facial action units to augment and expand access to social interactions for individuals who are blind. 
    more » « less
  3. Individuals experiencing unexpected distressing events, shocks, often rely on their social network for support. While prior work has shown how social networks respond to shocks, these studies usually treat all ties equally, despite differences in the support provided by different social relationships. Here, we conduct a computational analysis on Twitter that examines how responses to online shocks differ by the relationship type of a user dyad. We introduce a new dataset of over 13K instances of individuals' self-reporting shock events on Twitter and construct networks of relationship-labeled dyadic interactions around these events. By examining behaviors across 110K replies to shocked users in a pseudo-causal analysis, we demonstrate relationship-specific patterns in response levels and topic shifts. We also show that while well-established social dimensions of closeness such as tie strength and structural embeddedness contribute to shock responsiveness, the degree of impact is highly dependent on relationship and shock types. Our findings indicate that social relationships contain highly distinctive characteristics in network interactions, and that relationship-specific behaviors in online shock responses are unique from those of offline settings. 
    more » « less
  4. Previous studies, both in psychology and linguistics, have shown that individuals with mental illnesses show deviations from normal language use, that these differences can be used to make predictions, and used as a diagnostic tool. Recent studies have shown that machine learning can be used to predict people with mental illnesses based on their writing. However, little attention is paid to the interpretability of the machine learning models. In this talk we will describe our analysis of the machine learning models, the different language patterns that distinguish individuals having mental illnesses from a control group, and the associated privacy concerns. We use a dataset of Tweets that are collected from users who reported a diagnosis of a mental illnesses on Twitter. Given the self-reported nature of the dataset, it is possible that some of these individuals are actively talking about their mental illness on social media. We investigated if the machine learning models are detecting the active mentions of the mental illness or if they are detecting more complex language patterns. We then conducted a feature analysis by creating feature vectors using word unigrams, part of speech tags and word clusters and used feature importance measures and statistical methods to identify important features. This analysis serves two purposes: to understand the machine learning model, and to discover language patterns that would help in identifying people with mental illnesses. Finally, we conducted a qualitative analysis of the misclassifications to understand the potential causes for the misclassifications. 
    more » « less
  5. Text-to-image generative models have achieved unprecedented success in generating high-quality images based on natural language descriptions. However, it is shown that these models tend to favor specific social groups when prompted with neutral text descriptions (e.g., ‘a photo of a lawyer’). Following Zhao et al. (2021), we study the effect on the diversity of the generated images when adding ethical intervention that supports equitable judgment (e.g., ‘if all individuals can be a lawyer irrespective of their gender’) in the input prompts. To this end, we introduce an Ethical NaTural Language Interventions in Text-to-Image GENeration (ENTIGEN) benchmark dataset to evaluate the change in image generations conditional on ethical interventions across three social axes – gender, skin color, and culture. Through CLIP-based and human evaluation on minDALL.E, DALL.E-mini and Stable Diffusion, we find that the model generations cover diverse social groups while preserving the image quality. In some cases, the generations would be anti-stereotypical (e.g., models tend to create images with individuals that are perceived as man when fed with prompts about makeup) in the presence of ethical intervention. Preliminary studies indicate that a large change in the model predictions is triggered by certain phrases such as ‘irrespective of gender’ in the context of gender bias in the ethical interventions. We release code and annotated data at https://github.com/Hritikbansal/entigen_emnlp. 
    more » « less