-
Topics in conversations depend in part on the type of interpersonal relationship between speakers, such as friendship, kinship, or romance. Identifying these relationships can provide a rich description of how individuals communicate and reveal how relationships influence the way people share information. Using a dataset of more than 9.6M dyads of Twitter users, we show how relationship types influence language use, topic diversity, communication frequencies, and diurnal patterns of conversations. These differences can be used to predict the relationship between two users, with the best predictive model achieving a macro F1 score of 0.70. We also demonstrate how relationship types influence communication dynamics through the task of predicting future retweets. Adding relationships as a feature to a strong baseline model increases F1 and recall by 1% and 2%, respectively. The results of this study suggest relationship types have the potential to provide new insights into how communication and information diffusion occur in social networks.
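For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a macro F1 score is computed: the per-class F1 scores are averaged with equal weight, so rare relationship types count as much as common ones. The label names and predictions are hypothetical illustrations, not data from the study, and the function is not the authors' evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical relationship labels for six user pairs (illustration only).
y_true = ["friend", "friend", "family", "romance", "family", "romance"]
y_pred = ["friend", "family", "family", "romance", "family", "friend"]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

In this toy example the per-class F1 scores are 0.5 (friend), 0.8 (family), and about 0.667 (romance), and the macro F1 is their plain average.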
-
Online conversations can go in many directions: some turn out poorly due to antisocial behavior, while others turn out positively to the benefit of all. Research on improving online spaces has focused primarily on detecting and reducing antisocial behavior. Yet we know little about positive outcomes in online conversations and how to increase them: is a prosocial outcome simply the lack of antisocial behavior or something more? Here, we examine how conversational features lead to prosocial outcomes within online discussions. We introduce a series of new theory-inspired metrics to define prosocial outcomes such as mentoring and esteem enhancement. Using a corpus of 26M Reddit conversations, we show that these outcomes can be forecasted from the initial comment of an online conversation, with the best model providing a relative 24% improvement over human forecasting performance at ranking conversations for predicted outcome. Our results indicate that platforms can use these early cues in their algorithmic ranking of early conversations to prioritize better outcomes.
-
Individuals signal aspects of their identity and beliefs through linguistic choices. Studying these choices in aggregate allows us to examine large-scale attitude shifts within a population. Here, we develop computational methods to study word choice within a sociolinguistic lexical variable (alternate words used to express the same concept) in order to test for change in attitudes in the United States towards sexuality and gender. We examine two variables: i) referents to significant others, such as the word “partner” and ii) referents to an indefinite person, both of which could optionally be marked with gender. The linguistic choices in each variable allow us to study increased rates of acceptance of gay marriage and gender equality, respectively. In longitudinal analyses across Twitter and Reddit over 87M messages, we demonstrate that attitudes are changing but that these changes are driven by specific demographics within the United States. Further, in a quasi-causal analysis, we show that passages of Marriage Equality Acts in different states are drivers of linguistic change.
-
Certainty and uncertainty are fundamental to science communication. Hedges have widely been used as proxies for uncertainty. However, certainty is a complex construct, with authors expressing not only the degree but the type and aspects of uncertainty in order to give the reader a certain impression of what is known. Here, we introduce a new study of certainty that models both the level and the aspects of certainty in scientific findings. Using a new dataset of 2167 annotated scientific findings, we demonstrate that hedges alone account for only a partial explanation of certainty. We show that both the overall certainty and individual aspects can be predicted with pre-trained language models, providing a more complete picture of the author’s intended communication. Downstream analyses on 431K scientific findings from news and scientific abstracts demonstrate that modeling sentence-level and aspect-level certainty is meaningful for areas like science communication. Both the model and datasets used in this paper are released at https://blablablab.si.umich.edu/projects/certainty/.
-
New words are regularly introduced to communities, yet not all of these words persist in a community's lexicon. Among the many factors contributing to lexical change, we focus on the understudied effect of social networks. We conduct a large-scale analysis of over 80k neologisms in 4420 online communities across a decade. Using Poisson regression and survival analysis, our study demonstrates that the community's network structure plays a significant role in lexical change. Apart from overall size, properties including dense connections, the lack of local clusters and more external contacts promote lexical innovation and retention. Unlike offline communities, these topic-based communities do not experience strong lexical levelling despite increased contact but accommodate more niche words. Our work provides support for the sociolinguistic hypothesis that lexical change is partially shaped by the structure of the underlying network but also uncovers findings specific to online communities.
-
Intimacy is a fundamental aspect of how we relate to others in social settings. Language encodes the social information of intimacy through both topics and other more subtle cues (such as linguistic hedging and swearing). Here, we introduce a new computational framework for studying expressions of intimacy in language with an accompanying dataset and deep learning model for accurately predicting the intimacy level of questions (Pearson r = 0.87). Through analyzing a dataset of 80.5M questions across social media, books, and films, we show that individuals employ interpersonal pragmatic moves in their language to align their intimacy with social settings. Then, in three studies, we further demonstrate how individuals modulate their intimacy to match social norms around gender, social distance, and audience, each validating key findings from studies in social psychology. Our work demonstrates that intimacy is a pervasive and impactful social dimension of language.
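As a reminder of what the Pearson r figure above measures, the following is a minimal sketch of the correlation between human ratings and model predictions. The two score lists are made-up values for illustration only; they are not from the paper's dataset, and this is not the authors' evaluation code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical intimacy scores for five questions (illustration only).
human_ratings = [1.0, 2.0, 3.0, 4.0, 5.0]
model_scores = [1.2, 1.9, 3.4, 3.8, 5.1]
r = pearson_r(human_ratings, model_scores)
print(f"Pearson r = {r:.3f}")
```

A value near 1 means the model ranks and spaces the questions much as the human raters do; a value near 0 means no linear relationship.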
-
Offering condolence is a natural reaction to hearing someone’s distress. Individuals frequently express distress in social media, where some communities can provide support. However, not all condolence is equal: trite responses offer little actual support despite their good intentions. Here, we develop computational tools to create a massive dataset of 11.4M expressions of distress and 2.8M corresponding offerings of condolence in order to examine the dynamics of condolence online. Our study reveals widespread disparity in what types of distress receive supportive condolence rather than just engagement. Building on studies from social psychology, we analyze the language of condolence and develop a new dataset for quantifying the empathy in a condolence using appraisal theory. Finally, we demonstrate that the features of condolence individuals find most helpful online differ substantially from those seen in interpersonal settings.