Creators/Authors contains: "Pennebaker, James W."

  1. Abstract

    To what degree can we determine people's connections with groups through the language they use? In recent years, large archives of behavioral data from social media communities have become available to social scientists, opening the possibility of tracking naturally occurring group identity processes. A feature of most digital groups is that they rely exclusively on the written word. Across 3 studies, we developed and validated a language-based metric of group identity strength and demonstrated its potential in tracking identity processes in online communities. In Studies 1a–1c, 873 people wrote about their connections to various groups (country, college, or religion). A total of 2 language markers of group identity strength were found: high affiliation (more words like we, togetherness) and low cognitive processing or questioning (fewer words like think, unsure). Using these markers, a language-based unquestioning affiliation index was developed and applied to in-class stream-of-consciousness essays of 2,161 college students (Study 2). Greater levels of unquestioning affiliation expressed in language predicted not only self-reported university identity but also students’ likelihood of remaining enrolled in college a year later. In Study 3, the index was applied to naturalistic Reddit conversations of 270,784 people in 2 online communities of supporters of the 2016 presidential candidates—Hillary Clinton and Donald Trump. The index predicted how long people would remain in the group (3a) and revealed temporal shifts mirroring members’ joining and leaving of groups (3b). Together, the studies highlight the promise of a language-based approach for tracking and studying group identity processes in online groups.

  2. Using archived social media data, the language signatures of people going through breakups were mapped. Text analyses were conducted on 1,027,541 posts from 6,803 Reddit users who had posted about their breakups. The posts include users’ Reddit history in the 2 years surrounding their breakups across the various domains of their life, not just posts pertaining to their relationship. Language markers of an impending breakup were evident 3 months before the event, peaking on the week of the breakup and returning to baseline 6 months later. Signs included an increase in I-words, we-words, and cognitive processing words (characteristic of depression, collective focus, and the meaning-making process, respectively) and drops in analytic thinking (indicating more personal and informal language). The patterns held even when people were posting to groups unrelated to breakups and other relationship topics. People who posted about their breakup for longer time periods were less well-adjusted a year after their breakup compared to short-term posters. The language patterns seen for breakups replicated for users going through divorce (n = 5,144; 1,109,867 posts) or other types of upheavals (n = 51,357; 11,081,882 posts). The cognitive underpinnings of emotional upheavals are discussed using language as a lens.

  3. Abstract

    To date we know little about natural emotion word repertoires, and whether or how they are associated with emotional functioning. Principles from linguistics suggest that the richness or diversity of individuals’ actively used emotion vocabularies may correspond with their typical emotion experiences. The current investigation measures active emotion vocabularies in participant-generated natural speech and examines their relationships to individual differences in mood, personality, and physical and emotional well-being. Study 1 analyzes stream-of-consciousness essays by 1,567 college students. Study 2 analyzes public blogs written by over 35,000 individuals. The studies yield consistent findings that emotion vocabulary richness corresponds broadly with experience. Larger negative emotion vocabularies correlate with more psychological distress and poorer physical health. Larger positive emotion vocabularies correlate with higher well-being and better physical health. Findings support theories linking language use and development with lived experience and may have future clinical implications pending further research.

  4. Individuals who are “strongly fused” with a group view the group as self-defining. As such, they should be particularly reluctant to leave it. For the first time, we investigate the implications of identity fusion for university retention. We found that students who were strongly fused with their university (+1 SD) were 7–9 percentage points more likely than weakly fused students (−1 SD) to remain in school up to a year later. Fusion with university predicted subsequent retention in four samples (N = 3,193) and held while controlling for demographics, personality, prior academic performance, and belonging uncertainty. Interestingly, fusion with university was largely unrelated to grades, suggesting that identity fusion provides a novel pathway to retention independent of established pathways like academic performance. We discuss the theoretical and practical implications of these findings.
  5. From many perspectives, the election of Donald Trump was seen as a departure from long-standing political norms. An analysis of Trump’s word use in the presidential debates and speeches indicated that he was exceptionally informal but, at the same time, spoke with a sense of certainty. Indeed, he is lower in analytic thinking and higher in confidence than almost any previous American president. Closer analyses of linguistic trends of presidential language indicate that Trump’s language is consistent with long-term linear trends, demonstrating that he is not as much an outlier as he initially seems. Across multiple corpora from the American presidents, non-US leaders, and legislative bodies spanning decades, there has been a general decline in analytic thinking and a rise in confidence in most political contexts, with the largest and most consistent changes found in the American presidency. The results suggest that certain aspects of the language style of Donald Trump and other recent leaders reflect long-evolving political trends. Implications of the changing nature of popular elections and the role of media are discussed.

  6. Objective

    To understand what terms people seeking information about gout use most frequently in online searches and to explore the psychological and emotional tone of these searches.


    Methods

    A large de‐identified data set of search histories from major search engines was analyzed. Participants who searched for gout (n = 1,117), arthritis (arthritis search control group, age and sex‐matched, n = 2,036), and a random set of age and sex‐matched participants (general control group, n = 2,150) were included. Searches were analyzed using Meaning Extraction Helper and Linguistic Inquiry and Word Count.


    Results

    The most frequent unique searches in the gout search group included gout‐related and food‐related terms. Those who searched for gout were most likely to search for words related to eating or avoidance. In contrast, those who searched for arthritis were more likely to search for disease‐ or health‐related words. Compared with the general control group, higher information seeking was observed for the gout and arthritis search groups. Compared with the general control group, both the gout and arthritis search groups searched for more food‐related words and fewer leisure and sex‐related words. The searches of both the gout and arthritis search groups were lower in positivity and higher in the frequency of sadness‐related words.


    Conclusion

    The perception of gout as a condition managed by dietary strategies aligns with online information seeking about the disease and its management. In contrast, people searching for information about arthritis focus more on medical strategies. Linguistic analyses reflect greater disability in social and leisure activities and lower positive emotion for those searching for gout or arthritis.
