Title: Look Who's Talking: Inferring Speaker Attributes from Personal Longitudinal Dialog
We examine a large dialog corpus obtained from the conversation history of a single individual with 104 conversation partners. The corpus consists of half a million instant messages, across several messaging platforms. We focus our analyses on seven speaker attributes, each of which partitions the set of speakers, namely: gender; relative age; family member; romantic partner; classmate; co-worker; and native to the same country. In addition to the content of the messages, we examine conversational aspects such as the time messages are sent, messaging frequency, psycholinguistic word categories, linguistic mirroring, and graph-based features reflecting how people in the corpus mention each other. We present two sets of experiments predicting each attribute using (1) short context windows; and (2) a larger set of messages. We find that using all features leads to gains of 9-14% over using message text only.
Award ID(s):
1815291
NSF-PAR ID:
10111348
Journal Name:
Proceedings of the 20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Existing end-to-end secure messaging applications trust a single service provider to deliver messages in a consistent order to a consistent group of conversation members. We propose a protocol that removes this single point of failure by using multiple service providers, enforcing conversation integrity as long as at least one of the N service providers behaves honestly. However, this approach could increase the number of entities that learn the metadata for a conversation. In this work we discuss the challenges and provide a protocol that limits the metadata leakage to that of existing messaging applications while still providing strong conversation integrity.
  2. The way high school chemistry curricula are structured has the potential to convey consequential messages about knowledge and knowing to students and teachers. If a curriculum is built around practicing skills and recalling facts to reach “correct” answers, it is unlikely class activities will be seen (by students or the teacher) as opportunities to figure out causes for phenomena. Our team of teachers and researchers is working to understand how enactment of transformed curricular materials can support high school chemistry students in making sense of perplexing, relatable phenomena. Given this goal, we were surprised to see that co-developers who enacted our materials overwhelmingly emphasized the importance of acquiring true facts/skills when writing weekly reflections. Recognition that teachers’ expressed aims did not align with our stated goal of “supporting molecular-level sensemaking” led us to examine whether the tacit epistemological commitments reflected by our materials were, in fact, consistent with a course focused on figuring out phenomena. We described several aspects of each lesson in our two-semester curriculum including: the role of phenomena in lesson activities, the extent to which lessons were 3-dimensional, the role of student ideas in class dialogue, and who established coherence between lessons. Triangulation of these lesson features enabled us to infer messages about valued knowledge products and processes materials had the potential to send. We observed that our materials commonly encouraged students to mimic the structure of science practices for the purpose of being evaluated by the teacher. That is, students were asked to “go through the motions” of explaining, modeling etc. but had little agency regarding the sorts of models and explanations they found productive in their class community. This study serves to illustrate the importance of surfacing the tacit epistemological commitments that guide curriculum development. 
Additionally, it extends existing scholarship on epistemological messaging by considering curricular materials as potentially consequential sources of messages.
  3. Solomon, Denise Haunani ; Brinberg, Miriam ; Bodie, Graham ; Jones, Susanne ; Ram, Nilam (Ed.)
    Conversations between people are where, among other things, stressors are amplified and attenuated, conflicts are entrenched and resolved, and goals are advanced and thwarted. What happens in dyads’ back-and-forth exchanges to produce such consequential and varied outcomes? Although numerous theories in communication and in social psychology address this question, empirical tests of these theories often operationalize conversational behavior using either discrete messages or overall features of the conversation. Dynamic systems theories and methods provide opportunities to examine the interdependency, self-stabilization, and self-organization processes that manifest in conversations over time. The dynamic dyadic systems perspective exemplified by the articles in this special issue (a) focuses inquiry on the turn-to-turn, asynchronous exchange of messages between two partners, (b) emphasizes behavioral patterns within and the structural and temporal organization of conversations, and (c) adapts techniques used in analysis of intensive longitudinal data to identify and operationalize those dynamic patterns. As an introduction to the special issue, this paper describes a dynamic dyadic systems perspective on conversation and discusses directions for future research, such as applications to human-computer interaction, family communication patterns, health care interventions, and group deliberation.
  4. Online conversations can go in many directions: some turn out poorly due to antisocial behavior, while others turn out positively to the benefit of all. Research on improving online spaces has focused primarily on detecting and reducing antisocial behavior. Yet we know little about positive outcomes in online conversations and how to increase them—is a prosocial outcome simply the lack of antisocial behavior or something more? Here, we examine how conversational features lead to prosocial outcomes within online discussions. We introduce a series of new theory-inspired metrics to define prosocial outcomes such as mentoring and esteem enhancement. Using a corpus of 26M Reddit conversations, we show that these outcomes can be forecasted from the initial comment of an online conversation, with the best model providing a relative 24% improvement over human forecasting performance at ranking conversations for predicted outcome. Our results indicate that platforms can use these early cues in their algorithmic ranking of early conversations to prioritize better outcomes. 
  5. Delays in response to mobile messages can cause negative emotions in message senders and can affect an individual's social relationships. Recipients, too, feel pressure to respond even during inopportune moments. A messaging assistant that could respond with relevant contextual information on behalf of individuals while they are unavailable might reduce the pressure to respond immediately and help put the sender at ease. By modelling attentiveness to messaging, we aim to (1) predict instances when a user is not able to attend to an incoming message within a reasonable time and (2) identify what contextual factors can explain the user's attentiveness (or lack thereof) to messaging. In this work, we investigate two approaches to modelling attentiveness: a general approach, in which data from a group of users is combined to form a single model for all users; and a personalized approach, in which an individual model is created for each user. Evaluating both models, we observed that on average, with just seven days of training data, the personalized model can outperform the generalized model in terms of both accuracy and F-measure for predicting inattentiveness. Further, we observed that in the majority of cases, the messaging patterns identified by the attentiveness models varied widely across users. For example, the top feature in the generalized model appeared in the top five features for only 41% of the individual personalized models.