Title: Virality of Information Diffusion on WhatsApp
This paper explores the structural characteristics of information dissemination on WhatsApp, focusing on the concepts of "breadth" and "depth." "Breadth" refers to the maximum number of groups to which a message is simultaneously forwarded, while "depth" indicates the maximum number of times a message is forwarded. Using a dataset from 1,600 groups in India comprising over 760,000 messages spanning text, images, and videos, this study employs hashing techniques to track message propagation in a privacy-preserving manner. Analysis of cascade size, breadth, and depth reveals significant trends: text and video messages tend to generate larger cascades than images. In contrast to public platforms, depth emerges as the primary driver of widespread information dissemination (which could be due to WhatsApp's limits on message broadcasts). The message types also differ: depth is the decisive factor in text and video cascades, while both breadth and depth contribute significantly to image cascades. These findings underscore the importance of considering structural nuances when studying how information spreads on private messaging platforms, and they offer insights for dissemination strategies and their management in digital communication.
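As a rough illustration of the three cascade measures named in the abstract, the sketch below groups forwarding events by a content hash and then computes size, breadth, and depth for each cascade. It is not the paper's pipeline: the event format `(event_id, parent_id, payload)`, the use of SHA-256, and the names `content_hash` and `cascade_metrics` are assumptions made only for this example, and the sketch assumes the parent of each forward is known.

```python
import hashlib
from collections import defaultdict


def content_hash(payload: bytes) -> str:
    # Assumption: identical content is matched by a SHA-256 digest,
    # so only the digest (not the message itself) needs to be kept.
    return hashlib.sha256(payload).hexdigest()


def cascade_metrics(events):
    """Compute size, breadth, and depth per cascade.

    events: iterable of (event_id, parent_id, payload) tuples (hypothetical
    format). parent_id is None for the first appearance of a message.
    """
    children = defaultdict(list)   # parent event -> forwarded copies
    roots = defaultdict(list)      # content hash -> root events
    for event_id, parent_id, payload in events:
        if parent_id is None:
            roots[content_hash(payload)].append(event_id)
        else:
            children[parent_id].append(event_id)

    metrics = {}
    for key, root_ids in roots.items():
        size = breadth = depth = 0
        stack = [(root, 0) for root in root_ids]
        while stack:
            node, level = stack.pop()
            size += 1                          # every copy counts toward size
            depth = max(depth, level)          # longest forwarding chain
            kids = children.get(node, [])
            breadth = max(breadth, len(kids))  # widest single fan-out
            stack.extend((kid, level + 1) for kid in kids)
        metrics[key] = {"size": size, "breadth": breadth, "depth": depth}
    return metrics


example_events = [
    ("a", None, b"hello"),
    ("b", "a", b"hello"),
    ("c", "a", b"hello"),
    ("d", "b", b"hello"),
    ("e", None, b"other"),
]
example_metrics = cascade_metrics(example_events)
```

In this toy input, the message `b"hello"` is forwarded from `a` to two groups at once and then once more from `b`, so its cascade has one wide step and a chain of two forwards. Here size counts the original post as well as its forwards; a different convention would shift the number by one.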
Award ID(s):
2318844
PAR ID:
10526762
Author(s) / Creator(s):
; ;
Publisher / Repository:
10th International Conference on Computational Social Science (IC2S2)
Date Published:
Format(s):
Medium: X
Location:
Philadelphia, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. We study how communication platforms can improve social learning without censoring or fact-checking messages, when they have members who deliberately and/or inadvertently distort information. Message fidelity depends on social network depth (how many times information can be relayed) and breadth (the number of others with whom a typical user shares information). We characterize how the expected number of true minus false messages depends on breadth and depth of the network and the noise structure. Message fidelity can be improved by capping depth or, if that is not possible, limiting breadth, e.g., by capping the number of people to whom someone can forward a given message. Although caps reduce total communication, they increase the fraction of received messages that have traveled shorter distances and have had less opportunity to be altered, thereby increasing the signal-to-noise ratio. 
  2. Large cascades can develop in online social networks as people share information with one another. Though simple reshare cascades have been studied extensively, the full range of cascading behaviors on social media is much more diverse. Here we study how diffusion protocols, or the social exchanges that enable information transmission, affect cascade growth, analogous to the way communication protocols define how information is transmitted from one point to another. Studying 98 of the largest information cascades on Facebook, we find a wide range of diffusion protocols - from cascading reshares of images, which use a simple protocol of tapping a single button for propagation, to the ALS Ice Bucket Challenge, whose diffusion protocol involved individuals creating and posting a video, and then nominating specific others to do the same. We find recurring classes of diffusion protocols, and identify two key counterbalancing factors in the construction of these protocols, with implications for a cascade's growth: the effort required to participate in the cascade, and the social cost of staying on the sidelines. Protocols requiring greater individual effort slow down a cascade's propagation, while those imposing a greater social cost of not participating increase the cascade's adoption likelihood. The predictability of transmission also varies with protocol. But regardless of mechanism, the cascades in our analysis all have a similar reproduction number (≈1.8), meaning that lower rates of exposure can be offset with higher per-exposure rates of adoption. Last, we show how a cascade's structure can not only differentiate these protocols, but also be modeled through branching processes. Together, these findings provide a framework for understanding how a wide variety of information cascades can achieve substantial adoption across a network. 
  3. We examine a large dialog corpus obtained from the conversation history of a single individual with 104 conversation partners. The corpus consists of half a million instant messages, across several messaging platforms. We focus our analyses on seven speaker attributes, each of which partitions the set of speakers, namely: gender; relative age; family member; romantic partner; classmate; co-worker; and native to the same country. In addition to the content of the messages, we examine conversational aspects such as the time messages are sent, messaging frequency, psycholinguistic word categories, linguistic mirroring, and graph-based features reflecting how people in the corpus mention each other. We present two sets of experiments predicting each attribute using (1) short context windows; and (2) a larger set of messages. We find that using all features leads to gains of 9-14% over using message text only. 
  4. The potential of DNA as an information storage medium is rapidly growing due to advances in DNA synthesis and sequencing. However, the chemical stability of DNA makes complete erasure of information encoded in DNA sequences difficult. Here, we encode information in a DNA information solution, a mixture of true message- and false message-encoded oligonucleotides, which enables rapid and permanent erasure of information. True messages are differentiated by their hybridization to a "truth marker" oligonucleotide, and only true messages can be read; binding of the truth marker can be effectively randomized even with a brief exposure to elevated temperature. We show that 8 separate bitmap images can be stably encoded and read after storage at 25 °C for 65 days with an average of over 99% correct information recall, which extrapolates to a half-life of over 15 years at 25 °C. Heating to 95 °C for 5 minutes, however, permanently erases the message.
  5. We propose a capacity-achieving scheme for private information retrieval (PIR) from databases (DBs) with heterogeneous storage constraints. In the PIR setting, a user queries a set of DBs to privately download a message, where privacy implies that no one DB can infer which message the user desires. Our PIR scheme uses an uncoded storage placement and we derive sufficient conditions to meet capacity in this design architecture. We translate the storage placement design to a "filling problem" where messages are partitioned into sub-messages and stored at subsets of DBs. We prove a set of necessary and sufficient conditions for the existence of the filling problem solution and design an iterative algorithm to find a filling problem solution. Our proposed algorithm requires at most a number of iterations equal to the number of DBs. Furthermore, we significantly reduce the number of sub-messages compared to the state-of-the-art PIR scheme, as our proposed PIR scheme requires that each message is split into a polynomial number of sub-messages with respect to the number of DBs.