Title: User Demographics and Censorship on Sina Weibo
This paper investigates the relationship between user demographics and the frequency of censored posts (weibos) on Sina Weibo. Our results indicate that demographics such as location, gender, and paid-for features offer little predictive power, but they help explain how censorship is applied on social media. Using a dataset of 226 million weibos collected in 2012, we apply a binomial regression model to evaluate how well user demographics identify users who may be targeted for censorship. Our results suggest that male users who are verified (that is, who pay for mobile and security features) are more likely to be censored than female users or users who are not verified. In addition, users from provinces such as Hong Kong, Macao, and Beijing are censored more heavily than users from other provinces in China over the same period.
Award ID(s):
1704113
NSF-PAR ID:
10463466
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Proceedings of the 54th Hawaii International Conference on System Sciences
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Internet censorship imposes restrictions on what information can be publicized or viewed on the Internet. According to Freedom House’s annual Freedom on the Net report, more than half the world’s Internet users now live in a place where the Internet is censored or restricted. China has built the world’s most extensive and sophisticated online censorship system. In this paper, we describe a new corpus of censored and uncensored social media tweets from a Chinese microblogging website, Sina Weibo, collected by tracking posts that mention ‘sensitive’ topics or are authored by ‘sensitive’ users. We use this corpus to build a neural network classifier to predict censorship. Our model achieves an accuracy of 88.50% using only linguistic features. We discuss these features in detail and hypothesize that they could potentially be used for censorship circumvention.
  2. This paper studies how the linguistic components of blogposts collected from Sina Weibo, a Chinese microblogging platform, might affect the blogposts’ likelihood of being censored. Our results are consistent with the Collective Action Potential (CAP) theory of King et al. (2013), which states that a blogpost’s potential to cause riots or assemblies in real life is the key determinant of whether it is censored. Although there is no definitive measure of this construct, the linguistic features that we identify as discriminative are consistent with the CAP theory. We build a classifier that significantly outperforms non-expert humans in predicting whether a blogpost will be censored. The crowdsourcing results suggest that while humans tend to see censored blogposts as more controversial and more likely to trigger action in real life than their uncensored counterparts, they generally cannot make a better guess than our model when it comes to ‘reading the mind’ of the censors in deciding whether a blogpost should be censored. We do not claim that censorship is determined only by linguistic features; many other factors contribute to censorship decisions. The focus of the present paper is on the linguistic form of blogposts. Our work suggests that it is possible to use linguistic properties of social media posts to automatically predict whether they are going to be censored.
  3. For the past 20 years, China has increasingly restricted the access of minors to online games using addiction prevention systems (APSes). At the same time, and through different means, i.e., the Great Firewall of China (GFW), it also restricts the general population's access to the international Internet. This paper studies how these restrictions affect young online gamers and their evasion efforts. We present results from surveys (n = 2,415) and semi-structured interviews (n = 35) revealing viable, commonly deployed APS evasion techniques and APS vulnerabilities. We conclude that the APS does not work as designed, even against very young online game players, and can act as a censorship evasion training ground for tomorrow’s adults, through familiarization with and normalization of general evasion techniques, and desensitization to their dangers. Findings from these studies may further inform developers of censorship-resistant systems about the perceptions and evasion strategies of their prospective users, and help design tools that leverage services and platforms popular among the censored audience.
  4. Refraction networking is a next-generation censorship circumvention approach that locates proxy functionality in the network itself, at participating ISPs or other network operators. Following years of research and development and a brief pilot, we established the world’s first production deployment of a Refraction Networking system. Our deployment uses a high-performance implementation of the TapDance protocol and is enabled as a transport in the popular circumvention app Psiphon. It uses TapDance stations at four physical uplink locations of a mid-sized ISP, Merit Network, with an aggregate bandwidth of 140 Gbps. By the end of 2019, our system was enabled as a transport option in 559,000 installations of Psiphon, and it served upwards of 33,000 unique users per month. This paper reports on our experience building the deployment and operating it for the first year. We describe how we overcame engineering challenges, present detailed performance metrics, and analyze how our system has responded to dynamic censor behavior. Finally, we review lessons learned from operating this unique artifact and discuss prospects for further scaling Refraction Networking to meet the needs of censored users.