Demeniconi, Carlotta; Chawla, Nitesh V. (Eds.)
The motives and means of explicit state censorship have been well studied, both quantitatively and qualitatively. Self-censorship by media outlets, however, has received far less attention, mostly because it is difficult to detect systematically. We develop a novel approach that identifies news media self-censorship by using social media as a sensor. It combines a hypothesis testing framework for identifying and evaluating censored clusters of keywords with a near-linear-time algorithm, GraphDPD, that finds the highest-scoring clusters as indicators of censorship. We evaluate the accuracy of our framework against other state-of-the-art algorithms, using both semi-synthetic and real-world data from Mexico and Venezuela during 2014. These tests demonstrate the capacity of our framework to identify self-censorship and to provide an indicator of broader media freedom. The results of this study lay the foundation for the detection and study of, and policy response to, self-censorship.