Cyberbullying is a growing problem across social media platforms, inflicting short and long-lasting effects on victims. To mitigate this problem, research has looked into building automated systems, powered by machine learning, to detect cyberbullying incidents, or the involved actors like victims and perpetrators. In the past, systematic reviews have examined the approaches within this growing body of work, but with a focus on the computational aspects of the technical innovation, feature engineering, or performance optimization, without centering around the roles, beliefs, desires, or expectations of humans. In this paper, we present a human-centered systematic literature review of the past 10 years of research on automated cyberbullying detection. We analyzed 56 papers based on a three-prong human-centeredness algorithm design framework - spanning theoretical, participatory, and speculative design. We found that the past literature fell short of incorporating human-centeredness across multiple aspects, ranging from defining cyberbullying, establishing the ground truth in data annotation, evaluating the performance of the detection models, to speculating the usage and users of the models, including potential harms and negative consequences. Given the sensitivities of the cyberbullying experience and the deep ramifications cyberbullying incidents bear on the involved actors, we discuss takeaways on how incorporating human-centeredness in future research can aid with developing detection systems that are more practical, useful, and tuned to the diverse needs and contexts of the stakeholders.
more »
« less
Harnessing the Power of Interdisciplinary Research with Psychology-Informed Cyberbullying Detection Models
Cyberbullying has become increasingly prevalent, particularly on social media. There has also been a steady rise in cyberbullying research across a range of disciplines. Much of the empirical work from computer science has focused on developing machine learning models for cyberbullying detection. Whereas machine learning cyberbullying detection models can be improved by drawing on psychological theories and perspectives, there is also tremendous potential for machine learning models to contribute to a better understanding of psychological aspects of cyberbullying. In this paper, we discuss how machine learning models can yield novel insights about the nature and defining characteristics of cyberbullying and how machine learning approaches can be applied to help clinicians, families, and communities reduce cyberbullying. Specifically, we discuss the potential for machine learning models to shed light on the repetitive nature of cyberbullying, the imbalance of power between cyberbullies and their victims, and causal mechanisms that give rise to cyberbullying. We orient our discussion on emerging and future research directions, as well as the practical implications of machine learning cyberbullying detection models.
more »
« less
- PAR ID:
- 10301305
- Publisher / Repository:
- Springer
- Date Published:
- Journal Name:
- International Journal of Bullying Prevention
- Volume:
- 4
- Issue:
- 1
- ISSN:
- 2523-3653
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
null (Ed.)Cyberbullying is rapidly becoming one of the most serious online risks for adolescents. This has motivated work on machine learning methods to automate the process of cyberbullying detection, which have so far mostly viewed cyberbullying as one-off incidents that occur at a single point in time. Comparatively less is known about how cyberbullying behavior occurs and evolves over time. This oversight highlights a crucial open challenge for cyberbullying-related research, given that cyberbullying is typically defined as intentional acts of aggression via electronic communication that occur repeatedly and persistently . In this article, we center our discussion on the challenge of modeling temporal patterns of cyberbullying behavior. Specifically, we investigate how temporal information within a social media session, which has an inherently hierarchical structure (e.g., words form a comment and comments form a session), can be leveraged to facilitate cyberbullying detection. Recent findings from interdisciplinary research suggest that the temporal characteristics of bullying sessions differ from those of non-bullying sessions and that the temporal information from users’ comments can improve cyberbullying detection. The proposed framework consists of three distinctive features: (1) a hierarchical structure that reflects how a social media session is formed in a bottom-up manner; (2) attention mechanisms applied at the word- and comment-level to differentiate the contributions of words and comments to the representation of a social media session; and (3) the incorporation of temporal features in modeling cyberbullying behavior at the comment-level. Quantitative and qualitative evaluations are conducted on a real-world dataset collected from Instagram, the social networking site with the highest percentage of users reporting cyberbullying experiences. Results from empirical evaluations show the significance of the proposed methods, which are tailored to capture temporal patterns of cyberbullying detection.more » « less
-
As online communication continues to become more prevalent, instances of cyberbullying have also become more common, particularly on social media sites. Previous research in this area has studied cyberbullying outcomes, predictors of cyberbullying victimization/perpetration, and computational detection models that rely on labeled datasets to identify the underlying patterns. However, there is a dearth of work examining the content of what is said when cyberbullying occurs and most of the available datasets include only basic la-bels (cyberbullying or not). This paper presents an annotated Instagram dataset with detailed labels about key cyberbullying properties, such as the content type, purpose, directionality, and co-occurrence with other phenomena, as well as demographic information about the individuals who performed the annotations. Additionally, results of an exploratory logistic regression analysis are reported to illustrate how new insights about cyberbullying and its automatic detection can be gained from this labeled dataset.more » « less
-
Concurrent with the growth and widespread use of social networking platforms has been a rise in the prevalence of cyberbullying and cyberharassment, particularly among youth. Although cyberbullying is frequently defined as hostile communication or interactions that occur repetitively via electronic media, little is known about the temporal aspects of cyberbullying on social media, such as how the number, frequency, and timing of posts may vary systematically between cyberbullying and non-cyberbullying social media sessions. In this paper, we aim to contribute to the understanding of temporal properties of cyberbullying through the analysis of Instagram data. That is, the paper presents key temporal characteristics of cyberbullying and trends obtained from descriptive and burst analysis tasks. Our results have the potential to inform the development of more effective cyberbullying detection models.more » « less
-
Social media is a vital means for information-sharing due to its easy access, low cost, and fast dissemination characteristics. However, increases in social media usage have corresponded with a rise in the prevalence of cyberbullying. Most existing cyberbullying detection methods are supervised and, thus, have two key drawbacks: (1) The data labeling process is often labor-intensive and time-consuming; (2) Current labeling guidelines may not be generalized to future instances because of different language usage and evolving social networks. To address these limitations, this work introduces a principled approach for unsupervised cyberbullying detection. The proposed model consists of two main components: (1) A representation learning network that encodes the social media session by exploiting multi-modal features, e.g., text, network, and time. (2) A multi-task learning network that simultaneously fits the time intervals and estimates the bullying likelihood based on a Gaussian Mixture Model. The proposed model jointly optimizes the parameters of both components to overcome the shortcomings of decoupled training. Our core contribution is an unsupervised cyberbullying detection model that not only experimentally outperforms the state-of-the-art unsupervised models, but also achieves competitive performance compared to supervised models.more » « less
An official website of the United States government
