We collected Instagram data from 150 adolescents (ages 13-21) that included 15,547 private message conversations, of which 326 were flagged as sexually risky by participants. Based on these data, we leveraged a human-centered machine learning approach to create sexual risk detection classifiers for youth social media conversations. Our Convolutional Neural Network (CNN) and Random Forest models outperformed other approaches in identifying sexual risks at the conversation level (AUC=0.88), and the CNN performed best at the message level (AUC=0.85). We also trained classifiers to detect the risk severity level (i.e., safe, low, medium-high) of a given message, with the CNN outperforming the other models (AUC=0.88). A feature analysis yielded deeper insights into patterns found within sexually safe versus unsafe conversations. We found that contextual features (e.g., age, gender, and relationship type) and Linguistic Inquiry and Word Count (LIWC) features contributed the most to accurately detecting sexual conversations that made youth feel uncomfortable or unsafe. Our analysis provides insights into the important factors and contextual features that enhance automated detection of sexual risks within youths' private conversations. As such, we make valuable contributions to the computational risk detection and adolescent online safety literature through our human-centered approach of collecting and ground-truth coding private social media conversations of youth for the purpose of risk classification.
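The abstract above describes combining contextual features with LIWC word-category counts as classifier inputs. A minimal sketch of that featurization step is below; the mini-lexicons, field names (`age`, `known_offline`), and toy conversations are all invented for illustration and are not the real LIWC dictionaries or the authors' feature set.

```python
from collections import Counter

# Hypothetical LIWC-style category lexicons (NOT the licensed LIWC dictionaries).
LIWC_LIKE = {
    "negemo": {"uncomfortable", "scared", "stop"},
    "social": {"friend", "meet", "talk"},
}

def featurize(conv):
    """Turn one conversation dict into a numeric feature vector combining
    contextual features with LIWC-style category counts."""
    words = conv["text"].lower().split()
    counts = Counter()
    for cat, lexicon in LIWC_LIKE.items():
        counts[cat] = sum(w in lexicon for w in words)
    return [
        conv["age"],                        # contextual: participant age
        1 if conv["known_offline"] else 0,  # contextual: relationship type
        counts["negemo"],                   # linguistic: negative-emotion words
        counts["social"],                   # linguistic: social words
    ]

# Two invented example conversations.
conv_safe = {"text": "wanna meet my friend and talk later",
             "age": 16, "known_offline": True}
conv_unsafe = {"text": "stop this makes me uncomfortable and scared",
               "age": 14, "known_offline": False}

safe_vec = featurize(conv_safe)      # [16, 1, 0, 3]
unsafe_vec = featurize(conv_unsafe)  # [14, 0, 3, 0]
```

Vectors like these could then be fed to any off-the-shelf classifier (e.g., a Random Forest) for conversation- or message-level risk prediction.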
This content will become publicly available on June 7, 2026
Timeliness Matters: Leveraging Reinforcement Learning on Social Media Data to Prioritize High-Risk Conversations for Promoting Youth Online Safety
Ensuring the online safety of youth has motivated research towards the development of machine learning (ML) methods capable of accurately detecting social media risks after the fact. However, for these detection models to be effective, they must proactively identify high-risk scenarios (e.g., sexual solicitations, cyberbullying) to mitigate harm. This "real-time" responsiveness is a recognized challenge within the risk detection literature. Therefore, this paper presents a novel two-level framework that first uses reinforcement learning to identify conversation stop points and prioritize messages for evaluation, and then optimizes state-of-the-art deep learning models to accurately categorize risk priority (low, high). We apply this framework to a time-based simulation using a rich dataset of 23K private conversations with over 7 million messages donated by 194 youth (ages 13-21). In an experiment comparing our new approach to a traditional conversation-level baseline, we found that the timeliness of detection improved significantly, from over 2 hours to approximately 16 minutes, with only a slight reduction in accuracy (0.88 to 0.84). This study advances real-time detection approaches for social media data and provides a benchmark for future reinforcement learning training that prioritizes the timeliness of classifying high-risk conversations.
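The timeliness gain described above comes from evaluating a conversation at a stop point rather than waiting for it to end. The toy simulation below illustrates that idea only; the timestamps, the pause-based stop-point rule, and the threshold are invented stand-ins, not the paper's learned reinforcement-learning policy.

```python
def flag_latency(timestamps, risky_idx, pause_threshold=None):
    """Seconds between a risky message and the moment it is evaluated.
    With pause_threshold set, evaluate at the first inter-message gap
    >= threshold after the risky message (a crude 'stop point');
    otherwise evaluate only when the conversation ends (baseline)."""
    t_risk = timestamps[risky_idx]
    if pause_threshold is None:
        return timestamps[-1] - t_risk
    for i in range(risky_idx, len(timestamps) - 1):
        if timestamps[i + 1] - timestamps[i] >= pause_threshold:
            return timestamps[i] - t_risk
    return timestamps[-1] - t_risk

# One invented conversation: message times in seconds; message index 2 is risky.
ts = [0, 30, 60, 90, 4000, 4030, 12000]
baseline = flag_latency(ts, risky_idx=2)                          # 11940 s
stop_point = flag_latency(ts, risky_idx=2, pause_threshold=600)   # 30 s
```

Even this crude pause heuristic shows why mid-conversation evaluation can shrink latency by orders of magnitude; the paper's contribution is learning *where* those evaluation points should be.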
- Award ID(s):
- 2329976
- PAR ID:
- 10634955
- Publisher / Repository:
- AAAI 2025
- Date Published:
- Journal Name:
- Proceedings of the International AAAI Conference on Web and Social Media
- Volume:
- 19
- ISSN:
- 2162-3449
- Page Range / eLocation ID:
- 37 to 51
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Instagram, one of the most popular social media platforms among youth, has recently come under scrutiny for potentially being harmful to the safety and well-being of our younger generations. Automated approaches for risk detection may be one way to help mitigate some of these risks if such algorithms are both accurate and contextual to the types of online harms youth face on social media platforms. However, the imminent switch by Instagram to end-to-end encryption for private conversations will limit the type of data that will be available to the platform to detect and mitigate such risks. In this paper, we investigate which indicators are most helpful in automatically detecting risk in Instagram private conversations, with an eye on high-level metadata, which will still be available under end-to-end encryption. Toward this end, we collected Instagram data from 172 youth (ages 13-21) and asked them to identify private message conversations that made them feel uncomfortable or unsafe. Our participants risk-flagged 28,725 conversations that contained 4,181,970 direct messages, including textual posts and images. Based on this rich and multimodal dataset, we tested multiple feature sets (metadata, linguistic cues, and image features) and trained classifiers to detect risky conversations. Overall, we found that the metadata features (e.g., conversation length, a proxy for participant engagement) were the best predictors of risky conversations. However, for distinguishing between risk types, the different linguistic and media cues were the best predictors. Based on our findings, we provide design implications for AI risk detection systems in the presence of end-to-end encryption. More broadly, our work contributes to the literature on adolescent online safety by moving toward more robust solutions for risk detection that directly take into account the lived risk experiences of youth.
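A metadata-only detector of the kind this abstract motivates can be sketched with a single feature, message count per conversation, and a threshold search. The data, the direction of the effect, and the threshold sweep below are all invented for illustration; the study's actual models used richer metadata and trained classifiers.

```python
# Invented toy data: message count per conversation plus its risk flag.
convs = [
    {"n_messages": 12,  "risky": False},
    {"n_messages": 40,  "risky": False},
    {"n_messages": 55,  "risky": False},
    {"n_messages": 300, "risky": True},
    {"n_messages": 450, "risky": True},
    {"n_messages": 500, "risky": True},
]

def best_length_threshold(convs):
    """Sweep candidate message-count cutoffs and return the one whose
    'n_messages >= cutoff' rule best matches the risk flags."""
    best_t, best_acc = None, -1.0
    for t in sorted({c["n_messages"] for c in convs}):
        acc = sum((c["n_messages"] >= t) == c["risky"] for c in convs) / len(convs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

threshold, accuracy = best_length_threshold(convs)  # (300, 1.0) on this toy data
```

The point of such a sketch is that everything it consumes (message counts, timing) survives end-to-end encryption, unlike the message text itself.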
-
Although youth increasingly communicate with peers online, we know little about how private online channels play a role in providing a supportive environment for youth. To fill this gap, we asked youth to donate their Instagram Direct Messages and filtered them by the phrase “help me.” From this query, we analyzed 82 conversations comprising 336,760 messages that 42 participants donated. These threads often began as casual conversations among friends or lovers they met offline or online. The conversations evolved from sharing negative experiences about everyday stress (e.g., school, dating) to severe mental health disclosures (e.g., suicide). Disclosures were usually reciprocated with relatable experiences and positive peer support. We also discovered unsupport as a theme, where conversation members denied giving support, a unique finding in the online social support literature. We discuss the role of social media-based private channels and their implications for design in supporting youth’s mental health. Content Warning: This paper includes sensitive topics, including self-harm and suicide ideation. Reader discretion is advised.
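The filtering step this abstract describes, selecting donated conversations by the phrase “help me,” amounts to a simple case-insensitive substring query. The sketch below shows that step on made-up messages; it is a simplification, not the study's actual data pipeline.

```python
def filter_by_phrase(messages, phrase="help me"):
    """Return messages containing the phrase, ignoring case: a minimal
    stand-in for the keyword query used to select conversations."""
    p = phrase.lower()
    return [m for m in messages if p in m.lower()]

# Invented example messages.
dms = [
    "Can you help me with this homework?",
    "lol see you tomorrow",
    "please HELP ME, I'm really stressed",
]
hits = filter_by_phrase(dms)  # matches the first and third messages
```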
-
Accurate real-time risk identification is vital to protecting social media users from online harm, which has driven research towards advancements in machine learning (ML). While strides have been made regarding the computational facets of algorithms for “real-time” risk detection, such research has not yet evaluated these advancements through a human-centered lens. To this end, we conducted a systematic literature review of 53 peer-reviewed articles on real-time risk detection on social media. Real-time detection was mainly operationalized as “early” detection after the fact based on pre-defined chunks of data and evaluated using standard performance metrics, such as timeliness. We identified several human-centered opportunities for advancing current algorithms, such as integrating human insight into feature selection, improving algorithms by accounting for human behavior, and incorporating human evaluations. This work serves as a critical call-to-action for the HCI and ML communities to work together to protect social media users before, during, and after exposure to risks.
-
We collected Instagram Direct Messages (DMs) from 100 adolescents and young adults (ages 13-21) who then flagged their own conversations as safe or unsafe. We performed a mixed-method analysis of the media files shared privately in these conversations to gain human-centered insights into the risky interactions experienced by youth. Unsafe conversations ranged from unwanted sexual solicitations to mental health related concerns, and images shared in unsafe conversations tended to be of people and convey negative emotions, while those shared in regular conversations more often conveyed positive emotions and contained objects. Further, unsafe conversations were significantly shorter, suggesting that youth disengaged when they felt unsafe. Our work uncovers salient characteristics of safe and unsafe media shared in private conversations and provides the foundation to develop automated systems for online risk detection and mitigation.
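The length comparison underlying this abstract's disengagement finding can be illustrated with a simple descriptive statistic. The message counts below are invented; the real study used a much larger sample and significance testing rather than a bare comparison of means.

```python
from statistics import mean

# Invented message counts per conversation, shaped to mirror the reported
# pattern that unsafe conversations were notably shorter than safe ones.
safe_lengths = [120, 300, 250, 180]
unsafe_lengths = [15, 40, 22, 30]

safe_avg = mean(safe_lengths)      # 212.5 messages on average
unsafe_avg = mean(unsafe_lengths)  # 26.75 messages on average
```

A gap like this is what motivates the interpretation that youth disengage from conversations that make them feel unsafe.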
