Title: Towards Automated Detection of Risky Images Shared by Youth on Social Media
With the growing ubiquity of the Internet and access to media-based social media platforms, the risks associated with sharing media content on social media, and the need for safeguards against those risks, have become paramount. At the same time, risk is highly contextualized, especially when it comes to media content youth share privately on social media. In this work, we conducted qualitative content analyses on risky media content flagged by youth participants and research assistants of similar ages to explore contextual dimensions of youth online risks. The contextual risk dimensions were then used to inform state-of-the-art semi- and self-supervised vision transformers to automate the process of identifying risky images shared by youth. We found that vision transformers are capable of learning complex image features for use in automated risk detection and classification. The results of our study serve as a foundation for designing contextualized and youth-centered machine-learning methods for automated online risk detection.
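The abstract above centers on vision transformers classifying risky images. As a rough, self-contained illustration of the underlying mechanism (not the authors' model or training setup), the NumPy sketch below shows how a ViT splits an image into flattened patches, prepends a [CLS] token, applies one self-attention block, and reads a binary risk logit off the [CLS] position. Every dimension and parameter name here is invented for the example.

```python
import numpy as np

def image_to_patches(img, patch=8):
    """Split an HxWxC image into flattened, non-overlapping patch vectors."""
    H, W, C = img.shape
    rows, cols = H // patch, W // patch
    return (
        img[: rows * patch, : cols * patch]
        .reshape(rows, patch, cols, patch, C)
        .transpose(0, 2, 1, 3, 4)               # group each patch's pixels together
        .reshape(rows * cols, patch * patch * C)
    )

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def vit_forward(img, params, patch=8):
    """Patch embedding + one self-attention block + linear head (binary risk logit)."""
    x = image_to_patches(img, patch) @ params["W_embed"]   # (num_patches, d)
    x = np.concatenate([params["cls_token"], x], axis=0)   # prepend [CLS] token
    x = x + params["pos"][: x.shape[0]]                    # position embeddings
    q, k, v = x @ params["W_q"], x @ params["W_k"], x @ params["W_v"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))         # patch-to-patch attention
    x = x + attn @ v                                       # residual connection
    return float(x[0] @ params["W_head"])                  # classify from [CLS]
```

A practical system would stack many such blocks, learn the parameters via semi- or self-supervised pre-training plus fine-tuning on labeled risk flags, and threshold the logit to flag an image as risky.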
Award ID(s): 1827700, 2333207
PAR ID: 10420122
Journal Name: WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023
Page Range / eLocation ID: 1348 to 1357
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
1. We collected Instagram data from 150 adolescents (ages 13-21) that included 15,547 private message conversations, of which 326 conversations were flagged as sexually risky by participants. Based on this data, we leveraged a human-centered machine learning approach to create sexual risk detection classifiers for youth social media conversations. Our Convolutional Neural Network (CNN) and Random Forest models outperformed other approaches in identifying sexual risks at the conversation level (AUC=0.88), and the CNN performed best at the message level (AUC=0.85). We also trained classifiers to detect the risk severity level (i.e., safe, low, medium-high) of a given message, with the CNN outperforming other models (AUC=0.88). A feature analysis yielded deeper insights into patterns found within sexually safe versus unsafe conversations. We found that contextual features (e.g., age, gender, and relationship type) and Linguistic Inquiry and Word Count (LIWC) features contributed the most to accurately detecting sexual conversations that made youth feel uncomfortable or unsafe. Our analysis provides insights into the important factors and contextual features that enhance automated detection of sexual risks within youths' private conversations. As such, we make valuable contributions to the computational risk detection and adolescent online safety literature through our human-centered approach of collecting and ground-truth coding private social media conversations of youth for the purpose of risk classification.
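Abstract 1 reports that contextual features and LIWC word categories were the strongest predictors of unsafe conversations. A minimal sketch of how such a per-conversation feature vector might be assembled is shown below; the mini-lexicons stand in for LIWC categories (LIWC itself is a proprietary dictionary), and all field names are hypothetical rather than taken from the paper.

```python
# Hypothetical mini-lexicons standing in for LIWC categories.
CATEGORIES = {
    "negemo": {"hate", "scared", "awful"},
    "social": {"friend", "we", "talk"},
}

def conversation_features(messages, age, gender, relationship):
    """Build one feature vector per conversation: contextual + LIWC-style rates."""
    words = [w.strip(".,!?").lower() for m in messages for w in m.split()]
    total = max(len(words), 1)
    feats = {
        "age": age,                                          # contextual features
        "gender_female": 1 if gender == "female" else 0,
        "rel_stranger": 1 if relationship == "stranger" else 0,
        "msg_count": len(messages),
    }
    for cat, lexicon in CATEGORIES.items():                  # LIWC-style rates
        feats[f"liwc_{cat}"] = sum(w in lexicon for w in words) / total
    return feats
```

Vectors like this would then be fed to a classifier such as the Random Forest or CNN models the abstract describes.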
2. Instagram, one of the most popular social media platforms among youth, has recently come under scrutiny for potentially being harmful to the safety and well-being of our younger generations. Automated approaches for risk detection may be one way to help mitigate some of these risks if such algorithms are both accurate and contextual to the types of online harms youth face on social media platforms. However, the imminent switch by Instagram to end-to-end encryption for private conversations will limit the type of data that will be available to the platform to detect and mitigate such risks. In this paper, we investigate which indicators are most helpful in automatically detecting risk in Instagram private conversations, with an eye on high-level metadata, which will still be available under end-to-end encryption. Toward this end, we collected Instagram data from 172 youth (ages 13-21) and asked them to identify private message conversations that made them feel uncomfortable or unsafe. Our participants risk-flagged 28,725 conversations that contained 4,181,970 direct messages, including textual posts and images. Based on this rich and multimodal dataset, we tested multiple feature sets (metadata, linguistic cues, and image features) and trained classifiers to detect risky conversations. Overall, we found that the metadata features (e.g., conversation length, a proxy for participant engagement) were the best predictors of risky conversations. However, for distinguishing between risk types, the different linguistic and media cues were the best predictors. Based on our findings, we provide design implications for AI risk detection systems in the presence of end-to-end encryption. More broadly, our work contributes to the literature on adolescent online safety by moving toward more robust solutions for risk detection that directly take into account the lived risk experiences of youth.
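Abstract 2's key finding is that metadata such as conversation length remains predictive even when message content is unavailable under end-to-end encryption. The sketch below illustrates what such metadata-only features could look like, computed from timestamps and sender IDs without ever inspecting message bodies; the specific fields are illustrative assumptions, not the paper's feature set.

```python
from datetime import datetime, timedelta
from statistics import mean

def metadata_features(messages):
    """Metadata-only conversation features that survive end-to-end encryption.

    `messages` is a list of (timestamp, sender_id) pairs; message bodies are
    never inspected. Field names are illustrative, not from the paper.
    """
    timestamps = [t for t, _ in messages]
    senders = [s for _, s in messages]
    gaps = [(b - a).total_seconds() for a, b in zip(timestamps, timestamps[1:])]
    return {
        "conversation_length": len(messages),    # proxy for participant engagement
        "unique_participants": len(set(senders)),
        "duration_seconds": (timestamps[-1] - timestamps[0]).total_seconds(),
        "mean_reply_gap": mean(gaps) if gaps else 0.0,
    }
```

Because none of these features require plaintext, a platform could still compute them client- or server-side after the switch to end-to-end encryption.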
3. Computational approaches to detecting the online risks that youth encounter have shown promise for protecting them online. However, a major trend identified among these approaches is the lack of a human-centered machine learning (HCML) perspective. It is necessary to move beyond the computational lens of the detection task to address the societal needs of such a vulnerable population. Therefore, in this dissertation I direct my attention to better understanding youths' risk experiences prior to enhancing the development of risk detection algorithms by 1) examining youths' (ages 13-17) public disclosures about sexual experiences and contextualizing these experiences based on the level of consent (i.e., consensual, non-consensual, sexual abuse) and relationship type (i.e., stranger, dating/friend, family); 2) moving beyond sexual experiences to examine a broader array of risks within the private conversations of youth (N = 173, ages 13-21), contextualizing the dynamics of youths' online and offline risks by linking self-reported risk experiences to their digital trace data; and 3) building real-time machine learning models for risk detection based on a contextualized framework. This dissertation provides a human-centered approach for improving automated real-time risk predictions derived from a contextualized understanding of the nuances of youths' risk experiences.
4. We conducted a study with 173 adolescents (ages 13-21), who self-reported their offline and online risk experiences and uploaded their Instagram data to our study website to flag private conversations as unsafe. Risk profiles were first created based on the survey data and then compared with the risk-flagged social media data. Five risk profiles emerged: Low Risks (51% of the participants), Medium Risks (29%), Increased Sexting (8%), Increased Self-Harm (8%), and High Risk Perpetration (4%). Overall, the profiles correlated well with the social media data, with the highest levels of risk occurring in the three smallest profiles. Youth who experienced increased sexting and self-harm frequently reported engaging in unsafe sexual conversations. Meanwhile, high risk perpetration was characterized by increased violence, threats, and sales/promotion of illegal activities. A key insight from our study was that offline risk behavior sometimes manifested differently in online contexts (e.g., offline self-harm manifesting as risky online sexual interactions). Our findings highlight the need for targeted risk prevention strategies for youth online safety.
5. Social service providers play a vital role in the developmental outcomes of underprivileged youth as they transition into adulthood. Educators, mental health professionals, juvenile justice officers, and child welfare caseworkers often have first-hand knowledge of the trials uniquely faced by these vulnerable youth and are charged with mitigating harmful risks, such as mental health challenges, child abuse, drug use, and sex trafficking. Yet, less is known about whether or how social service providers assess and mitigate the online risk experiences of youth under their care. Therefore, as part of the National Science Foundation (NSF) I-Corps program, we conducted interviews with 37 social service providers (SSPs) who work with underprivileged youth to determine what (if any) online risks are most concerning to them given their role in youth protection, how they assess or become aware of these online risk experiences, and whether they see value in the possibility of using artificial intelligence (AI) as a potential solution for online risk detection. Overall, online sexual risks (e.g., sexual grooming and abuse) and cyberbullying were the most salient concerns across all social service domains, especially when these experiences crossed the boundary between the digital and physical worlds. Yet, SSPs had to rely heavily on youth self-reports to know whether and when online risks occurred, which required building a trusting relationship with youth; otherwise, SSPs became aware only after a formal investigation had been launched. Therefore, most SSPs saw value in the potential of AI as an early detection system for monitoring youth, but they were concerned that such a solution might not be feasible due to a lack of resources to adequately respond to online incidents, limited access to the necessary digital trace data (e.g., social media) and its context, and concerns about violating the trust relationships they had built with youth.
Thus, such automated risk detection systems should be designed and deployed with caution, as their implementation could cause youth to mistrust adults, thereby limiting the receipt of necessary guidance and support. We add to the bodies of research on adolescent online safety and the benefits and challenges of leveraging algorithmic systems in the public sector. 