Title: Generative AI models should include detection mechanisms as a condition for public release
Abstract: The new wave of ‘foundation models’—general-purpose generative AI models, for production of text (e.g., ChatGPT) or images (e.g., MidJourney)—represents a dramatic advance in the state of the art for AI. But their use also introduces a range of new risks, which has prompted an ongoing conversation about possible regulatory mechanisms. Here we propose a specific principle that should be incorporated into legislation: that any organization developing a foundation model intended for public use must demonstrate a reliable detection mechanism for the content it generates, as a condition of its public release. The detection mechanism should be made publicly available in a tool that allows users to query, for an arbitrary item of content, whether the item was generated (wholly or partly) by the model. In this paper, we argue that this requirement is technically feasible and would play an important role in reducing certain risks from new AI models in many domains. We also outline a number of options for the tool’s design, and summarize a number of points where further input from policymakers and researchers would be required.
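One way to make the proposed query interface concrete is a provider-side registry that records a fingerprint of every output the model generates and answers public membership queries. This is a minimal sketch of one possible design; the class and method names are hypothetical, and hash lookup is only one of several mechanisms (e.g., watermarking) a provider could use:

```python
import hashlib


class GenerationRegistry:
    """Illustrative sketch of a provider-side detection tool: store a
    hash of every generated output and answer membership queries.
    All names here are invented for illustration."""

    def __init__(self):
        self._fingerprints = set()

    def _fingerprint(self, content: str) -> str:
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def record(self, content: str) -> None:
        # Called by the provider each time the model emits content.
        self._fingerprints.add(self._fingerprint(content))

    def was_generated(self, content: str) -> bool:
        # Public query: was this exact content produced by the model?
        return self._fingerprint(content) in self._fingerprints
```

Exact-match hashing is brittle (it fails on edited or partial content), so a deployed tool would need robust fingerprints or watermark detection; the sketch only illustrates the shape of the query interface the abstract calls for.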
Award ID(s):
2229876
PAR ID:
10524776
Author(s) / Creator(s):
Publisher / Repository:
Ethics and Information Technology
Date Published:
Journal Name:
Ethics and Information Technology
Volume:
25
Issue:
4
ISSN:
1388-1957
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Social service providers play a vital role in the developmental outcomes of underprivileged youth as they transition into adulthood. Educators, mental health professionals, juvenile justice officers, and child welfare caseworkers often have first-hand knowledge of the trials uniquely faced by these vulnerable youth and are charged with mitigating harmful risks, such as mental health challenges, child abuse, drug use, and sex trafficking. Yet, less is known about whether or how social service providers assess and mitigate the online risk experiences of youth under their care. Therefore, as part of the National Science Foundation (NSF) I-Corps program, we conducted interviews with 37 social service providers (SSPs) who work with underprivileged youth to determine what (if any) online risks are most concerning to them given their role in youth protection, how they assess or become aware of these online risk experiences, and whether they see value in the possibility of using artificial intelligence (AI) as a potential solution for online risk detection. Overall, online sexual risks (e.g., sexual grooming and abuse) and cyberbullying were the most salient concern across all social service domains, especially when these experiences crossed the boundary between the digital and the physical worlds. Yet, SSPs had to rely heavily on youth self-reports to know whether and when online risks occurred, which required building a trusting relationship with youth; otherwise, SSPs became aware only after a formal investigation had been launched. Therefore, most SSPs found value in the potential for using AI as an early detection system and to monitor youth, but they were concerned that such a solution would not be feasible due to a lack of resources to adequately respond to online incidences, access to the necessary digital trace data (e.g., social media), context, and concerns about violating the trust relationships they built with youth. 
Thus, such automated risk detection systems should be designed and deployed with caution, as their implementation could cause youth to mistrust adults, thereby limiting the receipt of necessary guidance and support. We add to the bodies of research on adolescent online safety and the benefits and challenges of leveraging algorithmic systems in the public sector. 
  2. Abstract To date, many AI initiatives (eg, AI4K12, CS for All) developed standards and frameworks as guidance for educators to create accessible and engaging Artificial Intelligence (AI) learning experiences for K‐12 students. These efforts revealed a significant need to prepare youth to gain a fundamental understanding of how intelligence is created, applied, and its potential to perpetuate bias and unfairness. This study contributes to the growing interest in K‐12 AI education by examining student learning of modelling real‐world text data. Four students from an Advanced Placement computer science classroom at a public high school participated in this study. Our qualitative analysis reveals that the students developed nuanced and in‐depth understandings of how text classification models—a type of AI application—are trained. Specifically, we found that in modelling texts, students: (1) drew on their social experiences and cultural knowledge to create predictive features, (2) engineered predictive features to address model errors, (3) described model learning patterns from training data and (4) reasoned about noisy features when comparing models. This study contributes to an initial understanding of student learning of modelling unstructured data and offers implications for scaffolding in‐depth reasoning about model decision making. 
Practitioner notes

What is already known about this topic
- Scholarly attention has turned to examining Artificial Intelligence (AI) literacy in K‐12 to help students understand the working mechanism of AI technologies and critically evaluate automated decisions made by computer models.
- While efforts have been made to engage students in understanding AI through building machine learning models with data, few of them go in‐depth into teaching and learning of feature engineering, a critical concept in modelling data.
- There is a need for research to examine students' data modelling processes, particularly in the little‐researched realm of unstructured data.

What this paper adds
- Results show that students developed nuanced understandings of models learning patterns in data for automated decision making.
- Results demonstrate that students drew on prior experience and knowledge in creating features from unstructured data in the learning task of building text classification models.
- Students needed support in performing feature engineering practices, reasoning about noisy features and exploring features in rich social contexts that the data set is situated in.

Implications for practice and/or policy
- It is important for schools to provide hands‐on model building experiences for students to understand and evaluate automated decisions from AI technologies.
- Students should be empowered to draw on their cultural and social backgrounds as they create models and evaluate data sources.
- To extend this work, educators should consider opportunities to integrate AI learning in other disciplinary subjects (ie, outside of computer science classes).
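The feature-engineering practice this study examines (students hand-crafting predictive features from raw text, then training a classifier on them) can be sketched in a few lines. The features and the toy spam/not-spam examples below are invented for illustration, and a simple perceptron stands in for whatever model the classroom used:

```python
def extract_features(text: str) -> dict:
    """Hand-engineered binary features of the kind students might design
    by drawing on their own social experience (features are invented)."""
    words = text.lower().split()
    return {
        "has_exclaim": int("!" in text),
        "mentions_free": int("free" in words),
        "mentions_win": int("win" in words or "won" in words),
    }


def predict(w: dict, b: float, text: str) -> int:
    """Score the text with the current weights; +1 = spam, -1 = not spam."""
    feats = extract_features(text)
    score = b + sum(w.get(k, 0.0) * v for k, v in feats.items())
    return 1 if score >= 0 else -1


def train_perceptron(examples, epochs: int = 20):
    """Tiny perceptron: nudge the weights on each misclassified example,
    so the model 'learns patterns' from the engineered features."""
    w, b = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:  # label is +1 or -1
            if predict(w, b, text) != label:
                for name, value in extract_features(text).items():
                    w[name] = w.get(name, 0.0) + label * value
                b += label
    return w, b
```

Running this on a handful of labelled sentences shows the mechanism students reasoned about: a feature like `mentions_free` acquires positive weight because it separates the classes, while an uninformative ("noisy") feature would stay near zero.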
  3. Responsible AI (RAI) is the science and practice of ensuring the design, development, use, and oversight of AI are socially sustainable: benefiting diverse stakeholders while controlling the risks. Achieving this goal requires active engagement and participation from the broader public. This paper introduces We are AI: Taking Control of Technology, a public education course that brings the topics of AI and RAI to the general audience in a peer-learning setting. We outline the goals behind the course's development, discuss the multi-year iterative process that shaped its creation, and summarize its content. We also discuss two offerings of We are AI to an active and engaged group of librarians and professional staff at New York University, highlighting successes and areas for improvement. The course materials, including a multilingual comic book series by the same name, are publicly available and can be used independently. By sharing our experience in creating and teaching We are AI, we aim to introduce these resources to the community of AI educators, researchers, and practitioners, supporting their public education efforts.
  4. Abstract: Recent advances in generative artificial intelligence (AI) and multimodal learning analytics (MMLA) have allowed for new and creative ways of leveraging AI to support K‐12 students' collaborative learning in STEM+C domains. To date, there is little evidence of AI methods supporting students' collaboration in complex, open‐ended environments. AI systems are known to underperform humans in (1) interpreting students' emotions in learning contexts, (2) grasping the nuances of social interactions and (3) understanding domain‐specific information that was not well‐represented in the training data. As such, combined human and AI (ie, hybrid) approaches are needed to overcome the current limitations of AI systems. In this paper, we take a first step towards investigating how a human‐AI collaboration between teachers and researchers using an AI‐generated multimodal timeline can guide and support teachers' feedback while addressing students' STEM+C difficulties as they work collaboratively to build computational models and solve problems. In doing so, we present a framework characterizing the human component of our human‐AI partnership as a collaboration between teachers and researchers. To evaluate our approach, we present our timeline to a high school teacher and discuss the key insights gleaned from our discussions. Our case study analysis reveals the effectiveness of an iterative approach to using human‐AI collaboration to address students' STEM+C challenges: the teacher can use the AI‐generated timeline to guide formative feedback for students, and the researchers can leverage the teacher's feedback to help improve the multimodal timeline. Additionally, we characterize our findings with respect to two events of interest to the teacher: (1) when the students cross a difficulty threshold, and (2) the point of intervention, that is, when the teacher (or system) should intervene to provide effective feedback. It is important to note that the teacher explained that there should be a lag between (1) and (2) to give students a chance to resolve their own difficulties. Typically, such a lag is not implemented in computer‐based learning environments that provide feedback.

Practitioner notes

What is already known about this topic
- Collaborative, open‐ended learning environments enhance students' STEM+C conceptual understanding and practice, but they introduce additional complexities when students learn concepts spanning multiple domains.
- Recent advances in generative AI and MMLA allow for integrating multiple datastreams to derive holistic views of students' states, which can support more informed feedback mechanisms to address students' difficulties in complex STEM+C environments.
- Hybrid human‐AI approaches can help address collaborating students' STEM+C difficulties by combining the domain knowledge, emotional intelligence and social awareness of human experts with the general knowledge and efficiency of AI.

What this paper adds
- We extend a previous human‐AI collaboration framework using a hybrid intelligence approach to characterize the human component of the partnership as a researcher‐teacher partnership and present our approach as a teacher‐researcher‐AI collaboration.
- We adapt an AI‐generated multimodal timeline to actualize our human‐AI collaboration by pairing the timeline with videos of students encountering difficulties, engaging in active discussions with a high school teacher while watching the videos to discern the timeline's utility in the classroom.
- From our discussions with the teacher, we define two types of inflection points to address students' STEM+C difficulties—the difficulty threshold and the intervention point—and discuss how the feedback latency interval separating them can inform educator interventions.
- We discuss two ways in which our teacher‐researcher‐AI collaboration can help teachers support students encountering STEM+C difficulties: (1) teachers using the multimodal timeline to guide feedback for students, and (2) researchers using teachers' input to iteratively refine the multimodal timeline.

Implications for practice and/or policy
- Our case study suggests that timeline gaps (ie, disengaged behaviour identified by off‐screen students, pauses in discourse and lulls in environment actions) are particularly important for identifying inflection points and formulating formative feedback.
- Human‐AI collaboration exists on a dynamic spectrum and requires varying degrees of human control and AI automation depending on the context of the learning task and students' work in the environment.
- Our analysis of this human‐AI collaboration using a multimodal timeline can be extended in the future to support students and teachers in additional ways, for example, designing pedagogical agents that interact directly with students, developing intervention and reflection tools for teachers, helping teachers craft daily lesson plans and aiding teachers and administrators in designing curricula.
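The two inflection points and the lag between them can be stated directly in code. This is a toy sketch; the per-step difficulty scores, the threshold, and the lag unit are hypothetical stand-ins for whatever the multimodal timeline actually measures:

```python
def plan_intervention(difficulty_scores, threshold, lag):
    """Return (difficulty_threshold_step, intervention_step) for a sequence
    of per-step difficulty scores, or None if the threshold is never crossed.
    The lag implements the teacher-suggested feedback latency interval:
    intervening only some steps after the difficulty threshold is crossed,
    giving students a chance to resolve difficulties on their own."""
    for step, score in enumerate(difficulty_scores):
        if score >= threshold:
            return step, step + lag
    return None
```

For example, with scores [0.1, 0.3, 0.8, 0.9], a threshold of 0.7 and a lag of 2 steps, the difficulty threshold is crossed at step 2 and the intervention point falls at step 4, matching the lag the teacher described.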
  5. Summary: The iconic, palmately compound leaves of Cannabis have attracted significant attention in the past. However, investigations into the genetic basis of leaf shape or its connections to phytochemical composition have yielded inconclusive results. This is partly due to prominent changes in leaflet number within a single plant during development, which has so far prevented the proper use of common morphometric techniques.

Here, we present a new method that overcomes the challenge of nonhomologous landmarks in palmate, pinnate, and lobed leaves, using Cannabis as an example. We model corresponding pseudo‐landmarks for each leaflet as angle‐radius coordinates and model them as a function of leaflet to create continuous polynomial models, bypassing the problems associated with variable number of leaflets between leaves.

We analyze 341 leaves from 24 individuals from nine Cannabis accessions. Using 3591 pseudo‐landmarks in modeled leaves, we accurately predict accession identity, leaflet number, and relative node number.

Intra‐leaf modeling offers a rapid, cost‐effective means of identifying Cannabis accessions, making it a valuable tool for future taxonomic studies, cultivar recognition, and possibly chemical content analysis and sex identification, in addition to permitting the morphometric analysis of leaves in any species with variable numbers of leaflets or lobes.
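The core modelling move described above (fitting, for each pseudo-landmark, a polynomial in normalized leaflet position, so leaves with different leaflet counts share one continuous model) can be sketched with NumPy. The data layout, polynomial degree, and function names here are assumptions for illustration, not the paper's exact pipeline:

```python
import numpy as np


def fit_intra_leaf_model(leaflets, degree=2):
    """Sketch: each leaflet is a list of (angle, radius) pseudo-landmarks,
    with the same number of landmarks per leaflet. Fit one polynomial per
    landmark, in normalized leaflet position on [0, 1], so leaves with
    different leaflet counts map onto a common continuous model."""
    positions = np.linspace(0.0, 1.0, len(leaflets))
    radii = np.array([[r for _, r in leaflet] for leaflet in leaflets])
    return [np.polyfit(positions, radii[:, j], degree)
            for j in range(radii.shape[1])]


def predicted_radius(model, position, landmark_index):
    """Evaluate the fitted model at any fractional leaflet position,
    bypassing the variable-leaflet-number problem."""
    return float(np.polyval(model[landmark_index], position))
```

Because the model is continuous in leaflet position, leaves with five, seven, or nine leaflets all yield comparable coefficient vectors, which is what makes downstream tasks like accession classification possible.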