Introduction: Because developing integrated computer science (CS) curriculum is a resource-intensive process, there is interest in leveraging the capabilities of AI tools, including large language models (LLMs), to streamline this task. However, given the novelty of LLMs, little is known about their ability to generate appropriate curriculum content. Research Question: How do current LLMs perform on the task of creating appropriate learning activities for integrated computer science education? Methods: We tested two LLMs (Claude 3.5 Sonnet and ChatGPT 4-o) by providing them with a subset of national learning standards for both CS and language arts and asking them to generate a high-level description of learning activities that met standards for both disciplines. Four humans rated the LLM output – using an aggregate rating approach – in terms of (1) whether it met the CS learning standard, (2) whether it met the language arts learning standard, (3) whether it was equitable, and (4) its overall quality. Results: For Claude AI, 52% of the activities met language arts standards, 64% met CS standards, and the average quality rating was middling. For ChatGPT, 75% of the activities met language arts standards, 63% met CS standards, and the average quality rating was low. Virtually all activities from both LLMs were rated as neither actively promoting nor inhibiting equitable instruction. Discussion: Our results suggest that LLMs are not (yet) able to create appropriate learning activities from learning standards. The activities were generally not usable by classroom teachers without further elaboration and/or modification. There were also grammatical errors in the output, something not common with LLM-produced text. Further, standards in one or both disciplines were often not addressed, and the quality of the activities was often low. We conclude with recommendations for the use of LLMs in curriculum development in light of these findings. 
                    
                            
                            Opening a conversation on responsible environmental data science in the age of large language models
                        
                    
    
            Abstract The general public and scientific community alike are abuzz over the release of ChatGPT and GPT-4. Among many concerns being raised about the emergence and widespread use of tools based on large language models (LLMs) is the potential for them to propagate biases and inequities. We hope to open a conversation within the environmental data science community to encourage the circumspect and responsible use of LLMs. Here, we pose a series of questions aimed at fostering discussion and initiating a larger dialogue. To improve literacy on these tools, we provide background information on the LLMs that underpin tools like ChatGPT. We identify key areas in research and teaching in environmental data science where these tools may be applied, and discuss limitations to their use and points of concern. We also discuss ethical considerations surrounding the use of LLMs to ensure that as environmental data scientists, researchers, and instructors, we can make well-considered and informed choices about engagement with these tools. Our goal is to spark forward-looking discussion and research on how as a community we can responsibly integrate generative AI technologies into our work. 
        
    
- Award ID(s): 2125921
- PAR ID: 10548869
- Publisher / Repository: Cambridge University Press (CUP)
- Date Published:
- Journal Name: Environmental Data Science
- Volume: 3
- ISSN: 2634-4602
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            Background and Context. This innovative practice full paper describes the development and implementation of a professional development (PD) opportunity for secondary teachers to learn about ChatGPT. Incorporating generative AI techniques from Large Language Models (LLMs) such as ChatGPT into educational environments offers unprecedented opportunities and challenges. Prior research has highlighted their potential to personalize feedback, assist in lesson planning, generate educational content, and reduce teachers' workload, alongside concerns such as academic integrity and student privacy. However, the rapid adoption of LLMs since ChatGPT's public release in late 2022 has left educators, particularly at the secondary level, with a lack of clear guidance on how LLMs work and can be effectively adopted. Objective. This study aims to introduce a comprehensive, free, and vetted ChatGPT course tailored for secondary teachers, with the objective of enhancing their technological competencies in LLMs and fostering innovative teaching practices. Method. We developed a five-session interactive course on ChatGPT capabilities, limitations, prompt-engineering techniques, ethical considerations, and strategies for incorporating ChatGPT into teaching. We introduced the course to six middle and high school teachers. Our curriculum emphasized active learning through peer discussions, hands-on activities, and project-based learning. We conducted pre- and post-course focus groups to determine the effectiveness of the course and the extent to which teachers' attitudes toward the use of LLMs in schools had changed. To identify trends in knowledge and attitudes, we asked teachers to complete feedback forms at the end of each of the five sessions. We performed a thematic analysis to classify teacher quotes from focus groups' transcripts as positive, negative, and neutral and calculated the ratio of positive to negative comments in the pre- and post-focus groups. 
We also analyzed their feedback on each individual session. Finally, we interviewed all participants five months after course completion to understand the longer-term impacts of the course. Findings. Our participants unanimously shared that all five of the sessions provided a deeper understanding of ChatGPT, featured enough opportunities for hands-on practice, and achieved their learning objectives. Our thematic analysis underlined that teachers gained a more positive and nuanced understanding of ChatGPT after the course. This change is evidenced quantitatively by the fact that quotes with positive connotations rose from 45% to 68% of the total number of positive and negative quotes. Participants shared that in the longer term, the course improved their professional development, understanding of ChatGPT, and teaching practices. Implications. This research underscores the effectiveness of active learning in professional development settings, particularly for technological innovations in computing like LLMs. Our findings suggest that introducing teachers to LLM tools through active learning can improve their work processes and give them a thorough and accurate understanding of how these tools work. By detailing our process and providing a model for similar initiatives, our work contributes to the broader discourse on teaching professional educators about computing and integrating emerging technologies in educational and professional development settings.
- 
            Introduction: Recent AI advances, particularly the introduction of large language models (LLMs), have expanded the capacity to automate various tasks, including the analysis of text. This capability may be especially helpful in education research, where lack of resources often hampers the ability to perform various kinds of analyses, particularly those requiring a high level of expertise in a domain and/or a large set of textual data. For instance, we recently coded approximately 10,000 state K-12 computer science standards, requiring over 200 hours of work by subject matter experts. If LLMs are capable of completing a task such as this, the savings in human resources would be immense. Research Questions: This study explores two research questions: (1) How do LLMs compare to humans in the performance of an education research task? and (2) What do errors in LLM performance on this task suggest about current LLM capabilities and limitations? Methodology: We used a random sample of state K-12 computer science standards. We compared the output of three LLMs – ChatGPT, Llama, and Claude – to the work of human subject matter experts in coding the relationship between each state standard and a set of national K-12 standards. Specifically, the LLMs and the humans determined whether each state standard was identical to, similar to, based on, or different from the national standards and (if it was not different) which national standard it resembled. Results: Each of the LLMs identified a different national standard than the subject matter expert in about half of instances. When the LLM identified the same standard, it usually categorized the type of relationship (i.e., identical to, similar to, based on) in the same way as the human expert. However, the LLMs sometimes misidentified ‘identical’ standards. Discussion: Our results suggest that LLMs are not currently capable of matching human performance on the task of classifying learning standards. 
The mis-identification of some state standards as identical to national standards – when they clearly were not – is an interesting error, given that traditional computing technologies can easily identify identical text. Similarly, some of the mismatches between the LLM and human performance indicate clear errors on the part of the LLMs. However, some of the mismatches are difficult to assess, given the ambiguity inherent in this task and the potential for human error. We conclude the paper with recommendations for the use of LLMs in education research based on these findings.
- 
            Great Power Brings Great Responsibility: Personalizing Conversational AI for Diverse Problem-Solvers
            Newcomers onboarding to Open Source Software (OSS) projects face many challenges. Large Language Models (LLMs), like ChatGPT, have emerged as potential resources for answering questions and providing guidance, with many developers now turning to ChatGPT over traditional Q&A sites like Stack Overflow. Nonetheless, LLMs may carry biases in presenting information, which can be especially impactful for newcomers whose problem-solving styles may not be broadly represented. This raises important questions about the accessibility of AI-driven support for newcomers to OSS projects. This vision paper outlines the potential of adapting AI responses to various problem-solving styles to avoid privileging a particular subgroup. We discuss the potential of AI persona-based prompt engineering as a strategy for interacting with AI. This study invites further research to refine AI-based tools to better support contributions to OSS projects.
- 
            Generative AI tools, particularly those utilizing large language models (LLMs), are increasingly used in everyday contexts. While these tools enhance productivity and accessibility, little is known about how Deaf and Hard of Hearing (DHH) individuals engage with them or the challenges they face when using them. This paper presents a mixed-method study exploring how the DHH community uses Text AI tools like ChatGPT to reduce communication barriers and enhance information access. We surveyed 80 DHH participants and conducted interviews with 9 participants. Our findings reveal important benefits, such as eased communication and bridging Deaf and hearing cultures, alongside challenges like lack of American Sign Language (ASL) support and Deaf cultural understanding. We highlight unique usage patterns, propose inclusive design recommendations, and outline future research directions to improve Text AI accessibility for the DHH community.
 An official website of the United States government