-
This paper explores the use of large language models (LLMs) to score and explain short-answer assessments in K-12 science. While existing methods can score more structured math and computer science assessments, they often do not provide explanations for the scores. Our study focuses on employing GPT-4 for automated assessment in middle school Earth Science, combining few-shot and active learning with chain-of-thought reasoning. Using a human-in-the-loop approach, we successfully score and provide meaningful explanations for formative assessment responses. A systematic analysis of our method's pros and cons sheds light on the potential for human-in-the-loop techniques to enhance automated grading for open-ended science assessments.
-
Clarke-Midura, J; Kollar, I; Gu, X; D’Angelo, C (Ed.)
In collaborative problem-solving (CPS), students work together to solve problems using their collective knowledge and social interactions to understand the problem and progress towards a solution. This study focuses on how students engage in CPS while working in pairs in a STEM+C (Science, Technology, Engineering, Mathematics, and Computing) environment that involves open-ended computational modeling tasks. Specifically, we study how groups with different prior knowledge in physics and computing concepts differ in their information pooling and consensus-building behaviors. In addition, we examine how these differences impact the development of their shared understanding and learning. Our study consisted of a high school kinematics curriculum with 1D and 2D modeling tasks. Using an exploratory approach, we performed in-depth case studies to analyze the behaviors of groups with different prior knowledge distributions across these tasks. We identify effective information pooling and consensus-building behaviors in addition to difficulties students faced when developing a shared understanding of physics and computing concepts.
-
Computational models (CMs) offer pre-college students opportunities to integrate STEM disciplines with computational thinking in ways that reflect authentic STEM practice. However, not all STEM teachers and students are prepared to teach or learn programming skills required to construct CMs. To broaden participation in computing, we propose instructional approaches that integrate STEM with CMs without requiring students to program, thereby alleviating challenges associated with learning how to program.
-
Computational models (CMs) offer pre-college students opportunities to integrate STEM disciplines with computational thinking (CT) in ways that reflect authentic STEM practice. However, not all STEM teachers and students are prepared to teach or learn programming skills required to construct CMs. To help broaden participation in computing and reduce the potentially prohibitive demands of learning programming, we propose alternate versions of computational modeling that require low or no programming. These versions rely on code comprehension and evaluation of given code and simulations instead of code creation. We present results from a pilot study that explores student engagement with CT practices and student challenges in three types of computational modeling activities.
-
This research explores a novel human-in-the-loop approach that goes beyond traditional prompt engineering to harness Large Language Models (LLMs) with chain-of-thought prompting for grading middle school students’ short-answer formative assessments in science and generating useful feedback. While recent efforts have successfully applied LLMs and generative AI to automatically grade assignments in secondary classrooms, the focus has primarily been on providing scores for mathematical and programming problems, with little work targeting the generation of actionable insight from student responses. This paper addresses these limitations by exploring a human-in-the-loop approach that makes the process more intuitive and more effective. By incorporating the expertise of educators, this approach seeks to bridge the gap between automated assessment and meaningful educational support in the context of science education for middle school students. We have conducted a preliminary user study, which suggests that (1) co-created models improve the performance of formative feedback generation, and (2) educator insight can be integrated at multiple steps in the process to inform what goes into the model and what comes out. Our findings suggest that in-context learning and human-in-the-loop approaches may provide a scalable path to automated grading, where the performance of the automated LLM-based grader continually improves over time while also producing actionable feedback that can support students’ open-ended science learning.
-
Martin, Fred; Norouzi, Narges; Rosenthal, Stephanie (Ed.)
This paper examines the use of LLMs to support the grading and explanation of short-answer formative assessments in K-12 science topics. While significant work has been done on programmatically scoring well-structured student assessments in math and computer science, many of these approaches produce a numerical score and stop short of providing teachers and students with explanations for the assigned scores. In this paper, we investigate few-shot, in-context learning with chain-of-thought reasoning and active learning using GPT-4 for automated assessment of students’ answers in a middle school Earth Science curriculum. Our findings from this human-in-the-loop approach demonstrate success in scoring formative assessment responses and in providing meaningful explanations for the assigned scores. We then perform a systematic analysis of the advantages and limitations of our approach. This research provides insight into how we can use human-in-the-loop methods for the continual improvement of automated grading for open-ended science assessments.
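The few-shot, chain-of-thought prompting described in the abstracts above can be illustrated by a prompt-assembly step. This is a minimal sketch, not the authors' implementation: the rubric, example responses, and field names are hypothetical, and the assembled prompt would then be sent to GPT-4 (the API call is omitted).

```python
def build_cot_prompt(question, rubric, examples, new_response):
    """Assemble a few-shot, chain-of-thought grading prompt.

    Each worked example pairs a scored student response with the
    grader's step-by-step rationale, so the model is shown how to
    reason before assigning a score (hypothetical format).
    """
    parts = [f"Question: {question}", f"Rubric: {rubric}", ""]
    for ex in examples:
        parts += [
            f"Student response: {ex['response']}",
            f"Reasoning: {ex['rationale']}",
            f"Score: {ex['score']}",
            "",
        ]
    # The new response ends with an open "Reasoning:" cue so the
    # model explains its score before stating it.
    parts += [f"Student response: {new_response}", "Reasoning:"]
    return "\n".join(parts)

# Hypothetical rubric and worked example
examples = [{
    "response": "Rocks break down from wind and water.",
    "rationale": "Identifies weathering agents; no mention of transport.",
    "score": 2,
}]
prompt = build_cot_prompt(
    "How does weathering reshape Earth's surface?",
    "0-3 points: names agents, describes the process, links to landform change.",
    examples,
    "Wind wears rocks away over time.",
)
```

In a human-in-the-loop workflow, educators would review and revise the worked examples and rationales over time, which is what allows the grader's performance to improve.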
-
Grieff, S. (Ed.)
Recently there has been increased development of curriculum and tools that integrate computing (C) into Science, Technology, Engineering, and Math (STEM) learning environments. These environments serve as a catalyst for authentic collaborative problem-solving (CPS) and help students synergistically learn STEM+C content. In this work, we analyzed students’ collaborative problem-solving behaviors as they worked in pairs to construct computational models in kinematics. We leveraged social measures, such as equity and turn-taking, along with a domain-specific measure that quantifies the synergistic interleaving of science and computing concepts in the students’ dialogue, to gain a deeper understanding of the relationship between students’ collaborative behaviors and their ability to complete a STEM+C computational modeling task. Our results extend past findings identifying the importance of synergistic dialogue and suggest that while equitable discourse is important for overall task success, fluctuations in equity and turn-taking at the segment level may not have an impact on segment-level task performance. To better understand students’ segment-level behaviors, we identified and characterized groups’ planning, enacting, and reflection behaviors, along with the monitoring processes they employed to check their progress as they constructed their models. Leveraging Markov Chain (MC) analysis, we identified differences in high- and low-performing groups’ transitions between these phases of students’ activities. We then compared the synergistic, turn-taking, and equity measures for these groups for each one of the MC model states to gain a deeper understanding of how these collaboration behaviors relate to their computational modeling performance. We believe that characterizing differences in collaborative problem-solving behaviors allows us to gain a better understanding of the difficulties students face as they work on their computational modeling tasks.
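The Markov Chain analysis mentioned above can be sketched as estimating a first-order transition matrix from a group's sequence of coded activity phases. This is an illustrative reconstruction, not the paper's code: the phase labels and the toy coded sequence are hypothetical.

```python
from collections import defaultdict

# Hypothetical phase codes for segments of a group's activity
PHASES = ["plan", "enact", "reflect", "monitor"]

def transition_matrix(sequence):
    """Estimate first-order Markov transition probabilities from a
    coded sequence of activity phases.

    Returns a nested dict: matrix[p][q] is the estimated probability
    of moving from phase p to phase q in the next segment.
    """
    counts = {p: defaultdict(int) for p in PHASES}
    for cur, nxt in zip(sequence, sequence[1:]):
        counts[cur][nxt] += 1
    matrix = {}
    for p in PHASES:
        total = sum(counts[p].values())
        matrix[p] = {q: (counts[p][q] / total if total else 0.0)
                     for q in PHASES}
    return matrix

# Toy coded segments for one group (hypothetical data)
seq = ["plan", "enact", "monitor", "enact", "reflect", "plan", "enact"]
m = transition_matrix(seq)
```

Comparing such matrices across high- and low-performing groups is what reveals differences in how groups move between planning, enacting, reflecting, and monitoring.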