

This content will become publicly available on July 15, 2025

Title: How well can you articulate that idea? Insights from automated formative assessment
Automated methods are increasingly used to support formative feedback on students’ science explanation writing. Most of this work addresses students’ responses to short-answer questions. We investigate automated feedback on students’ science explanation essays, which discuss multiple ideas. Feedback is based on a rubric that identifies the main ideas students are prompted to include in explanatory essays about the physics of energy and mass. We have found that students’ revisions generally improve their essays. Here, we focus on two factors that affect the accuracy of the automated feedback. First, the learned representations of the six main ideas in the rubric differ in how distinct they are from each other, and therefore in how reliably automated methods can identify them in student essays. Second, a student’s statement sometimes lacks sufficient clarity for the automated tool to associate it more strongly with one of the main ideas above all others.
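As an illustration of the second factor, here is a minimal sketch (in Python, using cosine similarity over hypothetical sentence embeddings; the paper's actual representations and matching procedure are not specified here) of flagging a statement that is not associated with one main idea clearly above all others:

```python
import numpy as np

def best_main_idea(statement_vec, idea_vecs, margin=0.05):
    """Return the index of the most similar main idea, or None when the
    statement is not associated with one idea clearly above all the others.
    The embedding vectors and the margin value are illustrative assumptions."""
    sims = np.array([
        float(np.dot(statement_vec, v) /
              (np.linalg.norm(statement_vec) * np.linalg.norm(v)))
        for v in idea_vecs
    ])
    order = sims.argsort()[::-1]          # idea indices, most similar first
    top, runner_up = sims[order[0]], sims[order[1]]
    if top - runner_up < margin:          # ambiguous: statement lacks clarity
        return None
    return int(order[0])
```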
Award ID(s):
2010483
PAR ID:
10515202
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
International Conference on Artificial Intelligence in Education 2024
Date Published:
Journal Name:
In the International Conference on Artificial Intelligence in Education 2024 Proceedings
Subject(s) / Keyword(s):
Automated Essay Feedback · Student Writing Clarity
Format(s):
Medium: X
Location:
Recife, Brazil
Sponsoring Org:
National Science Foundation
More Like this
  1. Writing scientific explanations is a core practice in science. However, students find it difficult to write coherent scientific explanations, and teachers find it challenging to provide real-time feedback on students’ essays. In this study, we discuss how PyrEval, an NLP technology, was used to automatically assess students’ essays and provide feedback. We found that students explained more key ideas in their essays after the automated assessment and feedback. However, there were issues both with the automated assessments and with students’ understanding of the feedback and their revision of their essays.
  2. Hoadley, C.; Wang, X. C. (Eds.)
    Eighth-grade students received automated feedback from PyrEval, an NLP tool, about their science essays. We examined how essay quality changed when students revised. Regardless of prior physics knowledge, essay quality improved. Grounded in the literature on AI explainability and trust in automated feedback, we also examined which PyrEval explanations predicted change in essay quality. Essay quality improvement was predicted by high- and medium-accuracy feedback.
  3. The Next Generation Science Standards (NGSS) emphasize integrating three dimensions of science learning: disciplinary core ideas, cross-cutting concepts, and science and engineering practices. In this study, we develop formative assessments that measure student understanding of the integration of these three dimensions along with automated scoring methods that distinguish among them. The formative assessments allow students to express their emerging ideas while also capturing progress in integrating core ideas, cross-cutting concepts, and practices. We describe how item and rubric design can work in concert with an automated scoring system to independently score science explanations from multiple perspectives. We describe item design considerations and provide validity evidence for the automated scores. 
  4. With an increasing focus in STEM education on critical thinking skills, science writing plays an ever more important role. A recently published dataset of two sets of college-level lab reports from an inquiry-based physics curriculum relies on analytic assessment rubrics with multiple dimensions specifying subject matter knowledge and general components of good explanations. Each analytic dimension is assessed on a 6-point scale to provide detailed feedback that can help students improve their science writing skills. Manual assessment can be slow and difficult to calibrate for consistency across all students in large-enrollment courses with many sections. While much work exists on automated assessment of open-ended questions in STEM subjects, there has been far less work on long-form writing such as lab reports. We present VerAs, an end-to-end neural architecture with separate verifier and assessment modules, inspired by approaches to Open Domain Question Answering (OpenQA). VerAs first verifies whether a report contains any content relevant to a given rubric dimension and, if so, assesses the relevant sentences (see the sketch after this list). On the lab reports, VerAs outperforms multiple baselines based on OpenQA systems or Automated Essay Scoring (AES). VerAs also performs well on an analytic rubric for middle school physics essays.
    Science writing skills depend on a student’s ability to coordinate conceptual understanding of science with the ability to articulate ideas independently and to distinguish gradations of importance among ideas. Real-time scaffolding of student writing during and immediately after the writing process could ease the cognitive burden of learning to coordinate these skills and enhance student learning of science. This paper presents a design process for automated support of real-time scaffolding of middle school students’ science explanations. We describe our adaptation of an existing tool for automatic content assessment to align more closely with a rubric, and our reliance on data mining of historical examples of middle school science writing. On a reserved test set of semi-synthetic examples of science explanations, the modified tool showed high correlation with the manual rubric. We conclude that the tool can support a wide range of design options for customized student feedback in real time.
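The verify-then-assess pipeline described in item 4 can be summarized with a minimal sketch; the function names, interfaces, and threshold below are assumptions for illustration, not the VerAs implementation.

```python
from typing import Callable, List

def score_dimension(report_sentences: List[str],
                    is_relevant: Callable[[str], float],  # verifier: probability a sentence addresses the dimension (assumed interface)
                    assess: Callable[[List[str]], int],   # assessor: maps relevant sentences to a 1-6 analytic score (assumed interface)
                    threshold: float = 0.5) -> int:
    """Two-stage scoring for one rubric dimension: verify, then assess."""
    relevant = [s for s in report_sentences if is_relevant(s) >= threshold]
    if not relevant:
        # No content addressing this dimension; assume the floor of the 6-point scale.
        return 1
    return assess(relevant)
```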