Search for: All records

Award ID contains: 2010483


  1. This study is part of a larger research project aimed at developing and implementing an NLP-enabled AI feedback tool called PyrEval to support middle school students’ science explanation writing. We explored how human-AI integrated classrooms can invite students to harness AI tools while still being agentic learners. Building on the theory of new materialism with posthumanist perspectives, we examined teacher framing to see how the nature of PyrEval was communicated, thereby orienting students to partner with or rely on PyrEval. We analyzed one teacher’s talk in multiple classrooms as well as that of students in small groups. We found student agency was fostered through teacher framing of (a) PyrEval as a non-neutral actor and a co-investigator and (b) students’ participation as an author and their understanding of the nature of PyrEval as core task and purpose. Findings and implications are discussed.
    Free, publicly-accessible full text available July 9, 2026
  2. This study is part of a larger research project aimed at developing and implementing an NLP-enabled AI feedback tool called PyrEval to support middle school students’ science explanation writing. We explored how human-AI integrated classrooms can invite students to harness AI tools while still being agentic learners. Building on the theory of new materialism with posthumanist perspectives, we examined teacher framing to see how the nature of PyrEval was communicated, thereby orienting students to partner with or rely on PyrEval. We analyzed one teacher’s talk in multiple classrooms as well as that of students in small groups. We found student agency was fostered through teacher framing of (a) PyrEval as a non-neutral actor and a co-investigator and (b) students’ participation as an author and their understanding of the nature of PyrEval as core task and purpose. Findings and implications are discussed.
    Free, publicly-accessible full text available July 9, 2026
  3. Factors influencing students' perceptions of automated feedback and their impact on revision. 
    Free, publicly-accessible full text available July 9, 2026
  4. Automated feedback can provide students with timely information about their writing, but students' willingness to engage meaningfully with the feedback to revise their writing may be influenced by their perceptions of its usefulness. We explored the factors that may have influenced 339 8th-grade students’ perceptions of receiving automated feedback on their writing and whether their perceptions impacted their revisions and writing improvement. Using HLM and logistic regression analyses, we found that: 1) students with more positive perceptions of the automated feedback made revisions that resulted in significant improvements in their writing, and 2) students who received feedback indicating they included more important ideas in their essays had significantly higher perceptions of the usefulness of the feedback, but were significantly less likely to engage in substantive revisions. Implications and the importance of helping students evaluate and reflect on the feedback to make substantive revisions, no matter their initial feedback, are discussed.
    Free, publicly-accessible full text available June 9, 2026
  5. As use of artificial intelligence (AI) has increased, concerns about AI bias and discrimination have been growing. This paper discusses an application called PyrEval in which natural language processing (NLP) was used to automate assessment and provide feedback on middle school science writing without linguistic discrimination. Linguistic discrimination in this study was operationalized as unfair assessment of scientific essays based on writing features that are not considered normative such as subject-verb disagreement. Such unfair assessment is especially problematic when the purpose of assessment is not assessing English writing but rather assessing the content of scientific explanations. PyrEval was implemented in middle school science classrooms. Students explained their roller coaster design by stating relationships among such science concepts as potential energy, kinetic energy and law of conservation of energy. Initial and revised versions of scientific essays written by 307 eighth-grade students were analyzed. Our manual and NLP assessment comparison analysis showed that PyrEval did not penalize student essays that contained non-normative writing features. Repeated measures ANOVAs and GLMM analysis results revealed that essay quality significantly improved from initial to revised essays after receiving the NLP feedback, regardless of non-normative writing features. Findings and implications are discussed.
    Free, publicly-accessible full text available May 25, 2026
  6. As use of artificial intelligence (AI) has increased, concerns about AI bias and discrimination have been growing. This paper discusses an application called PyrEval in which natural language processing (NLP) was used to automate assessment and provide feedback on middle school science writing without linguistic discrimination. Linguistic discrimination in this study was operationalized as unfair assessment of scientific essays based on writing features that are not considered normative such as subject-verb disagreement. Such unfair assessment is especially problematic when the purpose of assessment is not assessing English writing but rather assessing the content of scientific explanations. PyrEval was implemented in middle school science classrooms. Students explained their roller coaster design by stating relationships among such science concepts as potential energy, kinetic energy and law of conservation of energy. Initial and revised versions of scientific essays written by 307 eighth-grade students were analyzed. Our manual and NLP assessment comparison analysis showed that PyrEval did not penalize student essays that contained non-normative writing features. Repeated measures ANOVAs and GLMM analysis results revealed that essay quality significantly improved from initial to revised essays after receiving the NLP feedback, regardless of non-normative writing features. Findings and implications are discussed.
    Free, publicly-accessible full text available May 25, 2026
  7. Automated methods are increasingly used to support formative feedback on students’ science explanation writing. Most of this work addresses students’ responses to short answer questions. We investigate automated feedback on students’ science explanation essays, which discuss multiple ideas. Feedback is based on a rubric that identifies the main ideas students are prompted to include in explanatory essays about the physics of energy and mass. We have found that students’ revisions generally improve their essays. Here, we focus on two factors that affect the accuracy of the automated feedback. First, learned representations of the six main ideas in the rubric differ with respect to their distinctiveness from each other, and therefore in the ability of automated methods to identify them in student essays. Second, sometimes a student’s statement lacks sufficient clarity for the automated tool to associate it more strongly with one of the main ideas above all others.
  8. With an increasing focus in STEM education on critical thinking skills, science writing plays an ever more important role. A recently published dataset of two sets of college level lab reports from an inquiry-based physics curriculum relies on analytic assessment rubrics that utilize multiple dimensions, specifying subject matter knowledge and general components of good explanations. Each analytic dimension is assessed on a 6-point scale, to provide detailed feedback to students that can help them improve their science writing skills. Manual assessment can be slow, and difficult to calibrate for consistency across all students in large enrollment courses with many sections. While much work exists on automated assessment of open-ended questions in STEM subjects, there has been far less work on long-form writing such as lab reports. We present VerAs, an end-to-end neural architecture that has separate verifier and assessment modules, inspired by approaches to Open Domain Question Answering (OpenQA). VerAs first verifies whether a report contains any content relevant to a given rubric dimension, and if so, assesses the relevant sentences. On the lab reports, VerAs outperforms multiple baselines based on OpenQA systems or Automated Essay Scoring (AES). VerAs also performs well on an analytic rubric for middle school physics essays.
  9. Automated writing evaluation (AWE) systems automatically assess and provide students with feedback on their writing. Despite learning benefits, students may not effectively interpret and utilize AI-generated feedback, thereby not maximizing their learning outcomes. A closely related issue is the accuracy of these systems: students may not understand that the systems are imperfect. Our study investigates whether students differentially addressed false positive and false negative AI-generated feedback errors on their science essays. We found that students addressed nearly all the false negative feedback; however, they addressed less than one-fourth of the false positive feedback. The odds of addressing false positive feedback were 99% lower than the odds of addressing false negative feedback, representing significant missed opportunities for revision and learning. We discuss the implications of these findings in the context of students’ learning.
  10. Hoadley, C; Wang, XC (Ed.)
    Eighth-grade students received automated feedback from PyrEval, an NLP tool, about their science essays. We examined how essay quality changed after revision. Regardless of prior physics knowledge, essay quality improved. Grounded in literature on AI explainability and trust in automated feedback, we also examined which PyrEval explanation predicted essay quality change. Essay quality improvement was predicted by high- and medium-accuracy feedback.