Title: Developing Rubrics for AI Scoring of NGSS Learning Progression-based Scientific Models
The Framework for K-12 Science Education recognizes modeling as an essential practice for building deep understanding of science. Modeling assessments should measure the ability to integrate Disciplinary Core Ideas and Crosscutting Concepts. Machine learning (ML) has been used to score and provide feedback on open-ended Learning Progression (LP)-aligned assessments. Analytic rubrics have been shown to make it easier to evaluate the validity of ML-based scores. A possible drawback of analytic rubrics is that they can oversimplify integrated ideas. We demonstrate the deconstruction of a 3D holistic rubric for modeling assessments aligned to an LP for Physical Science. We describe deconstructing this rubric into analytic categories for ML training while preserving its 3D nature.
Award ID(s):
2200757
PAR ID:
10534124
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
American Educational Research Association
Date Published:
Format(s):
Medium: X
Location:
Philadelphia, PA
Sponsoring Org:
National Science Foundation
More Like this
  1. James, C (Ed.)
    Effective writing is important for communicating science ideas, and for writing-to-learn in science. This paper investigates lab reports from a large-enrollment college physics course that integrates scientific reasoning and science writing. While analytic rubrics have been shown to define expectations more clearly for students, and to improve reliability of assessment, there has been little investigation of how well analytic rubrics serve students and instructors in large-enrollment science classes. Unsurprisingly, we found that grades administered by teaching assistants (TAs) do not correlate with reliable post-hoc assessments from trained raters. More important, we identified lost learning opportunities for students, and misinformation for instructors about students’ progress. We believe our methodology to achieve post-hoc reliability is straightforward enough to be used in classrooms. A key element is the development of finer-grained rubrics for grading that are aligned with the rubrics provided to students to define expectations, but which reduce subjectivity of judgements and grading time. We conclude that the use of dual rubrics, one to elicit independent reasoning from students and one to clarify grading criteria, could improve reliability and accountability of lab report assessment, which could in turn elevate the role of lab reports in the instruction of scientific inquiry. 
  2. We discuss transforming STEM education using three aspects: learning progressions (LPs), constructed response performance assessments, and artificial intelligence (AI). Using LPs to inform instruction, curriculum, and assessment design helps foster students’ ability to apply content and practices to explain phenomena, which reflects deeper science understanding. To measure progress along these LPs, performance assessments combining elements of disciplinary ideas, crosscutting concepts, and practices are needed. However, these tasks are time-consuming and expensive to score and provide feedback for. AI makes it possible to validate the LPs and evaluate performance assessments for many students quickly and efficiently. The evaluation provides a report describing student progress along the LP and the supports needed to attain a higher LP level. We suggest using unsupervised and semi-supervised ML and generative AI (GAI) at early LP validation stages to identify relevant proficiency patterns and start building an LP. We further suggest employing supervised ML and GAI to develop targeted LP-aligned performance assessments for more accurate performance diagnosis at advanced LP validation stages. Finally, we discuss employing AI to design automatic feedback systems that provide personalized feedback to students and help teachers implement LP-based learning. We discuss the challenges of realizing these tasks and propose future research avenues.
  3. Constructed responses can be used to assess the complexity of student thinking and can be evaluated using rubrics. The two most typical rubric types are holistic and analytic. Holistic rubrics may be difficult to use with expert-level reasoning that has additive or overlapping language. In an attempt to unpack complexity in holistic rubrics at a large scale, we have developed a systematic approach called deconstruction. We define deconstruction as the process of converting a holistic rubric into defined individual conceptual components that can be used for analytic rubric development and application. These individual components can then be recombined into the holistic score, which stays true to the holistic rubric's purpose while maximizing the benefits and minimizing the shortcomings of each rubric type. This paper outlines the deconstruction process and presents a case study that shows concept definitions for a hierarchical holistic rubric developed for an undergraduate physiology-content reasoning context. These methods can be used as one way for assessment developers to unpack complex student reasoning, which may ultimately improve reliability and validation of assessments that are targeted at uncovering large-scale complex scientific reasoning.
  4. This brief report describes the conception, development, and use of a rubric in evaluating the feasibility of a new program. The evaluators searched for a meta-analytic tool to help organize ideas about what data to collect, and why, in order to create a detailed story of feasibility of implementation for the client. The main advantage of using the rubric-based tool is that it lays out key evaluative criteria that are defined as concretely as possible. The article gives a brief overview of the literature on the use of rubrics in evaluation, illustrates the use of a feasibility of implementation rubric as a tool for development, analysis, and reporting, and concludes with recommendations emergent from the use of the rubric.