Title: Deconstruction of Holistic Rubrics into Analytic Rubrics for Large-Scale Assessments of Students’ Reasoning of Complex Science Concepts
Constructed responses can be used to assess the complexity of student thinking and can be evaluated using rubrics. The two most common rubric types are holistic and analytic. Holistic rubrics may be difficult to use with expert-level reasoning that has additive or overlapping language. To unpack complexity in holistic rubrics at a large scale, we have developed a systematic approach called deconstruction. We define deconstruction as the process of converting a holistic rubric into defined individual conceptual components that can be used for analytic rubric development and application. These individual components can then be recombined into the holistic score, which stays true to the purpose of the holistic rubric while maximizing the benefits and minimizing the shortcomings of each rubric type. This paper outlines the deconstruction process and presents a case study that shows concept definitions for a hierarchical holistic rubric developed for an undergraduate physiology-content reasoning context. These methods can be used as one way for assessment developers to unpack complex student reasoning, which may ultimately improve reliability and validation of assessments that are targeted at uncovering large-scale complex scientific reasoning.
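The recombination step described in the abstract can be illustrated with a minimal sketch. It assumes a hierarchical rubric in which each holistic level requires a fixed set of analytic concepts, each coded 0 or 1. The concept names, the level requirements, and the `holistic_score` function are hypothetical placeholders and are not taken from the paper's rubric.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical hierarchical rubric: each holistic level lists the analytic
# concepts that must all be present. Levels are ordered from highest to lowest.
LEVEL_REQUIREMENTS: List[Tuple[int, FrozenSet[str]]] = [
    (3, frozenset({"concept_a", "concept_b", "concept_c"})),
    (2, frozenset({"concept_a", "concept_b"})),
    (1, frozenset({"concept_a"})),
]


def holistic_score(analytic_codes: Dict[str, int]) -> int:
    """Recombine dichotomous analytic codes (0/1) into one holistic level.

    Returns the highest level whose required concepts are all coded 1,
    or 0 when no level's requirements are met.
    """
    present = {name for name, code in analytic_codes.items() if code == 1}
    for level, required in LEVEL_REQUIREMENTS:
        if required <= present:  # every required concept is present
            return level
    return 0


if __name__ == "__main__":
    response_codes = {"concept_a": 1, "concept_b": 1, "concept_c": 0}
    print(holistic_score(response_codes))
```

Keeping the level requirements in a single table, separate from the scoring function, is one way to let the analytic codes be revised without changing how the holistic score is derived.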
Award ID(s):
1660643
PAR ID:
10112967
Author(s) / Creator(s):
; ; ; ; ; ; ;
Date Published:
Journal Name:
Practical assessment, research & evaluation
Volume:
24
Issue:
7
ISSN:
1531-7714
Page Range / eLocation ID:
1-13
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  2. The Framework for K-12 Science Education recognizes modeling as an essential practice for building deep understanding of science. Modeling assessments should measure the ability to integrate Disciplinary Core Ideas and Crosscutting Concepts. Machine learning (ML) has been utilized to score and provide feedback on open-ended Learning Progression (LP)-aligned assessments. Analytic rubrics have been shown to make it easier to evaluate the validity of ML-based scores. A possible drawback of using analytic rubrics is the potential for oversimplification of integrated ideas. We demonstrate the deconstruction of a 3D holistic rubric for modeling assessments aligned to an LP for Physical Science. We describe deconstructing this rubric into analytic categories for ML training while preserving its 3D nature.
  3. We systematically compared two coding approaches to generate training datasets for machine learning (ML): (i) a holistic approach based on learning progression levels and (ii) a dichotomous, analytic approach of multiple concepts in student reasoning, deconstructed from holistic rubrics. We evaluated four constructed response assessment items for undergraduate physiology, each targeting five levels of a developing flux learning progression in an ion context. Human-coded datasets were used to train two ML models: (i) an 8-classification algorithm ensemble implemented in the Constructed Response Classifier (CRC), and (ii) a single classification algorithm implemented in LightSide Researcher’s Workbench. Human coding agreement on approximately 700 student responses per item was high for both approaches, with Cohen’s kappas ranging from 0.75 to 0.87 on holistic scoring and from 0.78 to 0.89 on analytic composite scoring. ML model performance varied across items and rubric type. For two items, training sets from both coding approaches produced similarly accurate ML models, with differences in Cohen’s kappa between machine and human scores of 0.002 and 0.041. For the other items, ML models trained with analytic coded responses and used for a composite score achieved better performance than models trained with holistic scores, with increases in Cohen’s kappa of 0.043 and 0.117. These items used a more complex scenario involving movement of two ions. It may be that analytic coding is beneficial to unpacking this additional complexity.
  4. Research has documented the presence of bias against women in hiring, including in academic science, technology, engineering, and mathematics (STEM). Hiring rubrics (also called criterion checklists, decision support tools, and evaluation tools) are widely recommended as a precise, cost-effective remedy to counteract hiring bias, despite a paucity of evidence that they actually work (see table S8). Our in-depth case study of rubric usage in faculty hiring in an academic engineering department in a very research-active university found that the rate of hiring women increased after the department deployed rubrics and used them to guide holistic discussions. Yet we also found evidence of substantial gender bias persisting in some rubric scoring categories and evaluators’ written comments. We do not recommend abandoning rubrics. Instead, we recommend a strategic and sociologically astute use of rubrics as a department self-study tool within the context of a holistic evaluation of semifinalist candidates. 
  5. The purpose of this work was to test the inter-rater reliability (IRR) of a rubric used to grade technical reports in a senior-level chemical engineering laboratory course in which multiple instructors grade deliverables. The rubric consisted of fifteen constructs that provided students detailed guidance on instructor expectations with respect to the report sections, formatting, and technical writing aspects such as audience, context, and purpose. Four student reports from previous years were scored using the rubric, and IRR was assessed using a two-way mixed, consistency, average-measures intraclass correlation (ICC) for each construct. Then, the instructors met as a group to discuss their scoring and reasoning. Multiple revisions were made to the rubric based on instructor feedback and constructs rated by ICC as poor. When fair or poor constructs were combined, the ICCs improved. In addition, the overall score construct continued to be rated as excellent, indicating that while different instructors may vary at the individual construct level, they evaluate the overall quality of the report consistently. A key lesson from this process was the importance of instructor discussion of the reasoning behind their scores, and of an ‘instructor orientation’ involving discussion and practice using the rubrics when there are multiple instructors or a change in instructors. The developed rubric has the potential for broad applicability to engineering laboratory courses with technical writing components and could be adapted for other technical writing genres.
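
The third abstract above reports agreement between coders, and between machine and human scores, as Cohen's kappa. As a reminder of what that statistic measures, here is a minimal sketch of the standard unweighted kappa for two raters. The function name `cohens_kappa` and the example labels are illustrative assumptions, and the sketch does not reproduce how the cited studies computed their values.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters scoring the same responses."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must score the same non-empty set of responses.")

    n = len(rater_a)
    # Observed agreement: proportion of responses given identical scores.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label throughout;
        # kappa is undefined here, so report agreement as 1.0 by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    human = [1, 1, 0, 0]    # illustrative dichotomous codes
    machine = [1, 0, 0, 0]
    print(cohens_kappa(human, machine))
```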