Title: The Ideal versus the Real Deal in Assessment of Physics Lab Report Writing
Abstract:
Effective writing is important for communicating science ideas and for writing-to-learn in science. This paper investigates lab reports from a large-enrollment college physics course that integrates scientific reasoning and science writing. While analytic rubrics have been shown to define expectations more clearly for students and to improve the reliability of assessment, there has been little investigation of how well analytic rubrics serve students and instructors in large-enrollment science classes. Unsurprisingly, we found that grades administered by teaching assistants (TAs) do not correlate with reliable post-hoc assessments from trained raters. More importantly, we identified lost learning opportunities for students and misinformation for instructors about students' progress. We believe our methodology for achieving post-hoc reliability is straightforward enough to be used in classrooms. A key element is the development of finer-grained grading rubrics that are aligned with the rubrics provided to students to define expectations, but that reduce the subjectivity of judgements and grading time. We conclude that the use of dual rubrics, one to elicit independent reasoning from students and one to clarify grading criteria, could improve the reliability and accountability of lab report assessment, which could in turn elevate the role of lab reports in the instruction of scientific inquiry.
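As a concrete illustration of the post-hoc reliability comparison described in the abstract, the minimal sketch below (Python) checks how well TA grades agree with trained-rater scores. The toy data and the particular statistics (Spearman rank correlation and quadratic-weighted kappa) are illustrative assumptions, not the paper's actual analysis pipeline.

    # Sketch: agreement between TA grades and post-hoc trained-rater scores.
    # The scores below are invented; both sets are assumed to be on the same
    # ordinal rubric scale.
    from scipy.stats import spearmanr
    from sklearn.metrics import cohen_kappa_score

    ta_scores = [5, 4, 4, 6, 3, 5, 2, 4]        # grades assigned by TAs
    rater_scores = [3, 4, 2, 5, 3, 4, 2, 3]     # trained-rater scores, same reports

    # Rank correlation: do TAs and trained raters order the reports similarly?
    rho, p_value = spearmanr(ta_scores, rater_scores)

    # Quadratic-weighted kappa: ordinal agreement that penalizes large
    # disagreements more heavily than near-misses.
    kappa = cohen_kappa_score(ta_scores, rater_scores, weights="quadratic")

    print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f}), weighted kappa = {kappa:.2f}")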
Award ID(s):
2110334
PAR ID:
10541049
Author(s) / Creator(s):
; ; ;
Editor(s):
James, C
Publisher / Repository:
Services for Science and Education - United Kingdom
Date Published:
Journal Name:
European Journal of Applied Sciences
Volume:
11
Issue:
2
ISSN:
2634-9221
Page Range / eLocation ID:
626-644
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. With an increasing focus in STEM education on critical thinking skills, science writing plays an ever more important role. A recently published dataset of two sets of college-level lab reports from an inquiry-based physics curriculum relies on analytic assessment rubrics that utilize multiple dimensions, specifying subject matter knowledge and general components of good explanations. Each analytic dimension is assessed on a 6-point scale, to provide detailed feedback that can help students improve their science writing skills. Manual assessment can be slow and difficult to calibrate for consistency across all students in large-enrollment courses with many sections. While much work exists on automated assessment of open-ended questions in STEM subjects, there has been far less work on long-form writing such as lab reports. We present an end-to-end neural architecture, VerAs, that has separate verifier and assessment modules, inspired by approaches to Open Domain Question Answering (OpenQA). VerAs first verifies whether a report contains any content relevant to a given rubric dimension, and if so, assesses the relevant sentences. On the lab reports, VerAs outperforms multiple baselines based on OpenQA systems or Automated Essay Scoring (AES). VerAs also performs well on an analytic rubric for middle school physics essays.
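The verify-then-assess control flow can be sketched as follows. This is a schematic outline only: the verifier and assessor are assumed to be objects with relevance() and score() methods, standing in for the trained neural modules of the published VerAs system.

    # Schematic of a two-stage verify-then-assess pipeline for one rubric
    # dimension; module internals and the threshold are placeholder assumptions.
    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class DimensionResult:
        dimension: str
        relevant_sentences: List[str]
        score: Optional[int]  # e.g., a 1-6 analytic score; None if unassessable

    def assess_report(sentences: List[str], dimension: str,
                      verifier, assessor, threshold: float = 0.5) -> DimensionResult:
        # Stage 1 (verifier): keep sentences judged relevant to this rubric
        # dimension, analogous to the retrieval step in OpenQA.
        relevant = [s for s in sentences
                    if verifier.relevance(s, dimension) >= threshold]
        if not relevant:
            # No relevant content found: the dimension cannot be assessed.
            return DimensionResult(dimension, [], score=None)
        # Stage 2 (assessor): score only the relevant sentences.
        return DimensionResult(dimension, relevant,
                               score=assessor.score(relevant, dimension))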
  2. A challenge instructors face is developing and accurately assessing technical communication skills to ensure students can apply and transfer those skills from the academic context into the context of engineering practice. By intentionally balancing the teaching of transferable communication skills relevant to engineering practice with the evaluation of student understanding, engineering educators can foster competence and prepare students for the expectations of their professional careers. This study addresses two questions: (1) how can chemical engineering instructors reliably and consistently assess student communication skills, and (2) are instructor expectations aligned with those of practicing engineers? Well-designed rubrics are important for setting clear expectations for students, providing constructive feedback, and, in team-taught courses, grading consistently. This study discusses how a rubric for assessing technical communication skills in senior-level chemical engineering laboratory reports was validated and shown to be reliable across five chemical engineering instructors. Additionally, five industry partners evaluated student reports for comparison with instructor rubric scores. Expectations and perceptions of the quality of student work aligned between instructors and practicing engineers, but practicing engineers prioritized safety and abstract clarity, while instructors prioritized students' abilities to interpret results and draw conclusions.
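For illustration, a dimension-by-dimension comparison of the two rater groups could be run as in the sketch below; the rubric dimensions, scores, and choice of a Mann-Whitney U test are all assumptions for the example, not the study's actual analysis.

    # Sketch: do instructors and practicing engineers score a rubric
    # dimension from systematically different distributions?
    from scipy.stats import mannwhitneyu

    scores = {
        # dimension: (instructor scores, practitioner scores), invented data
        "safety":         ([3, 3, 4, 3, 4], [5, 4, 5, 5, 4]),
        "interpretation": ([5, 4, 5, 5, 4], [3, 4, 3, 4, 3]),
    }

    for dimension, (instructors, practitioners) in scores.items():
        stat, p = mannwhitneyu(instructors, practitioners, alternative="two-sided")
        print(f"{dimension}: U = {stat:.1f}, p = {p:.3f}")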
  3. The purpose of this work was to test the inter-rater reliability (IRR) of a rubric used to grade technical reports in a senior-level chemical engineering laboratory course in which multiple instructors grade deliverables. The rubric consisted of fifteen constructs that gave students detailed guidance on instructor expectations with respect to report sections, formatting, and technical writing aspects such as audience, context, and purpose. Four student reports from previous years were scored using the rubric, and IRR was assessed using a two-way mixed, consistency, average-measures intraclass correlation (ICC) for each construct. The instructors then met as a group to discuss their scoring and reasoning. Multiple revisions were made to the rubric based on instructor feedback and on constructs whose ICC was rated poor. When constructs rated fair or poor were combined, the ICCs improved. In addition, the overall-score construct continued to be rated excellent, indicating that while individual instructors may vary at the construct level, they evaluate the overall quality of a report consistently. A key learning from this process was the importance of instructor discussion of the reasoning behind scores, and of an 'instructor orientation' involving discussion and practice with the rubrics whenever there are multiple instructors or a change in instructors. The developed rubric has the potential for broad applicability to engineering laboratory courses with technical writing components and could be adapted to other genres of technical writing.
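A two-way mixed, consistency, average-measures ICC can be computed as in the minimal sketch below using the pingouin library, where that model corresponds to the "ICC3k" estimate; the long-format scores are made up for illustration.

    # Sketch: ICC for one rubric construct scored by three instructors
    # on four reports (complete, balanced data, as pingouin requires).
    import pandas as pd
    import pingouin as pg

    df = pd.DataFrame({
        "report":     ["A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D", "D"],
        "instructor": ["i1", "i2", "i3"] * 4,
        "score":      [4, 5, 4, 2, 3, 2, 5, 5, 4, 3, 3, 3],
    })

    icc = pg.intraclass_corr(data=df, targets="report",
                             raters="instructor", ratings="score")
    # Two-way mixed, consistency, average-measures estimate:
    print(icc.set_index("Type").loc["ICC3k", ["ICC", "CI95%"]])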
  4. Constructed responses can be used to assess the complexity of student thinking and can be evaluated using rubrics. The two most common rubric types are holistic and analytic. Holistic rubrics may be difficult to use with expert-level reasoning that has additive or overlapping language. In an attempt to unpack the complexity in holistic rubrics at large scale, we have developed a systematic approach called deconstruction: the process of converting a holistic rubric into individually defined conceptual components that can be used for analytic rubric development and application. These individual components can then be recombined into the holistic score, which stays true to the purpose of the holistic rubric while maximizing the benefits and minimizing the shortcomings of each rubric type. This paper outlines the deconstruction process and presents a case study with concept definitions for a hierarchical holistic rubric developed for an undergraduate physiology-content reasoning context. These methods offer one way for assessment developers to unpack complex student reasoning, which may ultimately improve the reliability and validation of assessments targeted at uncovering large-scale complex scientific reasoning.
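A minimal sketch of the recombination step is given below: analytic component judgements are rolled back up into a hierarchical holistic level. The component names and the recombination rule (highest level whose required components are all present) are invented assumptions, not the authors' physiology rubric.

    # Sketch: recombine analytic component judgements into a holistic score.
    from typing import Dict

    # Hierarchical levels and the components each level requires (assumed).
    LEVELS = [
        (1, ["claim"]),
        (2, ["claim", "mechanism"]),
        (3, ["claim", "mechanism", "evidence"]),
    ]

    def holistic_score(components: Dict[str, bool]) -> int:
        # Highest level whose required components are all present; 0 if none.
        score = 0
        for level, required in LEVELS:
            if all(components.get(name, False) for name in required):
                score = level
        return score

    # A response with a claim and a mechanism but no evidence scores level 2.
    print(holistic_score({"claim": True, "mechanism": True, "evidence": False}))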