Title: Examining how using dichotomous and partial credit scoring models influence sixth‐grade mathematical problem‐solving assessment outcomes
Abstract: Determining the most appropriate method of scoring an assessment is based on multiple factors, including the intended use of results, the assessment's purpose, and time constraints. Both the dichotomous and partial credit models have their advantages, yet direct comparisons of assessment outcomes from each method are not typical with constructed response items. The present study compared the impact of both scoring methods on the internal structure and consequential validity of a middle-grades problem-solving assessment called the problem solving measure for grade six (PSM6). After being scored both ways, Rasch dichotomous and partial credit analyses indicated similarly strong psychometric findings across models. Student outcome measures on the PSM6, scored both dichotomously and with partial credit, demonstrated a strong, positive, significant correlation. Similar demographic patterns were noted regardless of scoring method. Both scoring methods produced similar results, suggesting that either would be appropriate to use with the PSM6.
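For reference, the two Rasch models the study compares have standard forms in the psychometrics literature (general background, not reproduced from the article). The dichotomous model gives the probability that person n answers item i correctly; Masters' (1982) partial credit model generalizes it to items scored in ordered steps:

    % Dichotomous Rasch model: person ability \theta_n, item difficulty \delta_i
    P(X_{ni} = 1) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}

    % Partial credit model: score x on item i with step difficulties
    % \delta_{ik} (k = 1, \dots, m_i), using the convention \delta_{i0} \equiv 0
    P(X_{ni} = x) = \frac{\exp\left( \sum_{k=0}^{x} (\theta_n - \delta_{ik}) \right)}
                         {\sum_{h=0}^{m_i} \exp\left( \sum_{k=0}^{h} (\theta_n - \delta_{ik}) \right)}

All quantities are in logits; dichotomous scoring collapses each item to correct/incorrect, while partial credit retains the intermediate rubric levels.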
Award ID(s):
2201165, 1720646
PAR ID:
10409573
Author(s) / Creator(s):
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
School Science and Mathematics
Volume:
123
Issue:
2
ISSN:
0036-6803
Page Range / eLocation ID:
p. 54-67
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Determining the most appropriate method of scoring an assessment is based on multiple factors, including the intended use of results, the assessment's purpose, and time constraints. Both the dichotomous and partial credit models have their advantages, yet direct comparisons of assessment outcomes from each method are not typical with constructed response items. The present study compared the impact of both scoring methods on the internal structure and consequential validity of a middle-grades problem-solving assessment called the problem solving measure for grade six (PSM6). After being scored both ways, Rasch dichotomous and partial credit analyses indicated similarly strong psychometric findings across models. Student outcome measures on the PSM6, scored both dichotomously and with partial credit, demonstrated strong, positive, significant correlation. Similar demographic patterns were noted regardless of scoring method. Both scoring methods produced similar results, suggesting that either would be appropriate to use with the PSM6. 
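To make the dichotomous-versus-partial-credit comparison concrete, here is a minimal Python sketch (synthetic rubric scores and an assumed dichotomization rule, not the study's actual data or analysis):

    # Illustrative sketch only: synthetic rubric scores, not PSM6 data.
    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(0)

    # Suppose 200 students answer 15 items, each scored 0-2 by a rubric.
    partial = rng.integers(0, 3, size=(200, 15))

    # Assumed dichotomization rule: full credit (2) counts as correct.
    dichotomous = (partial == 2).astype(int)

    # Correlate person totals under the two scoring schemes.
    r, p = pearsonr(partial.sum(axis=1), dichotomous.sum(axis=1))
    print(f"Pearson r = {r:.2f}, p = {p:.3g}")

The study reports a strong positive correlation between the two sets of outcome measures; the sketch only shows the mechanics of computing such a comparison.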
  2. Abstract: Computational modeling of protein–DNA complex structures has important implications in biomedical applications such as structure-based, computer-aided drug design. A key step in developing methods for accurate modeling of protein–DNA complexes is similarity assessment between models and their reference complex structures. Existing methods primarily rely on distance-based metrics and generally do not consider important functional features of the complexes, such as interface hydrogen bonds that are critical to specific protein–DNA interactions. Here, we present a new scoring function, ComparePD, which accounts for interface hydrogen bond energy and strength in addition to the distance-based metrics, yielding a more accurate similarity measure for protein–DNA complexes. ComparePD was tested on two datasets of computational models of protein–DNA complexes generated using docking (classified as easy, intermediate, and difficult cases) and homology modeling methods. The results were compared with PDDockQ, a modified version of DockQ tailored for protein–DNA complexes, as well as with the metrics employed by the community-wide experiment CAPRI (Critical Assessment of PRedicted Interactions). We demonstrated that ComparePD provides an improved similarity measure over PDDockQ and the CAPRI classification method by considering both conformational similarity and the functional importance of the complex interface. In every case where ComparePD and PDDockQ ranked different top models, ComparePD identified the more meaningful model, except for one intermediate docking case.
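The abstract does not give ComparePD's actual formula. Purely as a hypothetical sketch of the general idea of blending a distance-based similarity with an interface hydrogen-bond term, one might write:

    # Hypothetical composite score: NOT ComparePD's published formula.
    def composite_similarity(distance_score: float,
                             hbond_recovery: float,
                             w_dist: float = 0.7,
                             w_hb: float = 0.3) -> float:
        """Blend a distance-based similarity (e.g., a DockQ-style score
        in [0, 1]) with the fraction of reference interface hydrogen bonds
        recovered in the model (also in [0, 1]). Weights are illustrative
        assumptions."""
        return w_dist * distance_score + w_hb * hbond_recovery

    # Example: a model with good geometry but poor hydrogen-bond recovery.
    print(composite_similarity(0.82, 0.40))  # 0.694

The point of such a blend is that two models with identical interface geometry scores can differ sharply in how many specificity-conferring hydrogen bonds they actually reproduce.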
  3. Recent years have seen a movement within the research-based assessment development community toward item formats that go beyond simple multiple choice. Some developers have moved toward free-response questions, particularly at the upper-division level; however, free-response items must be scored by hand. To avoid this limitation, some assessment developers have adopted formats that retain a closed-response structure while still providing more nuanced insight into student reasoning. One such format is known as coupled, multiple response (CMR). This format pairs multiple-choice and multiple-response items, allowing students both to commit to an answer and to select options that correspond with their reasoning. In addition to being machine-scorable, this format allows for more nuanced scoring than simple right or wrong. However, such nuanced scoring presents a potential challenge for using certain testing theories to construct validity arguments for an assessment. In particular, Item Response Theory (IRT) models often assume dichotomously scored items. While polytomous IRT models do exist, each brings certain constraints and limitations. Here, we explore multiple IRT models and scoring schemes using data from an existing CMR test, with the goal of providing guidance and insight into methods for simultaneously leveraging the affordances of the CMR format and IRT models when constructing validity arguments for research-based assessments.
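As a hypothetical illustration of polytomous CMR scoring (the point values and matching rules here are assumptions, not a published rubric), a CMR item might be scored by combining the answer choice with the reasoning selections:

    # Hypothetical CMR scoring sketch; the point values are assumptions.
    def score_cmr(answer: str, reasoning: set[str],
                  correct_answer: str, correct_reasoning: set[str]) -> int:
        """Return 0-3: 1 point for the correct answer choice, plus up to
        2 points for how well the selected reasoning options match the key."""
        points = 1 if answer == correct_answer else 0
        if reasoning == correct_reasoning:
            points += 2                  # exact reasoning match
        elif reasoning & correct_reasoning and not (reasoning - correct_reasoning):
            points += 1                  # partial match, no incorrect selections
        return points

    print(score_cmr("B", {"r1", "r3"}, "B", {"r1", "r3"}))  # 3
    print(score_cmr("B", {"r1"}, "B", {"r1", "r3"}))        # 2

A graded score like this is exactly what forces the choice the abstract describes: dichotomous IRT models cannot use the 0-3 range directly, while polytomous models can, at the cost of additional assumptions.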
  4. Abstract: Problem solving is a central focus of mathematics teaching and learning. If teachers are expected to support students' problem-solving development, then it stands to reason that teachers should also be able to solve problems aligned to grade-level content standards. The purpose of this validation study is twofold: (1) to present evidence supporting the use of the Problem Solving Measures Grades 3–5 with preservice teachers (PSTs), and (2) to examine PSTs' abilities to solve problems aligned to grades 3–5 academic content standards. This study used Rasch measurement techniques to support psychometric analysis of the Problem Solving Measures when used with PSTs. Results indicate that the Problem Solving Measures are appropriate for use with PSTs and that performance differed between first-year PSTs and end-of-program PSTs. Implications include program evaluation and the potential benefits of using K-12 student-level assessments as measures of PSTs' content knowledge.
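A minimal sketch of the kind of group comparison the abstract reports, assuming Rasch person measures (in logits) have already been estimated; the numbers below are synthetic:

    # Sketch: compare first-year vs end-of-program PST measures (synthetic logits).
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(1)
    first_year = rng.normal(-0.3, 1.0, size=60)      # assumed person measures
    end_of_program = rng.normal(0.5, 1.0, size=55)

    t, p = ttest_ind(end_of_program, first_year, equal_var=False)  # Welch's t-test
    print(f"t = {t:.2f}, p = {p:.3g}")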