Quantum mechanics is a subject rife with student conceptual difficulties. In order to study and devise better strategies for helping students overcome them, we need ways of assessing on a broad level how students are thinking. This is possible with the use of standardized, research-validated assessments like the Quantum Mechanics Concept Assessment (QMCA). These assessments are useful, but they lack rigorous population independence, and the question ordering cannot be rearranged without calling into question the validity of the results. One way to overcome these two issues is to design the exam to be compatible with Rasch measurement theory, which calibrates individual items and is capable of assessing item difficulty and person ability independently. In this paper, we present a Rasch analysis of the QMCA and discuss estimated item difficulties and person abilities, item and person fit to the Rasch model, and unidimensionality of the instrument. This work will lay the foundation for more robust and potentially generalizable assessments in the future.
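For readers unfamiliar with the model, the standard dichotomous Rasch model can be written as below. The notation is an assumption added here for illustration and does not come from the abstract: $\theta_n$ denotes the ability of person $n$, $b_i$ the difficulty of item $i$, and $X_{ni}$ the scored response (1 for correct, 0 for incorrect).

```latex
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}
```

Because the probability depends only on the difference $\theta_n - b_i$, item difficulties and person abilities sit on a common logit scale, which is what allows them to be estimated separately from one another.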
Validity and Test-Length Reduction Strategies for Complex Assessments
Lengthy standardized assessments decrease instructional time while increasing concerns about student cognitive fatigue. This study presents a methodological approach for item reduction within a complex assessment setting using the Problem Solving Measure for Grade 6 (PSM6). Five item-reduction methods were utilized to reduce the number of items on the PSM6, and each shortened instrument was evaluated through validity evidence for test content, internal structure, and relationships to other variables. The two quantitative methods (Rasch model and point-biserial) resulted in the best psychometrically performing shortened assessments but were not representative of all content subdomains, while the three qualitative (content preservation) methods resulted in poor psychometrically performing assessments that retained all subdomains. Specifically, the ten-item Rasch and ten-item point-biserial shortened tests demonstrated the overall strongest validity evidence, but future research is needed to explore the psychometric performance of these versions in a new independent sample and the necessity for subdomain representation. Implications for the study provide a methodological framework for researchers to use and reduce the length of existing instruments while identifying how the various reduction strategies may sacrifice different information from the original instrument. Practitioners are encouraged to carefully examine to what extent their reduced instrument aligns with their pre-determined criteria.
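As an illustration of what a point-biserial reduction might look like, here is a minimal sketch. It is not the procedure used in the study: the corrected (item-rest) form of the correlation, the function names `corrected_point_biserial` and `shorten`, the rule of keeping the highest-ranked items, and the toy data are all assumptions made for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Assumption: an item or rest score with no variance is scored 0.0.
        return 0.0
    return sxy / sqrt(sxx * syy)


def corrected_point_biserial(responses):
    """Correlate each 0/1 item with the total score on the remaining items."""
    totals = [sum(row) for row in responses]
    values = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        rest = [t - x for t, x in zip(totals, item)]
        values.append(pearson(item, rest))
    return values


def shorten(responses, n_keep):
    """Return the sorted indices of the n_keep items with the highest values."""
    r = corrected_point_biserial(responses)
    ranked = sorted(range(len(r)), key=lambda j: r[j], reverse=True)
    return sorted(ranked[:n_keep])


# Hypothetical toy data: rows are students, columns are items scored 0/1.
toy = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
print(shorten(toy, 3))
```

A purely statistical ranking like this one ignores which content subdomain each item belongs to, which is consistent with the abstract's observation that the quantitative methods did not preserve all subdomains.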
- PAR ID: 10517401
- Editor(s): Smith, Richard
- Publisher / Repository: Journal of Applied Measurement
- Date Published:
- Journal Name: Journal of applied measurement
- ISSN: 1529-7713
- Subject(s) / Keyword(s): item reduction validity psychometric assessments Rasch problem solving
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Instrument development should adhere to the Standards (AERA et al., 2014). “Content oriented evidence of validation is at the heart of the [validation] process” (AERA et al., 2014, p. 15) and is one of the five sources of validity evidence. The research question for this study is: What is the evidence related to test content for the three instruments called the PSM3, PSM4, and PSM5? The study’s purpose is to describe content validity evidence related to new problem-solving measures currently under development. We have previously published validity evidence for problem-solving measures (PSM6, PSM7, and PSM8) that address middle grades math standards (see Bostic & Sondergeld, 2015; Bostic, Sondergeld, Folger, & Kruse, 2017).
- Determining the most appropriate method of scoring an assessment is based on multiple factors, including the intended use of results, the assessment's purpose, and time constraints. Both the dichotomous and partial credit models have their advantages, yet direct comparisons of assessment outcomes from each method are not typical with constructed response items. The present study compared the impact of both scoring methods on the internal structure and consequential validity of a middle-grades problem-solving assessment called the problem solving measure for grade six (PSM6). After being scored both ways, Rasch dichotomous and partial credit analyses indicated similarly strong psychometric findings across models. Student outcome measures on the PSM6, scored both dichotomously and with partial credit, demonstrated strong, positive, significant correlation. Similar demographic patterns were noted regardless of scoring method. Both scoring methods produced similar results, suggesting that either would be appropriate to use with the PSM6.
- We report on the development of a new instrument for measuring teachers' knowledge of language as an epistemic tool in science classes. Language is essential for science learning, as all learning requires the use of language to constitute one's own ideas and to engage with others' ideas. Teachers with knowledge of language as an epistemic tool can recognize the ways that language allows students to generate and validate knowledge for themselves, rather than to replicate canonical knowledge transmitted by other sources. We used a construct-driven development approach with iterations of domain analysis, item revision, teacher feedback, expert review, and item piloting to address the content, substance, and structure aspects of validity. Data from 158 preservice and in-service teachers on 27 preliminary items were collected. Findings from Rasch measurement modeling indicate a single dimension fits the items well and can distinguish teachers of higher and lower knowledge. We revised and selected 15 items for an updated instrument. This contributes to ongoing measurement projects and provides a potential instrument for future, broader use by the field to gauge teachers' knowledge of language as an epistemic tool.