Abstract
Complex interactive test items are becoming more widely used in assessments. Being computer-administered, assessments using interactive items allow logging time-stamped action sequences. These sequences constitute a rich source of information that may facilitate investigating how examinees approach an item and arrive at their given response. There is a rich body of research leveraging action sequence data for investigating examinees’ behavior. However, the associated timing data have mainly been considered at the item level, if at all. Considering timing data at the action level in addition to action sequences, however, has great potential to support a more fine-grained assessment of examinees’ behavior. We provide an approach that jointly considers action sequences and action-level times for identifying common response processes. In doing so, we integrate tools from clickstream analyses and graph-modeled data clustering with psychometrics. In our approach, we (a) provide similarity measures that are based on both actions and the associated action-level timing data and (b) subsequently employ cluster edge deletion for identifying homogeneous, interpretable, well-separated groups of action patterns, each describing a common response process. Guidelines on how to apply the approach are provided. The approach and its utility are illustrated on a complex problem-solving item from PIAAC 2012.
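To make the two ingredients of this approach concrete, the following is a minimal Python sketch, not the authors' implementation: it combines action labels and action-level (log) times in an edit-distance-style dissimilarity, and then groups action patterns with a simple threshold graph and connected components standing in for cluster edge deletion, which would instead delete a minimum number of edges to obtain disjoint cliques. All function names, weights, thresholds, and example patterns are illustrative assumptions.

```python
# Sketch only: joint action/action-time dissimilarity plus a simplified grouping step.
import math
from itertools import combinations

def action_time_distance(seq_a, seq_b, time_weight=0.5):
    """Edit-distance-style alignment of two action patterns.

    Each pattern is a list of (action_label, log_time) pairs. Aligning two
    actions costs 0 if the labels agree, plus a penalty proportional to the
    difference of their (log-scaled) action-level times.
    """
    n, m = len(seq_a), len(seq_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = float(i)
    for j in range(1, m + 1):
        dp[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            label_cost = 0.0 if seq_a[i - 1][0] == seq_b[j - 1][0] else 1.0
            time_cost = time_weight * abs(seq_a[i - 1][1] - seq_b[j - 1][1])
            dp[i][j] = min(
                dp[i - 1][j] + 1.0,                          # drop an action from seq_a
                dp[i][j - 1] + 1.0,                          # drop an action from seq_b
                dp[i - 1][j - 1] + label_cost + time_cost,   # align the two actions
            )
    return dp[n][m] / max(n, m, 1)  # normalise by the longer sequence

def threshold_clusters(patterns, threshold=0.35):
    """Build a similarity graph and return its connected components.

    A deliberately simplified stand-in for cluster edge deletion; the exact
    problem requires a dedicated solver.
    """
    ids = list(patterns)
    edges = {i: set() for i in ids}
    for a, b in combinations(ids, 2):
        if action_time_distance(patterns[a], patterns[b]) <= threshold:
            edges[a].add(b)
            edges[b].add(a)
    clusters, seen = [], set()
    for start in ids:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(edges[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters

# Hypothetical action patterns: (action label, log seconds spent on that action)
patterns = {
    "p1": [("start", math.log(2)), ("menu", math.log(5)), ("submit", math.log(3))],
    "p2": [("start", math.log(2)), ("menu", math.log(6)), ("submit", math.log(3))],
    "p3": [("start", math.log(1)), ("help", math.log(20)), ("menu", math.log(4)), ("submit", math.log(2))],
}
print(threshold_clusters(patterns))  # groups p1 with p2; p3 forms its own group
```

The point of the sketch is the shape of the pipeline, pattern-level dissimilarities feeding a graph-based grouping step, rather than the specific costs or threshold, which would need to be chosen and validated for a given item.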
This content will become publicly available on July 17, 2026
Understanding MOOC Stopout Patterns: Course and Assessment-Level Insights
This study investigates stopout patterns in MOOCs to understand course- and assessment-level factors that influence student stopout behavior. We expanded previous work on stopout by assessing the exponential decay of assessment-level stopout rates across courses. Results confirm a disproportionate stopout rate on the first graded assessment. We then evaluated which course- and assessment-level features were associated with stopout on the first assessment. Findings suggest that a higher number of questions, a greater estimated time commitment in the early assessments, and a larger number of assessments in a course may be associated with a higher proportion of early stopout behavior.
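As a rough illustration of the exponential-decay check described above, the sketch below fits a decay curve to hypothetical per-assessment stopout rates. The numbers, the three-parameter decay form, and the use of scipy.optimize.curve_fit are assumptions for illustration, not the study's analysis code.

```python
# Sketch only: stopout rate as a function of assessment position within a course.
import numpy as np
from scipy.optimize import curve_fit

def exponential_decay(position, a, b, c):
    """Stopout rate modelled as a * exp(-b * position) + c."""
    return a * np.exp(-b * position) + c

# Hypothetical per-assessment stopout rates for one course: a disproportionate
# share of stopout on the first graded assessment, decaying toward a baseline.
positions = np.arange(1, 9)
stopout_rates = np.array([0.31, 0.14, 0.09, 0.06, 0.05, 0.04, 0.04, 0.03])

params, _ = curve_fit(exponential_decay, positions, stopout_rates, p0=(0.5, 0.5, 0.02))
a, b, c = params
print(f"fitted decay: rate(k) = {a:.2f} * exp(-{b:.2f} * k) + {c:.2f}")
```

Comparing fitted decay parameters across courses is one way to quantify how concentrated stopout is on the earliest assessments.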
- Award ID(s): 1931419
- PAR ID: 10648066
- Publisher / Repository: ACM
- Date Published:
- Page Range / eLocation ID: 300 to 304
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- In the United States, the onset of COVID-19 triggered a nationwide lockdown, which forced many universities to move their primary assessments from invigilated in-person exams to unproctored online exams. This abrupt change occurred midway through the Spring 2020 semester, providing an unprecedented opportunity to investigate whether online exams can provide meaningful assessments of learning relative to in-person exams on a per-student basis. Here, we present data from nearly 2,000 students across 18 courses at a large Midwestern university. Using a meta-analytic approach in which we treated each course as a separate study, we showed that online exams produced scores that highly resembled those from in-person exams at an individual level despite the online exams being unproctored, as demonstrated by a robust correlation between online and in-person exam scores (see the sketch after this list). Moreover, our data showed that cheating was either not widespread or ineffective at boosting scores, and the strong assessment value of online exams was observed regardless of the type of questions asked on the exam, the course level, academic discipline, or class size. We conclude that online exams, even when unproctored, are a viable assessment tool.
- We evaluate the impact of an institutional effort to transform undergraduate science courses using an approach based on course assessments. The approach is guided by A Framework for K-12 Science Education and focuses on scientific and engineering practices, crosscutting concepts, and core ideas, together called three-dimensional learning. To evaluate the extent of change, we applied the Three-dimensional Learning Assessment Protocol to 4 years of chemistry, physics, and biology course exams. Changes in exams differed by discipline and even by course, apparently depending on an interplay between departmental culture, course organization, and perceived course ownership, demonstrating the complex nature of transformation in higher education. We conclude that while transformation must be supported at all organizational levels, ultimately, change is controlled by factors at the course and departmental levels.
- Argumentation, a key scientific practice presented in the Framework for K-12 Science Education, requires students to construct and critique arguments, but timely evaluation of arguments in large-scale classrooms is challenging. Recent work has shown the potential of automated scoring systems for open-response assessments, leveraging machine learning (ML) and artificial intelligence (AI) to aid the scoring of written arguments in complex assessments. Moreover, research has emphasized that the features (i.e., complexity, diversity, and structure) of the assessment construct are critical to ML scoring accuracy, yet how the assessment construct may be associated with machine scoring accuracy remains unknown. This study investigated how the features associated with the assessment construct of a scientific argumentation assessment item affected machine scoring performance. Specifically, we conceptualized the construct in three dimensions: complexity, diversity, and structure. We employed human experts to code characteristics of the assessment tasks and score middle school student responses to 17 argumentation tasks aligned to three levels of a validated learning progression of scientific argumentation. We randomly selected 361 responses to use as training sets to build machine-learning scoring models for each item. The scoring models yielded a range of agreements with human consensus scores, measured by Cohen’s kappa (mean = 0.60; range 0.38 to 0.89), indicating good to almost perfect performance. We found that higher levels of Complexity and Diversity of the assessment task were associated with decreased model performance; similarly, the relationship between levels of Structure and model performance showed a somewhat negative linear trend. These findings highlight the importance of considering these construct characteristics when developing ML models for scoring assessments, particularly for higher-complexity items and multidimensional assessments.
- In this work, we present the design and plan of a quantum machine learning (QML) course in a computer science (CS) university program at the senior undergraduate / first-year graduate level. Based on our survey, there is a lack of detailed design and assessment plans for the delivery of QML courses. In this paper, we present the QML course design with week-by-week details of the QML concepts and hands-on activities covered in the course. We also present how this QML course can be assessed from a CS program learning outcomes perspective.
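The sketch below illustrates the per-course correlation idea from the first related item above: compute a Pearson correlation between unproctored online and in-person exam scores within each course, then pool the courses via a Fisher z transform with inverse-variance weights. The scores and the fixed-effect pooling choice are hypothetical assumptions, not the study's data or code.

```python
# Sketch only: per-course correlations pooled meta-analytically.
import numpy as np

def pooled_correlation(courses):
    """Fixed-effect pooling of per-course Pearson correlations via Fisher z."""
    zs, weights = [], []
    for online, in_person in courses:
        online, in_person = np.asarray(online, float), np.asarray(in_person, float)
        r = np.corrcoef(online, in_person)[0, 1]
        n = len(online)
        zs.append(np.arctanh(r))   # Fisher z transform of the correlation
        weights.append(n - 3)      # inverse variance of z is approximately n - 3
    pooled_z = np.average(zs, weights=weights)
    return np.tanh(pooled_z)       # back-transform to a correlation

# Hypothetical scores for two small courses (online exam %, in-person exam %)
course_a = ([72, 85, 90, 64, 78, 88], [70, 82, 93, 60, 75, 91])
course_b = ([55, 67, 80, 91, 73], [58, 70, 77, 95, 69])

print(f"pooled r = {pooled_correlation([course_a, course_b]):.2f}")
```

Treating each course as its own study and only then pooling keeps course-to-course differences in grading and difficulty from distorting the overall correlation estimate.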