-
Benjamin Paaßen; Carrie Demmans Epp (Eds.) This paper was written with the help of ChatGPT. Recent advancements in the development and deployment of large generative language models to power generative AI tools, including OpenAI's ChatGPT, have led to their broad usage across virtually all fields of study. While these tools have been trained to generate human-like dialogue in response to questions or prompts, they are similarly used to compose larger, more complex artifacts, including social media posts, essays, and even research articles. Although this abstract has been written entirely by a human without any input, consultation, or revision from a generative language model, it would likely be difficult for a reader to discern any difference. In light of this, there is growing debate and concern regarding the use of these models to aid the writing process, particularly concerning publication. Aside from some notable risks, including the unintentional generation of false information, the citation of non-existent research articles, and plagiarism through generating text sampled from another source without proper citation, there are additional questions pertaining to the originality of ideas expressed in a work that has been partially written or revised by a generative language model. We present this paper both as a case study of the usage of generative models to aid in the writing of academic research articles and as an example of how transparency and open science practices may help address several issues that have been raised in other contexts and communities. While this paper neither attempts to promote nor contest the use of these language models in any writing task, the goal of this work is to provide insight and potential guidance into the ethical and effective usage of these models within this domain.
Free, publicly accessible full text available July 14, 2025.
-
We introduce a working approach that fine-tunes large language models (LLMs) to create augmented data for regression-based predictive models aimed at detecting at-risk students in online learning communities. This approach has the potential to leverage scarce data to improve urgency detection, and it also illustrates the role of artificial intelligence in enhancing the resilience of educational communities and ensuring timely interventions within online learning settings.
Free, publicly accessible full text available June 10, 2025.
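A minimal sketch of the general idea: a placeholder stands in for the fine-tuned generator, synthetic posts inherit their seed posts' labels, and the augmented corpus trains a regression model. The urgency scale, seed posts, and label-inheritance scheme are illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch: augmenting scarce urgency-labeled posts with synthetic examples
# from a fine-tuned LLM, then training a regression model on the result.
# generate_synthetic_posts and the urgency scores are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

def generate_synthetic_posts(seed_posts, n_samples):
    """Placeholder for sampling from an LLM fine-tuned on urgent posts."""
    # In practice: prompt the fine-tuned model and collect its completions.
    return [f"(synthetic variant of) {p}" for p in seed_posts][:n_samples]

# Scarce real data: forum posts with human-coded urgency scores.
real_posts = ["I still don't understand assignment 2", "When is the exam?"]
real_scores = [3.5, 1.0]

# Augment: synthetic posts inherit the labels of their seed posts.
synth_posts = generate_synthetic_posts(real_posts, n_samples=2)
posts = real_posts + synth_posts
scores = real_scores + real_scores[: len(synth_posts)]

# Train the urgency regressor on the augmented corpus.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(posts)
model = Ridge().fit(X, scores)
print(model.predict(vectorizer.transform(["Please help, I am completely lost"])))
```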
-
Benjamin Paaßen; Carrie Demmans Epp (Eds.) K-12 Computer Science (CS) education has seen remarkable growth recently, driven by the increasing focus on CS and Computational Thinking (CT) integration. Despite the abundance of professional development (PD) programs designed to prepare future CS teachers with the required knowledge and skills, there is a lack of research on how teachers' perceptions and attitudes toward CS and CT evolve before and after participating in these programs. To address this gap, our exploratory study examines the dynamics of pre- and in-service teachers' experiences, attitudes, and perceptions toward CS and CT through their participation in a K-12 CS education micro-credential program. In this study, we employed topic modeling to identify topics that emerged from teachers' written pre- and post-CS autobiographies, conducted statistical analysis to explore how these topics evolve over time, and applied regression analysis to investigate the factors influencing these dynamics. We observed a shift in teachers' initial feelings of fear, intimidation, and stress toward confidence, fun, and feeling competent in basic CS, reflecting a positive transformation. Regression analysis revealed that features such as experienced-teacher status and CT conceptual understanding correlate with participants' evolving views. These observed relationships highlight the micro-credential's role in not only enhancing technical competency but also fostering an adaptive, integrative pedagogical mindset, providing new insights for course design.
Free, publicly accessible full text available July 14, 2025.
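As a rough illustration of the analysis pipeline described above, the sketch below runs LDA over toy pre/post autobiographies and regresses one topic's shift on teacher features. The corpus, topic count, single-topic shift measure, and feature values are assumptions for demonstration only.

```python
# Sketch: topic modeling on teachers' pre/post CS autobiographies, then
# regressing topic shifts on teacher features. All data are toy values.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LinearRegression

pre_texts = ["I feel intimidated by programming", "CS was stressful in college"]
post_texts = ["Coding is fun and I feel confident", "I enjoy teaching basic CS"]

# Fit one topic model over all autobiographies so topics are comparable.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(pre_texts + post_texts)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# Per-teacher topic proportions before and after the micro-credential.
theta = lda.transform(X)
pre_theta, post_theta = theta[: len(pre_texts)], theta[len(pre_texts):]
shift = post_theta[:, 0] - pre_theta[:, 0]  # change in one topic's weight

# Regress the shift on teacher features (e.g., experienced-teacher status,
# CT conceptual understanding) -- toy values here.
features = np.array([[1, 0.8], [0, 0.4]])
reg = LinearRegression().fit(features, shift)
print(reg.coef_)
```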
-
Benjamin Paaßen; Carrie Demmans Epp (Eds.) With the support of digital learning platforms, synchronous and collaborative learning has become a prominent learning paradigm in mathematics education. Computer-Supported Collaborative Learning (CSCL) has emerged as a valuable tool for enhancing mathematical discourse, problem solving, and ultimately learning outcomes. This paper presents an innovative examination of Graspable Math (GM), a dynamic mathematics notation and online learning platform, to enable synchronous, collaborative learning between pairs of students. By analyzing students' online log data, we adopt a data-driven method to better understand the intricate dynamics of collaborative learning in mathematics as it happens. Specifically, we apply frequency distributions and cluster analysis to characterize students' dynamic interaction patterns and identify distinctive profiles of collaboration. Our findings reveal several collaboration profiles that emerge through these analyses. This research not only bridges the gap in current CSCL tools for mathematics, but also provides empirical insights into the effective design and implementation of such tools. The insights gained from this research offer implications for the design of digital learning tools that support effective and engaging collaborative learning experiences.
Free, publicly accessible full text available July 14, 2025.
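A hedged sketch of what such a data-driven profiling step might look like: k-means over per-pair interaction features derived from logs. The feature set and cluster count are illustrative assumptions, not the paper's actual analysis.

```python
# Sketch: clustering per-pair interaction features from collaboration logs
# to surface collaboration profiles. Features and data are hypothetical.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# One row per student pair: [actions per minute, share of turn-taking
# events, ratio of partner A's actions to the pair total].
pair_features = np.array([
    [12.0, 0.60, 0.52],
    [3.5, 0.10, 0.95],
    [11.0, 0.55, 0.48],
    [4.0, 0.15, 0.90],
])

# Standardize so no single feature dominates the distance metric.
X = StandardScaler().fit_transform(pair_features)

# k = 2 is chosen for illustration; in practice it would be selected with,
# e.g., silhouette scores across candidate values.
profiles = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(profiles)  # e.g., balanced-collaborative vs. one-sided pairs
```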
-
Prior work analyzing tutoring sessions provided evidence that highly effective tutors, through their interaction with students and their experience, can perceptively recognize incorrect processes or “bugs” when students incorrectly answer problems. Researchers have studied these tutoring interactions, examining instructional approaches to address incorrect processes, and observed that the format of the feedback can influence learning outcomes. In this work, we recognize the incorrect answers caused by these buggy processes as Common Wrong Answers (CWAs). We examine the ability of teachers and instructional designers to identify CWAs proactively. Because teachers and instructional designers deeply understand the common approaches and mistakes students make when solving mathematical problems, we examine the feasibility of proactively identifying CWAs and generating Common Wrong Answer Feedback (CWAFs) as a formative feedback intervention for addressing student learning needs. We analyze CWAFs in three sets of analyses. We first report on the accuracy of the CWAs predicted by the teachers and instructional designers on the problems across two activities. We then measure the effectiveness of the CWAFs using an intent-to-treat analysis. Finally, we explore the existence of personalization effects of the CWAFs for the students working on the two mathematics activities.
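For readers unfamiliar with intent-to-treat analysis, the sketch below shows the basic shape of such a comparison on toy data: students are analyzed by their assigned condition regardless of whether the feedback actually fired. All values and column meanings are illustrative assumptions, not the study's data.

```python
# Sketch: an intent-to-treat (ITT) comparison of next-problem correctness
# for students assigned CWAF feedback vs. control. Toy data throughout.
import numpy as np
from scipy import stats

# 1 = assigned to receive CWAFs on the activity, 0 = control.
assigned = np.array([1, 1, 1, 0, 0, 0])
# Outcome: next-problem correctness (1 = correct).
correct = np.array([1, 1, 0, 1, 0, 0])

# ITT: compare by assignment, not by whether feedback was triggered.
treat, control = correct[assigned == 1], correct[assigned == 0]
itt_effect = treat.mean() - control.mean()
t, p = stats.ttest_ind(treat, control)
print(f"ITT effect estimate: {itt_effect:.2f} (p = {p:.2f})")
```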
-
Randomized controlled trials (RCTs) admit unconfounded design-based inference – randomization largely justifies the assumptions underlying statistical effect estimates – but often have limited sample sizes. However, researchers may have access to big observational data on covariates and outcomes from RCT nonparticipants. For example, data from A/B tests conducted within an educational technology platform exist alongside historical observational data drawn from student logs. We outline a design-based approach to using such observational data for variance reduction in RCTs. First, we use the observational data to train a machine learning algorithm predicting potential outcomes using covariates and then use that algorithm to generate predictions for RCT participants. Then, we use those predictions, perhaps alongside other covariates, to adjust causal effect estimates with a flexible, design-based covariate-adjustment routine. In this way, there is no danger of biases from the observational data leaking into the experimental estimates, which are guaranteed to be exactly unbiased regardless of whether the machine learning models are “correct” in any sense or whether the observational samples closely resemble RCT samples. We demonstrate the method in analyzing 33 randomized A/B tests and show that it decreases standard errors relative to other estimators, sometimes substantially.
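A minimal sketch of the estimator described in the abstract: fit a predictor on observational data only, residualize RCT outcomes against its predictions, and difference the residual means by arm. Because the predictor never sees RCT outcomes, the adjustment cannot leak bias into the estimate. The data, model choice (gradient boosting), and effect size are assumptions for illustration.

```python
# Sketch: design-based variance reduction using predictions trained on
# observational data from RCT nonparticipants. All data are simulated.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Large observational sample: covariates and outcomes from non-participants.
X_obs = rng.normal(size=(5000, 4))
y_obs = X_obs @ np.array([1.0, 0.5, -0.3, 0.2]) + rng.normal(size=5000)

# Small RCT: covariates, random assignment, outcomes with true effect 0.4.
X_rct = rng.normal(size=(200, 4))
z = rng.integers(0, 2, size=200)
y_rct = X_rct @ np.array([1.0, 0.5, -0.3, 0.2]) + 0.4 * z + rng.normal(size=200)

# Step 1: learn to predict outcomes from covariates, outside the RCT.
model = GradientBoostingRegressor().fit(X_obs, y_obs)

# Step 2: residualize RCT outcomes against the predictions, then
# difference the residual means across arms.
resid = y_rct - model.predict(X_rct)
effect = resid[z == 1].mean() - resid[z == 0].mean()
print(f"adjusted effect estimate: {effect:.2f}")
```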
-
The development and application of deep learning methodologies has grown within educational contexts in recent years. Perhaps attributable, in part, to the large amount of data made available through the adoption of computer-based learning systems in classrooms and larger-scale MOOC platforms, many educational researchers are leveraging a wide range of emerging deep learning approaches to study learning and student behavior in various capacities. Variations of recurrent neural networks, for example, have been used not only to predict learning outcomes but also to study sequential and temporal trends in student data; it is commonly believed that they are able to learn high-dimensional representations of learning and behavioral constructs over time, such as the evolution of a student's knowledge state while working through assigned content. Recent works, however, have started to dispute this belief, instead finding that it may be the model's complexity that leads to improved performance in many prediction tasks and that these methods may not inherently learn these temporal representations through model training. In this work, we explore these claims further in the context of detectors of student affect, as well as expanding on existing work that explored benchmarks in knowledge tracing. Specifically, we observe how well fully trained models perform compared to deep learning networks where training is applied only to the output layer. While the highest results of prior works utilizing trained recurrent models are found to be superior, our untrained versions perform comparably well, outperforming even previous non-deep-learning approaches.
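A hedged sketch of the comparison condition described above, in PyTorch: the recurrent layer keeps its random initialization and only the output layer is fit. Dimensions, data, and the binary affect label are illustrative assumptions, not the study's setup.

```python
# Sketch: an "untrained" recurrent model -- LSTM weights stay at their
# random initialization; only the output layer receives gradient updates.
import torch
import torch.nn as nn

torch.manual_seed(0)

lstm = nn.LSTM(input_size=8, hidden_size=32, batch_first=True)
for p in lstm.parameters():
    p.requires_grad = False  # freeze: the recurrent layer is never trained

head = nn.Linear(32, 1)  # the only trained component
opt = torch.optim.Adam(head.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

# Toy data: 64 students x 20 time steps x 8 interaction features,
# with a binary affect label per student.
X = torch.randn(64, 20, 8)
y = torch.randint(0, 2, (64, 1)).float()

for _ in range(100):
    with torch.no_grad():
        out, _ = lstm(X)          # fixed, random temporal encoding
    logits = head(out[:, -1, :])  # read out from the final hidden state
    loss = loss_fn(logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(float(loss))
```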
-
As computer-based learning platforms have become ubiquitous, there is a growing need to better support teachers. Particularly in mathematics, teachers often rely on open-ended questions to assess students' understanding. While prior works focusing on the development of automated open-ended work assessments have demonstrated their potential, many of those methods require large amounts of student data to make reliable estimates. We explore whether a problem-specific automated scoring model could benefit from auxiliary data collected from similar problems to address this “cold start” problem. We examine factors such as sample size and the magnitude of similarity of the utilized problem data. We find that the use of data from similar problems not only provides benefits by increasing sample size to improve predictive performance, but also leads to greater overall model performance than using data solely from the original problem when sample size is held constant.
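One plausible reading of this setup, sketched below: pool labeled responses from similar problems with the target problem's scarce data, down-weighting the auxiliary examples. The weighting scheme and data are assumptions, not the paper's method.

```python
# Sketch: addressing the scoring "cold start" by pooling labeled responses
# from similar problems with the target problem's scarce data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Scarce labeled responses for the target problem (1 = correct).
target_texts = ["add the two numerators", "multiply across"]
target_labels = [1, 0]

# Auxiliary labeled responses drawn from similar problems.
aux_texts = ["sum the numerators, keep the denominator", "cross multiply"]
aux_labels = [1, 0]

texts = target_texts + aux_texts
labels = target_labels + aux_labels
# Down-weight auxiliary examples relative to target-problem examples.
weights = [1.0] * len(target_texts) + [0.5] * len(aux_texts)

vec = TfidfVectorizer()
X = vec.fit_transform(texts)
clf = LogisticRegression().fit(X, labels, sample_weight=weights)
print(clf.predict(vec.transform(["add numerators and keep denominator"])))
```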
-
Prior works have led to the development and application of automated assessment methods that leverage machine learning and natural language processing. The performance of these methods has often been reported as positive, but other prior works have identified aspects in which they may be improved. Particularly in the context of mathematics, the presence of non-linguistic characters and expressions has been identified as contributing to observed model error. In this paper, we build upon this prior work by examining a developed automated assessment model for open-response questions in mathematics. We develop a new approach, which we call the “Math Term Frequency” (MTF) model, to address the issue caused by the presence of non-linguistic terms, and ensemble it with the previously developed assessment model. We observe that the inclusion of this approach notably improves model performance, and we present an example of how error analyses can be leveraged in practice to address model limitations.
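A speculative sketch of an MTF-style component and a simple probability-averaging ensemble; the math-token pattern, toy data, and averaging scheme are assumptions rather than the paper's implementation.

```python
# Sketch: featurize responses by the frequency of non-linguistic math
# tokens, then average the resulting model's class probabilities with a
# text-based assessment model. Data and token pattern are hypothetical.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression

responses = ["x + 2 = 5 so x = 3", "the answer is three", "x = 4 since x + 2 = 5"]
labels = [1, 1, 0]

# MTF-style model: count only non-linguistic tokens
# (numbers, operators, lone variable letters).
math_vec = CountVectorizer(token_pattern=r"\b[0-9]+\b|[+\-*/=^()]|\b[a-z]\b")
X_math = math_vec.fit_transform(responses)
mtf = LogisticRegression().fit(X_math, labels)

# Baseline text model over ordinary word features.
text_vec = TfidfVectorizer()
X_text = text_vec.fit_transform(responses)
text_clf = LogisticRegression().fit(X_text, labels)

def ensemble_proba(texts):
    """Average the two models' class probabilities."""
    p1 = mtf.predict_proba(math_vec.transform(texts))
    p2 = text_clf.predict_proba(text_vec.transform(texts))
    return (p1 + p2) / 2.0

print(ensemble_proba(["x = 3 because x + 2 = 5"]))
```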