Search for: All records

Award ID contains: 1931419


Precise unbiased estimation in randomized experiments using auxiliary observational data

https://doi.org/10.1515/jci-2022-0011

Gagnon-Bartsch, Johann A.; Sales, Adam C.; Wu, Edward; Botelho, Anthony F.; Erickson, John A.; Miratrix, Luke W.; Heffernan, Neil T. (January 2023, Journal of Causal Inference)

Randomized controlled trials (RCTs) admit unconfounded design-based inference – randomization largely justifies the assumptions underlying statistical effect estimates – but often have limited sample sizes. However, researchers may have access to big observational data on covariates and outcomes from RCT nonparticipants. For example, data from A/B tests conducted within an educational technology platform exist alongside historical observational data drawn from student logs. We outline a design-based approach to using such observational data for variance reduction in RCTs. First, we use the observational data to train a machine learning algorithm predicting potential outcomes using covariates and then use that algorithm to generate predictions for RCT participants. Then, we use those predictions, perhaps alongside other covariates, to adjust causal effect estimates with a flexible, design-based covariate-adjustment routine. In this way, there is no danger of biases from the observational data leaking into the experimental estimates, which are guaranteed to be exactly unbiased regardless of whether the machine learning models are “correct” in any sense or whether the observational samples closely resemble RCT samples. We demonstrate the method in analyzing 33 randomized A/B tests and show that it decreases standard errors relative to other estimators, sometimes substantially.
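The two-step idea in this abstract can be illustrated with a small simulation. This is a minimal sketch, not the authors' estimator: the abstract does not name its covariate-adjustment routine, so a simple residualized difference in means stands in for it. The data-generating model, the deliberately imperfect "external" predictions, and every function name below are assumptions made for illustration.

```python
import random
import statistics


def diff_in_means(outcomes, treated):
    """Unadjusted estimate: mean outcome of treated minus mean outcome of controls."""
    t = [y for y, z in zip(outcomes, treated) if z]
    c = [y for y, z in zip(outcomes, treated) if not z]
    return statistics.fmean(t) - statistics.fmean(c)


def residualized_diff_in_means(outcomes, treated, predictions):
    """Subtract externally trained predictions, then take the difference in means.

    The predictions are fixed before randomization, so they act like a
    pre-treatment covariate and cannot bias the comparison.
    """
    residuals = [y - p for y, p in zip(outcomes, predictions)]
    return diff_in_means(residuals, treated)


def simulate(n_units=100, n_draws=500, true_effect=1.0, seed=0):
    """Re-randomize one fixed sample many times and collect both estimates."""
    rng = random.Random(seed)
    covariate = [rng.gauss(0, 1) for _ in range(n_units)]
    # Hypothetical model trained on auxiliary data; wrong on purpose
    # (true slope is 2.0, predicted slope is 1.5).
    predictions = [1.5 * x for x in covariate]
    control_outcome = [2.0 * x + rng.gauss(0, 0.5) for x in covariate]
    assignment = [True] * (n_units // 2) + [False] * (n_units - n_units // 2)
    plain, adjusted = [], []
    for _ in range(n_draws):
        rng.shuffle(assignment)  # complete randomization, half treated
        observed = [y0 + true_effect * z
                    for y0, z in zip(control_outcome, assignment)]
        plain.append(diff_in_means(observed, assignment))
        adjusted.append(
            residualized_diff_in_means(observed, assignment, predictions))
    return plain, adjusted


plain, adjusted = simulate()
print("unadjusted: mean", statistics.fmean(plain),
      "sd", statistics.pstdev(plain))
print("adjusted:   mean", statistics.fmean(adjusted),
      "sd", statistics.pstdev(adjusted))
```

Both estimators center on the true effect across re-randomizations, while the adjusted one varies less, even though the prediction model is misspecified.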
Full Text Available
How to Open Science: Analyzing the Open Science Statement Compliance of the Learning @ Scale Conference

https://doi.org/10.1145/3573051.3596166

Haim, Aaron; Baxter, Chris; Gyurcsan, Robert; Shaw, Stacy T.; Heffernan, Neil T. (July 2023, L@S '23: Proceedings of the Tenth ACM Conference on Learning @ Scale)

There have been numerous efforts documenting the effects of open science in existing papers; however, these efforts typically only consider the author's analyses and supplemental materials from the papers. While understanding the current rate of open science adoption is important, it is also vital that we explore the factors that may encourage such adoption. One such factor may be publishing organizations setting open science requirements for submitted articles: encouraging researchers to adopt more rigorous reporting and research practices. For example, within the education technology discipline, the ACM Conference on Learning @ Scale (L@S) has been promoting open science practices since 2018 through a Call For Papers statement. The purpose of this study was to replicate previous papers within the proceedings of L@S and compare the degree of open science adoption and robust reproducibility practices to other conferences in education technology without a statement on open science. Specifically, we examined 93 papers and documented the open science practices used. We then attempted to reproduce the results with invitation from authors to bolster the chance of success. Finally, we compared the overall adoption rates to those from other conferences in education technology. Although the overall responses to the survey were low, our cursory review suggests that researchers at L@S might be more familiar with open science practices compared to the researchers who published in the International Conference on Artificial Intelligence in Education (AIED) and the International Conference on Educational Data Mining (EDM): 13 of 28 AIED and EDM responses were unfamiliar with preregistrations and 7 unfamiliar with preprints, while only 2 of 7 L@S responses were unfamiliar with preregistrations and 0 with preprints. The overall adoption of open science practices at L@S was much lower, with only 1% of papers providing open data, 5% providing open materials, and no papers having a preregistration. All openly accessible work can be found in an Open Science Framework project.
Full Text Available
How Common are Common Wrong Answers? Crowdsourcing Remediation at Scale

https://doi.org/10.1145/3573051.3593390

Gurung, Ashish; Baral, Sami; Lee, Morgan P.; Sales, Adam C.; Haim, Aaron; Vanacore, Kirk P.; McReynolds, Andrew A.; Kreisberg, Hilary; Heffernan, Cristina; Heffernan, Neil T. (July 2023, L@S '23: Proceedings of the Tenth ACM Conference on Learning @ Scale)

Solving mathematical problems is cognitively complex, involving strategy formulation, solution development, and the application of learned concepts. However, gaps in students' knowledge or weakly grasped concepts can lead to errors. Teachers play a crucial role in predicting and addressing these difficulties, which directly influence learning outcomes. However, preemptively identifying misconceptions leading to errors can be challenging. This study leverages historical data to assist teachers in recognizing common errors and addressing gaps in knowledge through feedback. We present a longitudinal analysis of incorrect answers from the 2015-2020 academic years on two curricula, Illustrative Math and EngageNY, for grades 6, 7, and 8. We find consistent errors across 5 years despite varying student and teacher populations. Based on these Common Wrong Answers (CWAs), we designed a crowdsourcing platform for teachers to provide Common Wrong Answer Feedback (CWAF). This paper reports on an in vivo randomized study testing the effectiveness of CWAFs in two scenarios: next-problem-correctness within-skill and next-problem-correctness within-assignment, regardless of the skill. We find that receiving CWAF leads to a significant increase in correctness for consecutive problems within-skill. However, the effect was not significant for all consecutive problems within-assignment, irrespective of the associated skill. This paper investigates the potential of scalable approaches in identifying Common Wrong Answers (CWAs) and how the use of crowdsourced CWAFs can enhance student learning through remediation.
Full Text Available
No Benefit for High-Dosage Time Management Interventions in Online Courses

https://doi.org/10.1145/3573051.3596176

Zhang, Jiayi; Baker, Ryan S.; Farmer, Thomas (July 2023, L@S '23: Proceedings of the Tenth ACM Conference on Learning @ Scale)

In past work, time management interventions involving prompts, alerts, and planning tools have successfully nudged students in online courses, leading to higher engagement and improved performance. However, few studies have investigated the effectiveness of these interventions over time, or examined whether that effectiveness persists or changes with dosage (i.e., how often an intervention is provided). In the current study, we conducted a randomized controlled trial to test if the effect of a time management intervention changes over repeated use. Students in an online computer science course were randomly assigned to receive interventions based on two schedules (i.e., high-dosage vs. low-dosage). We ran a two-way mixed ANOVA, comparing students' assignment start time and performance across several weeks. Unexpectedly, we did not find a significant main effect from the use of the intervention, nor was there an interaction effect between the use of the intervention and week of the course.
Full Text Available
Exploring Cross-Country Prediction Model Generalizability in MOOCs

https://doi.org/10.1145/3573051.3593380

Andres-Bray, Juan-Miguel; Hutt, Stephen; Baker, Ryan S. (July 2023, L@S '23: Proceedings of the Tenth ACM Conference on Learning @ Scale)

Massive Open Online Courses (MOOCs) have increased the accessibility of quality educational content to a broader audience across a global network. They provide access for students to material that would be difficult to obtain locally, and an abundance of data for educational researchers. Despite the international reach of MOOCs, however, the majority of MOOC research does not account for demographic differences relating to the learners' country of origin or cultural background, which have been shown to have implications on the robustness of predictive models and interventions. This paper presents an exploration into the role of nation-level metrics of culture, happiness, wealth, and size on the generalizability of completion prediction models across countries. The findings indicate that various dimensions of culture are predictive of cross-country model generalizability. Specifically, learners from indulgent, collectivist, uncertainty-accepting, or short-term-oriented countries produce more generalizable predictive models of learner completion.
Full Text Available
The Right To Be Forgotten and Educational Data Mining: Challenges and Paths Forward

Hutt, Stephen; Das, Sanchari; Baker, Ryan (July 2023, Proceedings of the 16th International Conference on Educational Data Mining)

The General Data Protection Regulation (GDPR) in the European Union contains directions on how user data may be collected, stored, and when it must be deleted. As similar legislation is developed around the globe, there is the potential for repercussions across multiple fields of research, including educational data mining (EDM). Over the past two decades, the EDM community has taken consistent steps to protect learner privacy within our research, whilst pursuing goals that will benefit their learning. However, recent privacy legislation may cause our practices to need to change. The right to be forgotten states that users have the right to request that all their data (including deidentified data generated by them) be removed. In this paper, we discuss the potential challenges of this legislation for EDM research, including impacts on Open Science practices, Data Modeling, and Data sharing. We also consider changes to EDM best practices that may aid compliance with this new legislation.
Full Text Available
How to Open Science: Developing and Testing Reproducibility Metrics on the Educational Data Mining Conference

Haim, A.; Gyurcsan, R.; Baxter, C.; Shaw, S.; Heffernan, N. (July 2023, Proceedings of the 16th International Conference on Educational Data Mining)

Despite increased efforts to assess the adoption rates of open science and robustness of reproducibility in sub-disciplines of education technology, there is a lack of understanding of why some research is not reproducible. Prior work has taken the first step toward assessing reproducibility of research, but has assumed certain constraints which hinder its discovery. Thus, the purpose of this study was to replicate previous work on papers within the proceedings of the International Conference on Educational Data Mining to accurately report on which papers are reproducible and why. Specifically, we examined 208 papers, attempted to reproduce them, documented reasons for reproducibility failures, and asked authors to provide additional information needed to reproduce their study. Our results showed that out of 12 papers that were potentially reproducible, only one successfully reproduced all analyses, and another two reproduced most of the analyses. The most common failure for reproducibility was failure to mention libraries needed, followed by non-seeded randomness.
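The two most common failures named in this abstract, unlisted libraries and non-seeded randomness, both have simple mitigations. The sketch below is illustrative only and does not come from the paper; `run_analysis` and `environment_report` are hypothetical helper names, and the package names passed in are placeholders.

```python
import random
import sys
from importlib import metadata


def run_analysis(seed):
    """Toy analysis whose only source of randomness is a seeded generator."""
    rng = random.Random(seed)  # a local generator avoids hidden global state
    sample = [rng.random() for _ in range(5)]
    return sum(sample) / len(sample)


def environment_report(packages):
    """Record the interpreter version and the installed version of each package."""
    report = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


# Same seed, same result: a rerun can be checked against the reported value.
print(run_analysis(seed=2023) == run_analysis(seed=2023))
# Placeholder package list; a real project would list its actual dependencies.
print(environment_report(["numpy", "pandas"]))
```

Publishing the seed and a version report like this alongside the analysis code addresses both failure modes directly.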
Full Text Available
Knowledge Tracing Over Time: A Longitudinal Analysis

Lee, Morgan; Croteau, Ethan; Gurung, Ashish; Botelho, Anthony; Heffernan, Neil (July 2023, Proceedings of the 16th International Conference on Educational Data Mining)

The use of Bayesian Knowledge Tracing (BKT) models in predicting student learning and mastery, especially in mathematics, is a well-established and proven approach in learning analytics. In this work, we report on our analysis examining the generalizability of BKT models across academic years attributed to "detector rot." We assess the generalizability of Knowledge Tracing (KT) models by comparing model performance in predicting student knowledge within the academic year and across academic years. Models were trained on data from two popular open-source curricula available through Open Educational Resources. We observed that the models generally were highly performant in predicting student learning within an academic year, whereas certain academic years were more generalizable than other academic years. We posit that the Knowledge Tracing models are relatively stable in terms of performance across academic years yet can still be susceptible to systemic changes and underlying learner behavior. As indicated by the evidence in this paper, we posit that learning platforms leveraging KT models need to be mindful of systemic changes or drastic changes in certain user demographics.
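For readers unfamiliar with BKT, the standard four-parameter update can be sketched as follows. This is the textbook formulation, not necessarily the exact variant or fitting procedure used in the paper; the parameter values in the example are arbitrary assumptions.

```python
def predict_correct(p_known, p_slip, p_guess):
    """Probability that the next answer is correct under standard BKT."""
    return p_known * (1 - p_slip) + (1 - p_known) * p_guess


def bkt_update(p_known, correct, p_slip, p_guess, p_transit):
    """One BKT step: Bayesian update on the observed answer, then learning.

    p_known   : prior probability that the skill is already mastered
    correct   : whether the observed answer was correct
    p_slip    : probability of answering wrong despite mastery
    p_guess   : probability of answering right without mastery
    p_transit : probability of learning the skill at this opportunity
    """
    if correct:
        evidence = p_known * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_known) * p_guess)
    else:
        evidence = p_known * p_slip
        posterior = evidence / (evidence + (1 - p_known) * (1 - p_guess))
    # Unmastered students may learn the skill after the opportunity.
    return posterior + (1 - posterior) * p_transit


# Arbitrary illustrative parameters and answer sequence.
p = 0.5
for answer in [True, True, False, True]:
    p = bkt_update(p, answer, p_slip=0.1, p_guess=0.2, p_transit=0.1)
    print(round(p, 3))
```

Checking generalizability across academic years, as the abstract describes, amounts to fitting these parameters on one year's data and scoring `predict_correct` against answers from another year.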
Full Text Available
Towards Generalizable Detection of Urgency of Discussion Forum Posts

Švábenský, Valdemar; Baker, Ryan; Zambrano, Andrés; Zou, Yishan; Slater, Stefan (July 2023, Proceedings of the 16th International Conference on Educational Data Mining)

Students who take an online course, such as a MOOC, use the course's discussion forum to ask questions or reach out to instructors when encountering an issue. However, reading and responding to students' questions is difficult to scale because of the time needed to consider each message. As a result, critical issues may be left unresolved, and students may lose the motivation to continue in the course. To help address this problem, we build predictive models that automatically determine the urgency of each forum post, so that these posts can be brought to instructors' attention. This paper goes beyond previous work by predicting not just a binary decision cut-off but a post's level of urgency on a 7-point scale. First, we train and cross-validate several models on an original data set of 3,503 posts from MOOCs at University of Pennsylvania. Second, to determine the generalizability of our models, we test their performance on a separate, previously published data set of 29,604 posts from MOOCs at Stanford University. While the previous work on post urgency used only one data set, we evaluated the prediction across different data sets and courses. The best-performing model was a support vector regressor trained on the Universal Sentence Encoder embeddings of the posts, achieving an RMSE of 1.1 on the training set and 1.4 on the test set. Understanding the urgency of forum posts enables instructors to focus their time more effectively and, as a result, better support student learning.
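The RMSE figures reported here measure the typical distance between predicted and labeled urgency on the 7-point scale. A minimal sketch of the metric follows; the labels and predictions are made-up values for illustration, not data from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between labels and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical urgency labels and model outputs for five posts.
labels = [1, 3, 7, 4, 2]
predictions = [1.5, 2.0, 5.5, 4.0, 3.0]
print(rmse(labels, predictions))
```

Because errors are squared before averaging, a few badly misjudged posts raise RMSE more than many small misses do.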
Full Text Available
Auto-scoring Student Responses with Images in Mathematics

Baral, Sami; Botelho, Anthony; Santhanam, Abhishek; Gurung, Ashish; Cheng, Li; Heffernan, Neil (July 2023, Proceedings of the 16th International Conference on Educational Data Mining)

Teachers often rely on a range of open-ended problems to assess students' understanding of mathematical concepts. Beyond traditional conceptions of student open-ended work, commonly in the form of textual short-answer or essay responses, figures, tables, number lines, graphs, and pictographs are other examples of open-ended work common in mathematics. While recent developments in areas of natural language processing and machine learning have led to automated methods to score student open-ended work, these methods have largely been limited to textual answers. Several computer-based learning systems allow students to take pictures of hand-written work and include such images within their answers to open-ended questions. However, there are few-to-no existing solutions that support the auto-scoring of student hand-written or drawn answers to questions. In this work, we build upon an existing method for auto-scoring textual student answers and explore the use of OpenAI/CLIP, a deep learning embedding method designed to represent both images and text, as well as Optical Character Recognition (OCR) to improve model performance. We evaluate the performance of our method on a dataset of student open-responses that contains both text- and image-based responses, and find a reduction of model error in the presence of images when controlling for other answer-level features.
Full Text Available