Search for: All records

Award ID contains: 1931523

  1. Despite increased efforts to assess the adoption rates of open science and robustness of reproducibility in sub-disciplines of education technology, there is a lack of understanding of why some research is not reproducible. Prior work has taken the first step toward assessing reproducibility of research, but has assumed certain constraints which hinder its discovery. Thus, the purpose of this study was to replicate previous work on papers within the proceedings of the International Conference on Educational Data Mining and develop metrics to accurately report on which papers are reproducible and why. Specifically, we examined 208 papers, attempted to reproduce them, documented reasons for reproducibility failures, and asked authors to provide additional information needed to reproduce their study. Our results showed that out of 12 papers that were potentially reproducible, only one successfully reproduced all analyses, and another two reproduced most of the analyses. The most common cause of reproducibility failure was not mentioning the libraries needed, followed by non-seeded randomness. All openly accessible work can be found in an Open Science Framework project.
    Free, publicly-accessible full text available July 1, 2024
  2. Many online learning platforms and MOOCs incorporate some amount of video-based content into their platform, but there are few randomized controlled experiments that evaluate the effectiveness of the different methods of video integration. Given the large amount of publicly available educational videos, an investigation into this content’s impact on students could help lead to more effective and accessible video integration within learning platforms. In this work, a new feature was added into an existing online learning platform that allowed students to request skill-related videos while completing their online middle-school mathematics assignments. A total of 18,535 students participated in two large-scale randomized controlled experiments related to providing students with publicly available educational videos. The first experiment investigated the effect of providing students with the opportunity to request these videos, and the second experiment investigated the effect of using a multi-armed bandit algorithm to recommend relevant videos. Additionally, this work investigated which features of the videos were significantly predictive of students’ performance and which features could be used to personalize students’ learning. Ultimately, students were mostly uninterested in the skill-related videos, preferring instead to use the platform’s existing problem-specific support, and there were no statistically significant findings in either experiment. Additionally, while no video features were significantly predictive of students’ performance, two video features had significant qualitative interactions with students’ prior knowledge, which showed that different content creators were more effective for different groups of students. These findings can be used to inform the design of future video-based features within online learning platforms and the creation of different educational videos specifically targeting higher or lower knowledge students.
    Free, publicly-accessible full text available July 1, 2024
  3. The process of synthesizing solutions for mathematical problems is cognitively complex. Students formulate and implement strategies to solve mathematical problems, develop solutions, and make connections between their learned concepts as they apply their reasoning skills to solve such problems. The gaps in student knowledge or shallowly-learned concepts may cause students to guess at answers or otherwise apply the wrong approach, resulting in errors in their solutions. Despite the complexity of the synthesis process in mathematics learning, teachers’ knowledge and ability to anticipate areas of potential difficulty is essential and correlated with student learning outcomes. Preemptively identifying the common misconceptions in students that result in subsequent incorrect attempts can be arduous and unreliable, even for experienced teachers. This paper aims to help teachers identify the subsequent incorrect attempts that commonly occur when students are working on math problems such that they can address the underlying gaps in knowledge and common misconceptions through feedback. We report on a longitudinal analysis of historical data, from a computer-based learning platform, exploring the incorrect answers in the prior school years (’15-’20) that establish the commonality of wrong answers on two Open Educational Resources (OER) curricula, Illustrative Math (IM) and EngageNY (ENY), for grades 6, 7, and 8. We observe that incorrect answers are pervasive across 5 academic years despite changes in underlying student and teacher population. Building on our findings regarding the Common Wrong Answers (CWAs), we report on goals and task analysis that we leveraged in designing and developing a crowdsourcing platform for teachers to write Common Wrong Answer Feedback (CWAF) aimed at remediating the underlying cause of the CWAs. Finally, we report on an in vivo study analyzing the effectiveness of CWAFs using two approaches: first, using next-problem correctness as a dependent measure after receiving CWAF in an intent-to-treat analysis; second, using next-attempt correctness as a dependent measure after receiving CWAF in a treated analysis. With the rise in popularity and usage of computer-based learning platforms, this paper explores the potential benefits of scalability in identifying CWAs and the subsequent usage of crowd-sourced CWAFs in enhancing the student learning experience through remediation.
    Free, publicly-accessible full text available July 1, 2024
  4. There have been numerous efforts documenting the effects of open science in existing papers; however, these efforts typically only consider the author’s analyses and supplemental materials from the papers. While understanding the current rate of open science adoption is important, it is also vital that we explore the factors that may encourage such adoption. One such factor may be publishing organizations setting open science requirements for submitted articles, encouraging researchers to adopt more rigorous reporting and research practices. For example, within the education technology discipline, the ACM Conference on Learning @ Scale (L@S) has been promoting open science practices since 2018 through a Call For Papers statement. The purpose of this study was to replicate previous papers within the proceedings of L@S and compare the degree of open science adoption and robust reproducibility practices to other conferences in education technology without a statement on open science. Specifically, we examined 93 papers and documented the open science practices used. We then attempted to reproduce the results with intervention from authors to bolster the chance of success. Finally, we compared the overall adoption rates to those from other conferences in education technology. Our cursory review suggests that researchers at L@S were more knowledgeable in open science practices, such as preregistration or preprints, compared to the researchers who published in the International Conference on Artificial Intelligence in Education and the International Conference on Educational Data Mining, as they were less likely to say they were unfamiliar with the practices. However, the overall adoption of open science practices was significantly lower, with only 1% of papers providing open data, 5% providing open materials, and no papers with a preregistration. We speculate that the low adoption rates may be due to 20% of the papers not using a dataset, to at-scale datasets and materials that could not be released without risking security issues or sensitive data leaks, or to data that were being used in ongoing research and were not considered complete enough for release by the authors. All openly accessible work can be found in an Open Science Framework project.
    Free, publicly-accessible full text available July 1, 2024
  5. This work proposes Dynamic Linear Epsilon-Greedy, a novel contextual multi-armed bandit algorithm that can adaptively assign personalized content to users while enabling unbiased statistical analysis. Traditional A/B testing and reinforcement learning approaches have trade-offs between empirical investigation and maximal impact on users. Our algorithm seeks to balance these objectives, allowing platforms to personalize content effectively while still gathering valuable data. Dynamic Linear Epsilon-Greedy was evaluated via simulation and an empirical study in the ASSISTments online learning platform. In simulation, Dynamic Linear Epsilon-Greedy performed comparably to existing algorithms and in ASSISTments, slightly increased students’ learning compared to A/B testing. Data collected from its recommendations allowed for the identification of qualitative interactions, which showed high and low knowledge students benefited from different content. Dynamic Linear Epsilon-Greedy holds promise as a method to balance personalization with unbiased statistical analysis. All the data collected during the simulation and empirical study are publicly available at https://osf.io/zuwf7/.
    Free, publicly-accessible full text available June 1, 2024
  6. There is a growing need to empirically evaluate the quality of online instructional interventions at scale. In response, some online learning platforms have begun to implement rapid A/B testing of instructional interventions. In these scenarios, students participate in a series of randomized experiments that evaluate problem-level interventions in quick succession, which makes it difficult to discern the effect of any particular intervention on their learning. Therefore, distal measures of learning such as posttests may not provide a clear understanding of which interventions are effective, which can lead to slow adoption of new instructional methods. To help discern the effectiveness of instructional interventions, this work uses data from 26,060 clickstream sequences of students across 31 different online educational experiments exploring 51 different research questions and the students’ posttest scores to create and analyze different proximal surrogate measures of learning that can be used at the problem level. Through feature engineering and deep learning approaches, next-problem correctness was determined to be the best surrogate measure. As more data from online educational experiments are collected, model-based surrogate measures can be improved, but for now, next-problem correctness is an empirically effective proximal surrogate measure of learning for analyzing rapid problem-level experiments. The data and code
    Free, publicly-accessible full text available June 1, 2024
  7. Randomized A/B tests within online learning platforms represent an exciting direction in learning sciences. With minimal assumptions, they allow causal effect estimation without confounding bias and exact statistical inference even in small samples. However, often experimental samples and/or treatment effects are small, A/B tests are under-powered, and effect estimates are overly imprecise. Recent methodological advances have shown that power and statistical precision can be substantially boosted by coupling design-based causal estimation to machine-learning models of rich log data from historical users who were not in the experiment. Estimates using these techniques remain unbiased and inference remains exact without any additional assumptions. This paper reviews those methods and applies them to a new dataset including over 250 randomized A/B comparisons conducted within ASSISTments, an online learning platform. We compare results across experiments using four novel deep-learning models of auxiliary data, and show that incorporating auxiliary data into causal estimates is roughly equivalent to increasing the sample size by 20% on average, or as much as 50-80% in some cases, relative to t-tests, and by about 10% on average, or as much as 30-50%, compared to cutting-edge machine learning unbiased estimates that use only data from the experiments. We show the gains can be even larger for estimating subgroup effects, that they hold even when the remnant is unrepresentative of the A/B test sample, and extend to post-stratification population effects estimators.
    Free, publicly-accessible full text available June 1, 2024
  8. Randomized controlled trials (RCTs) are increasingly prevalent in education research, and are often regarded as a gold standard of causal inference. Two main virtues of randomized experiments are that they (1) do not suffer from confounding, thereby allowing for an unbiased estimate of an intervention's causal impact, and (2) allow for design-based inference, meaning that the physical act of randomization largely justifies the statistical assumptions made. However, RCT sample sizes are often small, leading to low precision; in many cases RCT estimates may be too imprecise to guide policy or inform science. Observational studies, by contrast, have strengths and weaknesses complementary to those of RCTs. Observational studies typically offer much larger sample sizes, but may suffer from confounding. In many contexts, experimental and observational data exist side by side, allowing the possibility of integrating "big observational data" with "small but high-quality experimental data" to get the best of both. Such approaches hold particular promise in the field of education, where RCT sample sizes are often small due to cost constraints, but automatic collection of observational data, such as in computerized educational technology applications, or in state longitudinal data systems (SLDS) with administrative data on hundreds of thousands of students, has made rich, high-dimensional observational data widely available. We outline an approach that allows one to employ machine learning algorithms to learn from the observational data, and use the resulting models to improve precision in randomized experiments. Importantly, there is no requirement that the machine learning models are "correct" in any sense, and the final experimental results are guaranteed to be exactly unbiased. Thus, there is no danger of confounding biases in the observational data leaking into the experiment.
    Free, publicly-accessible full text available May 1, 2024
  9. Within the field of education technology, learning analytics has increased in popularity over the past decade. Researchers conduct experiments and develop software, building on each other’s work to create more intricate systems. In parallel, open science, which describes a set of practices to make research more open, transparent, and reproducible, has exploded in recent years, resulting in more open data, code, and materials for researchers to use. However, without prior knowledge of open science, many researchers do not make their datasets, code, and materials openly available, and those that are available are often difficult, if not impossible, to reproduce. The purpose of the current study was to take a close look at our field by examining previous papers within the proceedings of the International Conference on Learning Analytics and Knowledge, and document the rate of open science adoption (e.g., preregistration, open data), as well as how well available data and code could be reproduced. Specifically, we examined 133 research papers, allowing ourselves 15 minutes for each paper to identify open science practices and attempt to reproduce the results according to their provided specifications. Our results showed that less than half of the research adopted standard open science principles, with approximately 5% fully meeting some of the defined principles. Further, we were unable to reproduce any of the papers successfully in the given time period. We conclude by providing recommendations on how to improve the reproducibility of our research as a field moving forward. All openly accessible work can be found in an Open Science Framework project.
  10. As evidence grows supporting the importance of non-cognitive factors in learning, computer-assisted learning platforms increasingly incorporate non-academic interventions to influence student learning and learning-related behaviors. Non-cognitive interventions often attempt to influence students’ mindset, motivation, or metacognitive reflection to impact learning behaviors and outcomes. In the current paper, we analyze data from five experiments, involving seven treatment conditions embedded in mastery-based learning activities hosted on a computer-assisted learning platform focused on middle school mathematics. Each treatment condition embodied a specific non-cognitive theoretical perspective. Over seven school years, 20,472 students participated in the experiments. We estimated the effects of each treatment condition on students’ response time, hint usage, likelihood of mastering knowledge components, learning efficiency, and post-test performance. Our analyses reveal a mix of both positive and negative treatment effects on student learning behaviors and performance. Few interventions impacted learning as assessed by the post-tests. These findings highlight the difficulty in positively influencing student learning behaviors and outcomes using non-cognitive interventions.
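
Item 1 above names unlisted libraries and non-seeded randomness as the most common reasons a paper could not be reproduced. The sketch below shows one way an analysis script might guard against both, using only the Python standard library. The function names, the default seed, and the package list are illustrative assumptions, not taken from any of the reviewed papers.

```python
import random
import sys
from importlib import metadata


def seeded_sample(population, k, seed=2024):
    """Draw k items using a private, seeded generator so reruns give the same draw."""
    rng = random.Random(seed)  # the seed value is an arbitrary assumption
    return rng.sample(population, k)


def environment_report(packages):
    """Return the Python version and the installed version of each named package."""
    report = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report
```

Saving the output of `environment_report([...])` alongside the results, and passing an explicit seed to every random step, gives a later reader the two pieces of information that item 1 found missing most often.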
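
Items 2 and 5 above both rely on multi-armed bandit algorithms to recommend content. For readers unfamiliar with the idea, here is a minimal sketch of plain (non-contextual) epsilon-greedy. It is not the Dynamic Linear Epsilon-Greedy algorithm of item 5, whose details are not given in the abstract; the class name, the default `epsilon`, and the seed are assumptions made for illustration.

```python
import random


class EpsilonGreedy:
    """Plain epsilon-greedy bandit: explore at random with probability epsilon,
    otherwise pick the arm with the highest running mean reward."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # times each arm was played
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        """Return the index of the arm to play next."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))  # explore
        # exploit: arm with the best estimated reward (ties go to the lowest index)
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        """Fold one observed reward into the arm's running mean."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

In a video-recommendation setting, each arm would correspond to one candidate video and the reward to some measure of the student's subsequent performance; both mappings are assumptions here rather than details from the abstracts.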
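
Items 3 and 6 above use next-problem correctness as an outcome measure. The following sketch shows one way it could be computed from a time-ordered response log and then averaged by experimental condition. The log layout `(student_id, problem_id, correct)` and both function names are hypothetical and do not describe the platform's actual data format.

```python
from collections import defaultdict


def next_problem_correctness(log, experiment_problem):
    """Map each student to their correctness (0/1) on the problem attempted
    immediately after `experiment_problem`.

    `log` is a time-ordered list of (student_id, problem_id, correct) tuples.
    Students with no following problem are left out.
    """
    by_student = defaultdict(list)
    for student, problem, correct in log:
        by_student[student].append((problem, correct))

    outcome = {}
    for student, rows in by_student.items():
        for i, (problem, _) in enumerate(rows[:-1]):
            if problem == experiment_problem:
                outcome[student] = rows[i + 1][1]
                break
    return outcome


def mean_by_condition(outcome, condition):
    """Average the surrogate outcome within each experimental condition."""
    totals = defaultdict(lambda: [0, 0])  # condition -> [sum, count]
    for student, y in outcome.items():
        totals[condition[student]][0] += y
        totals[condition[student]][1] += 1
    return {c: s / n for c, (s, n) in totals.items()}
```

Because the measure is taken on the very next problem, it is available for every problem-level experiment without waiting for a posttest, which is the property item 6 highlights.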
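
Items 7 and 8 above describe using a model trained on non-experimental data to sharpen estimates from a randomized experiment. A simplified stand-in for that idea is sketched below: subtract the model's prediction from each observed outcome, then take the difference in mean residuals between treatment and control. This is not the estimator from either paper. It assumes the predictions come from a model fitted only on data outside the experiment and only on pre-treatment covariates, and all names are hypothetical.

```python
from statistics import mean


def difference_in_means(y, z):
    """Unadjusted estimate: mean outcome of treated (z == 1) minus control (z == 0)."""
    treated = [yi for yi, zi in zip(y, z) if zi == 1]
    control = [yi for yi, zi in zip(y, z) if zi == 0]
    return mean(treated) - mean(control)


def residual_adjusted_estimate(y, z, predictions):
    """Difference in means of the residuals y - prediction.

    `predictions` must come from a model trained on data outside the
    experiment, so they do not depend on the treatment assignment.
    """
    residuals = [yi - pi for yi, pi in zip(y, predictions)]
    return difference_in_means(residuals, z)
```

Because the predictions are fixed before randomization, subtracting them does not bias the comparison even when the model is inaccurate; a model that predicts well simply leaves less residual variation, which is where the precision gain described in items 7 and 8 comes from.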