skip to main content


Search for: All records

Creators/Authors contains: "Heffernan, Neil T"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Benjamin, Paaßen ; Carrie, Demmans Epp (Ed.)
    The educational data mining community has extensively investigated affect detection in learning platforms, finding associations between affective states and a wide range of learning outcomes. Based on these insights, several studies have used affect detectors to create interventions tailored to respond to when students are bored, confused, or frustrated. However, these detector-based interventions have depended on detecting affect when it occurs and therefore inherently respond to affective states after they have begun. This might not always be soon enough to avoid a negative experience for the student. In this paper, we aim to predict students' affective states in advance. Within our approach, we attempt to determine the maximum prediction window where detector performance remains sufficiently high, documenting the decay in performance when this prediction horizon is increased. Our results indicate that it is possible to predict confusion, frustration, and boredom in advance with performance over chance for prediction horizons of 120, 40, and 50 seconds, respectively. These findings open the door to designing more timely interventions. 
    more » « less
    Free, publicly-accessible full text available July 12, 2025
  2. Free, publicly-accessible full text available March 18, 2025
  3. Free, publicly-accessible full text available March 18, 2025
  4. Gaming the system is a persistent problem in Computer-Based Learning Platforms. While substantialprogress has been made in identifying and understanding such behaviors, effective interventions remainscarce. This study uses a method of causal moderation known as Fully Latent Principal Stratification toexplore the impact of two types of interventions – gamification and manipulation of assistance access –on the learning outcomes of students who tend to game the system. The results indicate that gamificationdoes not consistently mitigate these negative behaviors. One gamified condition had a consistentlypositive effect on learning regardless of students’ propensity to game the system, whereas the other had anegative effect on gamers. However, delaying access to hints and feedback may have a positive effect onthe learning outcomes of those gaming the system. This paper also illustrates the potential for integratingdetection and causal methodologies within educational data mining to evaluate effective responses to detectedbehaviors. 
    more » « less
  5. There have been numerous efforts documenting the effects of open science in existing papers; however, these efforts typically only consider the author's analyses and supplemental materials from the papers. While understanding the current rate of open science adoption is important, it is also vital that we explore the factors that may encourage such adoption. One such factor may be publishing organizations setting open science requirements for submitted articles: encouraging researchers to adopt more rigorous reporting and research practices. For example, within the education technology discipline, theACM Conference on Learning @ Scale (L@S) has been promoting open science practices since 2018 through a Call For Papers statement. The purpose of this study was to replicate previous papers within the proceedings of L@S and compare the degree of open science adoption and robust reproducibility practices to other conferences in education technology without a statement on open science. Specifically, we examined 93 papers and documented the open science practices used. We then attempted to reproduce the results with invitation from authors to bolster the chance of success. Finally, we compared the overall adoption rates to those from other conferences in education technology. Although the overall responses to the survey were low, our cursory review suggests that researchers at L@S might be more familiar with open science practices compared to the researchers who published in the International Conference on Artificial Intelligence in Education (AIED) and the International Conference on Educational Data Mining (EDM): 13 of 28 AIED and EDM responses were unfamiliar with preregistrations and 7 unfamiliar with preprints, while only 2 of 7 L@S responses were unfamiliar with preregistrations and 0 with preprints. The overall adoption of open science practices at L@S was much lower with only 1% of papers providing open data, 5% providing open materials, and no papers had a preregistration. All openly accessible work can be found in an Open Science Framework project. 
    more » « less
  6. Solving mathematical problems is cognitively complex, involving strategy formulation, solution development, and the application of learned concepts. However, gaps in students' knowledge or weakly grasped concepts can lead to errors. Teachers play a crucial role in predicting and addressing these difficulties, which directly influence learning outcomes. However, preemptively identifying misconceptions leading to errors can be challenging. This study leverages historical data to assist teachers in recognizing common errors and addressing gaps in knowledge through feedback. We present a longitudinal analysis of incorrect answers from the 2015-2020 academic years on two curricula, Illustrative Math and EngageNY, for grades 6, 7, and 8. We find consistent errors across 5 years despite varying student and teacher populations. Based on these Common Wrong Answers (CWAs), we designed a crowdsourcing platform for teachers to provide Common Wrong Answer Feedback (CWAF). This paper reports on an in vivo randomized study testing the effectiveness of CWAFs in two scenarios: next-problem-correctness within-skill and next-problem-correctness within-assignment, regardless of the skill. We find that receiving CWAF leads to a significant increase in correctness for consecutive problems within-skill. However, the effect was not significant for all consecutive problems within-assignment, irrespective of the associated skill. This paper investigates the potential of scalable approaches in identifying Common Wrong Answers (CWAs) and how the use of crowdsourced CWAFs can enhance student learning through remediation. 
    more » « less
  7. Abstract

    Randomized controlled trials (RCTs) admit unconfounded design-based inference – randomization largely justifies the assumptions underlying statistical effect estimates – but often have limited sample sizes. However, researchers may have access to big observational data on covariates and outcomes from RCT nonparticipants. For example, data from A/B tests conducted within an educational technology platform exist alongside historical observational data drawn from student logs. We outline a design-based approach to using such observational data for variance reduction in RCTs. First, we use the observational data to train a machine learning algorithm predicting potential outcomes using covariates and then use that algorithm to generate predictions for RCT participants. Then, we use those predictions, perhaps alongside other covariates, to adjust causal effect estimates with a flexible, design-based covariate-adjustment routine. In this way, there is no danger of biases from the observational data leaking into the experimental estimates, which are guaranteed to be exactly unbiased regardless of whether the machine learning models are “correct” in any sense or whether the observational samples closely resemble RCT samples. We demonstrate the method in analyzing 33 randomized A/B tests and show that it decreases standard errors relative to other estimators, sometimes substantially.

     
    more » « less
  8. The development and application of deep learning method- ologies has grown within educational contexts in recent years. Perhaps attributable, in part, to the large amount of data that is made avail- able through the adoption of computer-based learning systems in class- rooms and larger-scale MOOC platforms, many educational researchers are leveraging a wide range of emerging deep learning approaches to study learning and student behavior in various capacities. Variations of recurrent neural networks, for example, have been used to not only pre- dict learning outcomes but also to study sequential and temporal trends in student data; it is commonly believed that they are able to learn high- dimensional representations of learning and behavioral constructs over time, such as the evolution of a students’ knowledge state while working through assigned content. Recent works, however, have started to dis- pute this belief, instead finding that it may be the model’s complexity that leads to improved performance in many prediction tasks and that these methods may not inherently learn these temporal representations through model training. In this work, we explore these claims further in the context of detectors of student affect as well as expanding on exist- ing work that explored benchmarks in knowledge tracing. Specifically, we observe how well trained models perform compared to deep learning networks where training is applied only to the output layer. While the highest results of prior works utilizing trained recurrent models are found to be superior, the application of our untrained-versions perform compa- rably well, outperforming even previous non-deep learning approaches. 
    more » « less
  9. As computer-based learning platforms have become ubiq- uitous, there is a growing need to better support teachers. Particularly in mathematics, teachers often rely on open- ended questions to assess students’ understanding. While prior works focusing on the development of automated open- ended work assessments have demonstrated their potential, many of those methods require large amounts of student data to make reliable estimates. We explore whether a prob- lem specific automated scoring model could benefit from auxiliary data collected from similar problems to address this “cold start” problem. We examine factors such as sam- ple size and the magnitude of similarity of utilized problem data. We find the use of data from similar problems not only provides benefits to improve predictive performance by in- creasing sample size, but also leads to greater overall model performance than using data solely from the original prob- lem when sample size is held constant. 
    more » « less
  10. Prior works have led to the development and application of automated assessment methods that leverage machine learning and nat- ural language processing. The performance of these methods have often been reported as being positive, but other prior works have identified aspects on which they may be improved. Particularly in the context of mathematics, the presence of non-linguistic characters and expressions have been identified to contribute to observed model error. In this paper, we build upon this prior work by observing a developed automated as- sessment model for open-response questions in mathematics. We develop a new approach which we call the “Math Term Frequency” (MTF) model to address this issue caused by the presence of non-linguistic terms and ensemble it with the previously-developed assessment model. We observe that the inclusion of this approach notably improves model performance, and present an example of practice of how error analyses can be leveraged to address model limitations. 
    more » « less