Title: A comparison of different machine learning algorithms for predicting student performance in an online interactive mathematics game.
This paper demonstrated how to apply Machine Learning (ML) techniques to analyze student interaction data collected in an online mathematics game. Using a data-driven approach, we examined 1) how the choice of ML algorithm influenced the precision with which the performance (i.e., posttest math knowledge scores) of middle-school students (N = 359) was predicted and 2) what types of in-game features (i.e., student in-game behaviors, math anxiety, mathematical strategies) were associated with student math knowledge scores. The results indicated that, among the seven algorithms employed, the Random Forest algorithm performed best (in terms of model accuracy and error measures) in predicting posttest math knowledge scores. Out of 37 features included in the model, the validity of the students’ first mathematical transformation was the most predictive of their posttest math knowledge scores. Implications for game learning analytics and supporting students’ algebraic learning are discussed based on the findings.
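As a rough illustration of what comparing algorithms by error measures can look like, the sketch below ranks candidate models by root mean squared error (RMSE) and mean absolute error (MAE) on a set of posttest scores. The scores, the model names, and the predictions are made-up placeholders, not data or results from the paper, and the paper's own pipeline and metrics may differ.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted scores."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rank_models(y_true, predictions):
    """Return (name, rmse, mae) tuples sorted by RMSE, lowest error first."""
    scored = [
        (name, rmse(y_true, pred), mae(y_true, pred))
        for name, pred in predictions.items()
    ]
    return sorted(scored, key=lambda row: row[1])


# Hypothetical held-out posttest scores and predictions from two unnamed models.
posttest = [6.0, 8.0, 5.0, 9.0]
predictions = {
    "model_a": [5.0, 8.0, 6.0, 9.0],
    "model_b": [6.0, 7.5, 5.0, 9.5],
}

ranking = rank_models(posttest, predictions)
for name, model_rmse, model_mae in ranking:
    print(f"{name}: RMSE={model_rmse:.3f}, MAE={model_mae:.3f}")
```

In practice the predictions would come from models evaluated on data they were not trained on (for example, through cross-validation), so that the ranking reflects generalization rather than fit.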
Award ID(s):
2142984
PAR ID:
10415168
Journal Name:
Interactive learning environments
ISSN:
1049-4820
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Hands-on experiments using the Low-Cost Desktop Learning Modules (LCDLMs) have been implemented in dozens of classrooms to supplement student learning of heat transfer and fluid mechanics concepts with students of varying prior knowledge. The prior knowledge of students who encounter these LCDLMs in the classroom may impact the degree to which students learn from these interactive pedagogies. This paper reports on the differences in student cognitive learning between groups with low and high prior knowledge of the concepts that are tested. Student conceptual test results for venturi, hydraulic loss, and double pipe heat exchanger LCDLMs are analyzed by grouping the student data into two bins based on pre-test score, one for students scoring below 50% and another for those scoring above, and comparing the improvement from pretest to posttest between the two groups. The analysis includes data from all implementations of each LCDLM for the 2020-2021 school year. Results from each of the three LCDLMs were analyzed separately to compare student performance on different fluid mechanics or heat exchanger concepts. Then, the overall pre- and posttest scores for all three LCDLMs were analyzed to examine how this interactive pedagogy impacts cognitive gains. Results showed statistically significant differences in improvement between low prior knowledge groups and high prior knowledge groups. Additional statistically significant results suggested that the gaps in pre-test performance between low prior knowledge and high prior knowledge groups had narrowed on the posttest. Findings showed that students with lower prior knowledge showed a greater overall improvement in cognitive gains than those with higher prior knowledge on all three low-cost desktop learning modules.
  3. Despite theoretical benefits of replayability in educational games, empirical studies have found mixed evidence about the effects of replaying a previously passed game (i.e., elective replay) on students’ learning. Particularly, we know little about behavioral features of students’ elective replay process after experiencing failures (i.e., interruptive elective replay) and the relationships between these features and learning outcomes. In this study, we analyzed 5th graders’ log data from an educational game, ST Math, when they studied fractions—one of the most important but challenging math topics. We systematically constructed interruptive elective replay features by following students’ sequential behaviors after failing a game and investigated the relationships between these features and students’ post-test performance, after taking into account pretest performance and in-game performance. Descriptive statistics of the features we constructed revealed individual differences in the elective replay process after failures in terms of when to start replaying, what to replay, and how to replay. Moreover, a Bayesian multi-model linear regression showed that interruptive elective replay after failures might be beneficial for students if they chose to replay previously passed games when failing at a higher, more difficult level in the current game and if they passed the replayed games. 
  4. Background: Game‐based learning can frame problem‐solving as a sense‐making experience with domain‐specific tasks for school students. However, multiple challenges arise when trying to support learners in such a complex, problem‐oriented learning environment.
    Objectives and Methods: With an architecture‐themed mathematics learning game, we conducted two mixed‐method studies to explore the impact and design of game‐based mathematical experience on the math problem‐solving performance of middle school students.
    Results and Conclusions: The study findings suggested a positive impact of game‐based math experience on math problem‐solving for middle school students. Problematization‐oriented game‐based math tasks with structuring features enhanced students' reasoning with problems and channelled it to doing mathematics.
    Takeaways: The current research findings support the initiative to frame learning as a sense‐making experience with domain‐specific tasks and inform the design of game‐based mathematical experience and learning support.
  5. Recent student knowledge modeling algorithms such as Deep Knowledge Tracing (DKT) and Dynamic Key-Value Memory Networks (DKVMN) have been shown to produce accurate predictions of problem correctness within the same learning system. However, these algorithms do not attempt to directly infer student knowledge. In this paper we present an extension to these algorithms to also infer knowledge. We apply this extension to DKT and DKVMN, resulting in knowledge estimates that correlate better with a posttest than knowledge estimates from Bayesian Knowledge Tracing (BKT), an algorithm designed to infer knowledge, and another classic algorithm, Performance Factors Analysis (PFA). We also apply our extension to correctness predictions from BKT and PFA, finding that knowledge estimates produced with it correlate better with the posttest than BKT and PFA’s standard knowledge estimates. These findings are significant since the primary aim of education is to prepare students for later experiences outside of the immediate learning activity.
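For readers unfamiliar with Bayesian Knowledge Tracing, which the last abstract uses as a comparison point, the following is a minimal sketch of the standard BKT update for a single skill. The function names and all parameter values are illustrative assumptions, not taken from the paper, and the sketch does not implement the extension the paper proposes.

```python
def bkt_update(p_known, correct, p_slip, p_guess, p_transit):
    """One standard BKT step: condition on the response, then apply learning."""
    if correct:
        # A correct answer comes from knowing without slipping, or from guessing.
        numerator = p_known * (1 - p_slip)
        denominator = numerator + (1 - p_known) * p_guess
    else:
        # An incorrect answer comes from slipping, or from not knowing and not guessing.
        numerator = p_known * p_slip
        denominator = numerator + (1 - p_known) * (1 - p_guess)
    posterior = numerator / denominator
    # The student may also learn the skill during this practice opportunity.
    return posterior + (1 - posterior) * p_transit


def trace(responses, p_init=0.3, p_slip=0.1, p_guess=0.2, p_transit=0.15):
    """Return the knowledge estimate after each response (hypothetical parameters)."""
    p_known = p_init
    estimates = []
    for correct in responses:
        p_known = bkt_update(p_known, correct, p_slip, p_guess, p_transit)
        estimates.append(p_known)
    return estimates


# Hypothetical response sequence: True means the problem was answered correctly.
estimates = trace([True, False, True, True])
print([round(p, 3) for p in estimates])
```

The final value in `estimates` is the kind of per-skill knowledge estimate that could be correlated with a posttest score, which is the comparison the abstract describes.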