
Search for: All records

Award ID contains: 1934745


  1. We report work-in-progress that aims to better understand prediction performance differences between Deep Knowledge Tracing (DKT) and Bayesian Knowledge Tracing (BKT) as well as “gaming the system” behavior by considering variation in features and design across individual pieces of instructional content. Our “non-monolithic” analysis considers hundreds of “workspaces” in Carnegie Learning’s MATHia intelligent tutoring system and the extent to which two relatively simple features extracted from MATHia logs, potentially related to gaming the system behavior, are correlated with differences in DKT and BKT prediction performance. We then take a closer look at a set of six MATHia workspaces, three of which represent content in which DKT out-performs BKT and three of which represent content in which BKT out-performs DKT or there is little difference in performance between the approaches. We present some preliminary findings related to the extent to which students game the system in these workspaces, across two school years, as well as other facets of variability across these pieces of instructional content. We conclude with a road map for scaling these analyses over much larger sets of MATHia workspaces and learner data.
    Free, publicly-accessible full text available July 1, 2023
  2. Multi-angle question answering models have recently been proposed that promise to perform related tasks like question generation. However, performance on related tasks has not been thoroughly studied. We investigate a leading model called Macaw on the task of multiple choice question generation and evaluate its performance on three angles that systematically reduce the complexity of the task. Our results indicate that despite the promise of generalization, Macaw performs poorly on untrained angles. Even on a trained angle, Macaw fails to generate four distinct multiple-choice options on 17% of inputs. We propose augmenting multiple choice options by paraphrasing angle input and show this increases overall success to 97.5%. A human evaluation comparing the augmented multiple-choice questions with textbook questions on the same topic reveals that Macaw questions broadly score highly but below human questions.
    Free, publicly-accessible full text available July 1, 2023
  3. In intelligent tutoring systems (ITS), abundant supportive messages are provided to learners. One implicit assumption behind this design is that learners would actively process and benefit from feedback messages when interacting with ITS individually. However, this is not true for all learners; some gain little after numerous practice opportunities. In the current research, we assume that if the learner invests enough cognitive effort to review feedback messages provided by the system, the learner’s performance should improve as practice opportunities accumulate. We expect that the learner’s cognitive effort investment could be reflected to some extent by the response latency, in which case the learner’s improvement should also be correlated with the response latency. Therefore, based on this core hypothesis, we conduct a cluster analysis by exploring features relevant to learners’ response latency. We expect to find several features that could be used as indicators of the feedback usage of learners; consequently, these features may be used to predict learners’ learning gain in future research. Our results suggest that learners’ prior knowledge level plays a role when interacting with ITS and in their patterns of response latency. Learners with higher prior knowledge levels tend to interact flexibly with the system and use feedback messages more effectively. The quality of their previous attempts influences their response latency. However, learners with lower prior knowledge show two opposite patterns: some tend to respond more quickly, and some tend to respond more slowly. One common characteristic of these learners is that their incorrect response latency is not influenced by the quality of their previous performance. One interesting result is that the quick responders forget faster. Thus, we concluded that learners with lower prior knowledge would do better not to react hastily, so as to obtain a more durable memory.
    Free, publicly-accessible full text available July 1, 2023
  4. Understanding how students with varying capabilities think about problem solving can greatly help in improving personalized education which can have significantly better learning outcomes. Here, we present the details of a system we call NeTra that we developed for discovering strategies that students follow in the context of Math learning. Specifically, we developed this system from large-scale data from MATHia that contains millions of student-tutor interactions. The goal of this system is to provide a visual interface for educators to understand the likely strategy the student will follow for problems that students are yet to attempt. This predictive interface can help educators/tutors to develop interventions that are personalized for students. Underlying the system is a powerful AI model based on Neuro-Symbolic learning that has shown promising results in predicting both strategies and the mastery over concepts used in the strategy.
    Free, publicly-accessible full text available July 1, 2023
  5. A longstanding goal of learner modeling and educational data mining is to improve the domain model of knowledge that is used to make inferences about learning and performance. In this report we present a tool for finding domain models that is built into an existing modeling framework, logistic knowledge tracing (LKT). LKT allows the flexible specification of learner models in logistic regression by allowing the modeler to select whatever features of the data are relevant to prediction. Each of these features (such as the count of prior opportunities) is a function computed for a component of data (such as a student or knowledge component). In this context, we have developed the “autoKC” component, which clusters knowledge components and allows the modeler to compute features for the clustered components. For an autoKC, the input component (initial KC or item assignment) is clustered prior to computing the feature and the feature is a function of that cluster. Another recent new function for LKT, which allows us to specify interactions between the logistic regression predictor terms, is combined with autoKC for this report. Interactions allow us to move beyond just assuming the cluster information has additive effects to allow us to model situations where a second factor of the data moderates a first factor.
    Free, publicly-accessible full text available July 1, 2023
  6. Using archived student data for middle and high school students’ mathematics-focused intelligent tutoring system (ITS) learning collected across a school year, this study explores situational, achievement-goal latent profile membership and the stability of these profiles with respect to student demographics and dispositional achievement goal scores. Over 65% of students changed situational profile membership at some time during the school year. Start-of-year dispositional motivation scores were not related to whether students remained in the same profile across all unit-level measurements. Grade level was predictive of profile stability. Findings from the present study should shed light on how in-the-moment student motivation fluctuates while students are engaged in ITS math learning. Present findings have potential to inform motivation interventions designed for ITS math learning.
    Free, publicly-accessible full text available July 1, 2023
  7. This paper provides an update of the Learner Data Institute (LDI; www.learnerdatainstitute.org) which is now in its third year since conceptualization. Funded as a conceptualization project, the LDI’s first two years had two major goals: (1) develop, implement, evaluate, and refine a framework for data-intensive science and engineering and (2) use the framework to start developing prototype solutions, based on data, data science, and science convergence, to a number of core challenges in learning science and engineering. One major focus in the third, current year is synthesizing efforts from the first two years to identify new opportunities for future research by various mutual interest groups within LDI, which have focused on developing a particular prototype solution to one or more related core challenges in learning science and engineering. In addition to highlighting emerging data-intensive solutions and innovations from the LDI’s first two years, including places where LDI researchers have received additional funding for future research, we highlight here various core challenges our team has identified as being at a “tipping point.” Tipping point challenges are those for which timely investment in data-intensive approaches has the maximum potential for a transformative effect.
    Free, publicly-accessible full text available July 1, 2023
  8. We present a brief case study of a multi-year learning engineering effort to iteratively redesign the problem-solving experience of students using the “Solving Quadratic Equations” workspace in Carnegie Learning’s MATHia intelligent tutoring system. We consider two design changes, one involving additional scaffolds for the problem-solving task and the next involving a “nudge” for learners to more rapidly and readily engage with these scaffolds, and we discuss resulting changes in the relative proportion of students who fail to master skills associated with this workspace over the course of two school years.
  9. This paper provides a progress report on the first 18 months of Phase 1, the conceptualization phase, of the Learner Data Institute (LDI; www.learnerdatainstitute.org). LDI is currently in Phase 1, the conceptualization phase, to be followed by Phase 2, the institute or convergence phase. The current 2-year conceptualization phase has two major goals: (1) develop, implement, evaluate, and refine a framework for data-intensive science and engineering for the future institute, and (2) use the framework to provide prototype solutions, based on data, data science, and science convergence, to a number of core challenges in learning science and engineering. By targeting a critical mass of key challenges that are at a tipping point, LDI aims to start a chain reaction that will transform the whole learning ecosystem. We will emphasize here the key elements of the LDI science convergence framework that our team developed, implemented, and now is in the process of evaluating and refining. We highlight important outcomes of the convergence framework and related processes, including a 5-year plan for the institute phase and data-intensive prototype solutions to transform the learning ecosystem.
  10. We consider a general framework for constructing non-linear generators by adding a (32-bit or larger) pseudo-random number generator (PRNG) as a baseline generator to the basic RC4 design, in which an index-selection scheme similar to RC4 is used. We refer to the proposed design as the eRC (enhanced/extended RC4) design. We discuss several advantages of adding a good baseline generator to the RC4 design, including new updating schemes for the auxiliary table. We consider some popular PRNGs with the nice properties of high-dimensional equi-distribution, efficiency, long period, and portability as the baseline generator. We demonstrate that eRC generators are very efficient via extensive empirical testing on some eRC generators. We also show that eRC is flexible enough to choose minimal design parameters for eRC generators and yet the resulting eRC generators still pass stringent empirical tests, which makes them suitable for both software and hardware implementations.
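
For readers unfamiliar with the "basic RC4 design" that item 10 builds on, the sketch below shows the standard RC4 index-selection loop, followed by one way a 32-bit baseline generator might feed an RC4-style auxiliary table. The abstract does not give the eRC construction, so `erc_like`, its table size, its slot-refresh rule, and the choice of xorshift32 as the baseline are all illustrative assumptions and not the authors' design.

```python
from itertools import islice


def rc4_keystream(key: bytes):
    """Standard RC4: key scheduling, then the index-selection output loop."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key-scheduling algorithm (KSA)
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    while True:  # pseudo-random generation algorithm (PRGA)
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        yield s[(s[i] + s[j]) % 256]


def xorshift32(seed: int):
    """Marsaglia's 32-bit xorshift, standing in for a baseline generator."""
    x = seed & 0xFFFFFFFF
    if x == 0:
        raise ValueError("xorshift32 needs a nonzero seed")
    while True:
        x ^= (x << 13) & 0xFFFFFFFF
        x ^= x >> 17
        x ^= (x << 5) & 0xFFFFFFFF
        yield x


def erc_like(seed: int, table_size: int = 256):
    """HYPOTHETICAL combination, not the eRC design from the paper.

    The auxiliary table holds 32-bit words drawn from the baseline
    generator; indices are selected RC4-style; one slot is refreshed
    from the baseline on every step (an assumed updating scheme).
    """
    base = xorshift32(seed)
    table = [next(base) for _ in range(table_size)]
    i = j = 0
    while True:
        i = (i + 1) % table_size
        j = (j + table[i]) % table_size
        table[i], table[j] = table[j], table[i]
        out = table[(table[i] + table[j]) % table_size]
        table[i] = next(base)  # assumed refresh of the slot just visited
        yield out
```

For example, `list(islice(erc_like(2023), 4))` draws four 32-bit words from the hypothetical generator. Swapping `xorshift32` for another baseline only requires a different iterator of 32-bit integers.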