skip to main content


Search for: All records

Award ID contains: 1631556

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. null (Ed.)
    As students read textbooks, they often highlight the material they deem to be most important. We analyze students’ highlights to predict their subsequent performance on quiz questions. Past research in this area has encoded highlights in terms of where the highlights appear in the stream of text—a positional representation. In this work, we construct a semantic representation based on a state-of-the-art deep-learning sentence embedding technique (SBERT) that captures the content-based similarity between quiz questions and highlighted (as well as non-highlighted) sentences in the text. We construct regression models that include latent variables for student skill level and question difficulty and augment the models with highlighting features. We find that highlighting features reliably boost model performance. We conduct experiments that validate models on held-out questions, students, and student-questions and find strong generalization for the latter two but not for held-out questions. Surprisingly, highlighting features improve models for questions at all levels of the Bloom taxonomy, from straightforward recall questions to inferential synthesis/evaluation/creation questions. 
    more » « less
  2. Rafferty, A. ; Whitehall, J. ; Cristobal, R. ; Cavalli-Sforza, V. (Ed.)
    We propose VarFA, a variational inference factor analysis framework that extends existing factor analysis models for educational data mining to efficiently output uncertainty estimation in the model's estimated factors. Such uncertainty information is useful, for example, for an adaptive testing scenario, where additional tests can be administered if the model is not quite certain about a students' skill level estimation. Traditional Bayesian inference methods that produce such uncertainty information are computationally expensive and do not scale to large data sets. VarFA utilizes variational inference which makes it possible to efficiently perform Bayesian inference even on very large data sets. We use the sparse factor analysis model as a case study and demonstrate the efficacy of VarFA on both synthetic and real data sets. VarFA is also very general and can be applied to a wide array of factor analysis models. 
    more » « less
  3. The ever growing amount of educational content renders it increasingly difficult to manually generate sufficient practice or quiz questions to accompany it. This paper introduces QG-Net, a recurrent neural network-based model specifically designed for automatically generating quiz questions from educational content such as textbooks. QG-Net, when trained on a publicly available, general-purpose question/answer dataset and without further fine-tuning, is capable of generating high quality questions from textbooks, where the content is significantly different from the training data. Indeed, QG-Net outperforms state-of-the-art neural network-based and rules-based systems for question generation, both when evaluated using standard benchmark datasets and when using human evaluators. QG-Net also scales favorably to applications with large amounts of educational content, since its performance improves with the amount of training data. 
    more » « less