Title: Using semantics of textbook highlights to predict student comprehension and knowledge retention
As students read textbooks, they often highlight the material they deem to be most important. We analyze students’ highlights to predict their subsequent performance on quiz questions. Past research in this area has encoded highlights in terms of where the highlights appear in the stream of text, a positional representation. In this work, we construct a semantic representation based on a state-of-the-art deep-learning sentence embedding technique (SBERT) that captures the content-based similarity between quiz questions and highlighted (as well as non-highlighted) sentences in the text. We construct regression models that include latent variables for student skill level and question difficulty and augment the models with highlighting features. We find that highlighting features reliably boost model performance. We conduct experiments that validate models on held-out questions, students, and student-questions and find strong generalization for the latter two but not for held-out questions. Surprisingly, highlighting features improve models for questions at all levels of the Bloom taxonomy, from straightforward recall questions to inferential synthesis/evaluation/creation questions.
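The following is a minimal sketch of the kind of pipeline the abstract describes, not the authors' implementation. It assumes sentence embeddings have already been computed (the toy 3-dimensional vectors stand in for SBERT output), that the highlighting feature is the maximum cosine similarity between a question and any highlighted sentence, and that the regression takes a logistic form with additive skill and difficulty terms. The names `highlight_feature` and `p_correct`, the max aggregation, and the single `weight` parameter are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def highlight_feature(question_vec, sentence_vecs, highlighted):
    """Assumed feature: max similarity between the question and any highlighted sentence.

    Returns 0.0 when the student highlighted nothing.
    """
    sims = [cosine(question_vec, s) for s, h in zip(sentence_vecs, highlighted) if h]
    return max(sims, default=0.0)


def p_correct(skill, difficulty, weight, feature):
    """Assumed logistic model: latent skill minus latent difficulty plus a weighted feature."""
    return 1.0 / (1.0 + math.exp(-(skill - difficulty + weight * feature)))


# Toy stand-ins for precomputed sentence embeddings (hypothetical values).
question = [1.0, 0.0, 1.0]
sentences = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
highlighted = [True, False, True]

feature = highlight_feature(question, sentences, highlighted)
probability = p_correct(skill=0.5, difficulty=0.2, weight=1.0, feature=feature)
print(f"feature={feature:.3f}, P(correct)={probability:.3f}")
```

In a fitted model the skill, difficulty, and weight values would be estimated from student response data rather than set by hand as they are here.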
Award ID(s):
1631428
PAR ID:
10299595
Author(s) / Creator(s):
; ; ;
Editor(s):
Sosnovsky, S.; Brusilovsky, P; Baraniuk, R. G.; Lan, A. S.
Date Published:
Journal Name:
Proceedings of the Third International Workshop on Intelligent Textbooks (iTextbooks)
Page Range / eLocation ID:
108-120
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  2. When engaging with a textbook, students are inclined to highlight key content. Although students believe that highlighting and subsequent review of the highlights will further their educational goals, the psychological literature provides no evidence of benefits. Nonetheless, a student’s choice of text for highlighting may serve as a window into their mental state—their level of comprehension, grasp of the key ideas, reading goals, etc. We explore this hypothesis via an experiment in which 198 participants read sections from a college-level biology text, briefly reviewed the text, and then took a quiz on the material. During initial reading, participants were able to highlight words, phrases, and sentences, and these highlights were displayed along with the complete text during the subsequent review. Consistent with past research, the amount of highlighted material is unrelated to quiz performance. However, our main goal is to examine highlighting as a data source for inferring student understanding. We explored multiple representations of the highlighting patterns and tested Bayesian linear regression and neural network models, but we found little or no relationship between a student’s highlights and quiz performance. Our long-term goal is to design digital textbooks that serve not only as conduits of information into the mind of the reader, but also allow us to draw inferences about the reader at a point where interventions may increase the effectiveness of the material. 
  3. We investigate whether student comprehension and knowledge retention can be predicted from textbook annotations, specifically the material that students choose to highlight. Using a digital open-access textbook platform, OpenStax, students enrolled in Biology, Physics, and Sociology courses read sections of their introductory text as part of required coursework, optionally highlighted the text to flag key material, and then took brief quizzes at the end of each section. We find that when students choose to highlight, the specific pattern of highlights can explain about 13% of the variance in observed quiz scores. We explore many different representations of the pattern of highlights and discover that a low-dimensional logistic principal component based vector is most effective as input to a ridge regression model. Considering the many sources of uncontrolled variability affecting student performance, we are encouraged by the strong signal that highlights provide as to a student’s knowledge state. 
  4. This study examined the difficulty introduced by spaced retrieval practice in Calculus I for undergraduate engineering students. Spaced retrieval practice is an instructional technique in which students engage in multiple recall exercises on the same topic with intermittent temporal delays in between. Spacing out retrieval practice increases the difficulty of the exercises, reducing student performance on them. However, empirical research indicates that spaced retrieval practice is associated with improvements in students’ long-term memory for the retrieved information. The short-term costs and long-term benefits of spaced retrieval practice are an example of desirable difficulty, in which more difficult exercises during the early stages of learning result in longer-lasting memory [1]. With support from the National Science Foundation (NSF), we sought to address: Does spacing decrease performance on retrieval practice exercises in an engineering mathematics course? Results showed that student performance was significantly lower for questions in the spaced condition than for questions in the massed condition, indicating that we successfully increased the difficulty of the questions by spacing them out over time. Future work will assess final quiz performance to determine whether spacing improved long-term course performance, i.e., whether the difficulty imposed by spacing was desirable. 
  5. Bennett, M; Wolf, S.; Frank, B. W. (Ed.)
    Computer simulations for physics labs may be combined with hands-on lab equipment to boost student understanding and make labs more accessible. Hybrid labs of HTML5-based computer simulations and hands-on lab equipment for topics in mechanics were investigated in a large, algebra-based, studio physics course for life science students at a private, research-intensive institution. Computer simulations were combined with hands-on equipment and compared to traditional hands-on labs using an A/B testing protocol. Learning outcomes were measured for the specific topic of momentum conservation by comparing student scores on post-lab exercises, related quiz and exam questions, and a subset of questions on the Energy and Momentum Conceptual Survey (EMCS) administered before and after instruction for both groups. We find that students who completed a hands-on lab vs. a hybrid lab showed no difference in performance on momentum assessments. 