

Title: Inferring student comprehension from highlighting patterns in digital textbooks: An exploration in an authentic learning platform
We investigate whether student comprehension and knowledge retention can be predicted from textbook annotations, specifically the material that students choose to highlight. Using OpenStax, a digital open-access textbook platform, students enrolled in Biology, Physics, and Sociology courses read sections of their introductory text as part of required coursework, optionally highlighted the text to flag key material, and then took brief quizzes at the end of each section. We find that when students choose to highlight, the specific pattern of highlights can explain about 13% of the variance in observed quiz scores. We explore many different representations of the pattern of highlights and discover that a low-dimensional vector based on logistic principal components is most effective as input to a ridge regression model. Considering the many sources of uncontrolled variability affecting student performance, we are encouraged by the strong signal that highlights provide about a student's knowledge state.
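The pipeline described here amounts to a dimensionality reduction step over per-sentence highlight indicators followed by a regularized linear model. Below is a minimal sketch of that idea, assuming a binary students-by-sentences highlight matrix; ordinary PCA stands in for the logistic PCA used in the paper, and the data and names (X_highlights, quiz_scores) are illustrative placeholders, not the study's data.

```python
# Sketch: predict quiz scores from per-sentence highlight indicators.
# X_highlights is a (students x sentences) 0/1 matrix; quiz_scores holds
# section quiz scores. Both are random placeholders here.
import numpy as np
from sklearn.decomposition import PCA          # stand-in for logistic PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X_highlights = rng.integers(0, 2, size=(200, 80))   # placeholder highlight matrix
quiz_scores = rng.uniform(0, 1, size=200)            # placeholder quiz scores

model = make_pipeline(
    PCA(n_components=10),   # low-dimensional representation of the highlight pattern
    Ridge(alpha=1.0),       # regularized linear predictor of quiz score
)
r2 = cross_val_score(model, X_highlights, quiz_scores, scoring="r2", cv=5)
print(f"mean cross-validated R^2: {r2.mean():.3f}")
```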
Award ID(s):
1631428
NSF-PAR ID:
10197702
Author(s) / Creator(s):
Date Published:
Journal Name:
Intelligent Textbooks 2020
Page Range / eLocation ID:
1-13
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. When engaging with a textbook, students are inclined to highlight key content. Although students believe that highlighting and subsequent review of the highlights will further their educational goals, the psychological literature provides no evidence of benefits. Nonetheless, a student’s choice of text for highlighting may serve as a window into their mental state—their level of comprehension, grasp of the key ideas, reading goals, etc. We explore this hypothesis via an experiment in which 198 participants read sections from a college-level biology text, briefly reviewed the text, and then took a quiz on the material. During initial reading, participants were able to highlight words, phrases, and sentences, and these highlights were displayed along with the complete text during the subsequent review. Consistent with past research, the amount of highlighted material is unrelated to quiz performance. However, our main goal is to examine highlighting as a data source for inferring student understanding. We explored multiple representations of the highlighting patterns and tested Bayesian linear regression and neural network models, but we found little or no relationship between a student’s highlights and quiz performance. Our long-term goal is to design digital textbooks that serve not only as conduits of information into the mind of the reader, but also allow us to draw inferences about the reader at a point where interventions may increase the effectiveness of the material. 
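As a concrete illustration of the kind of model tested in that study, here is a minimal sketch that maps a positional highlight representation to quiz score with Bayesian linear regression (scikit-learn's BayesianRidge). The positional features, data, and dimensions are illustrative assumptions, not the study's actual encoding.

```python
# Sketch: Bayesian linear regression from a positional highlight
# representation to quiz score; all data here are random placeholders.
import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Positional representation: proportion of each text segment that was highlighted.
X_position = rng.uniform(0, 1, size=(198, 40))
quiz_scores = rng.uniform(0, 1, size=198)

X_tr, X_te, y_tr, y_te = train_test_split(X_position, quiz_scores,
                                          test_size=0.2, random_state=1)
reg = BayesianRidge().fit(X_tr, y_tr)
pred_mean, pred_std = reg.predict(X_te, return_std=True)  # predictive uncertainty
print(f"held-out R^2: {reg.score(X_te, y_te):.3f}")
print(f"mean predictive std: {pred_std.mean():.3f}")
```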
  2. Problem solving is a vital skill required to be successful in many engineering industries. One way for students to practice problem solving is through solving homework problems. However, solutions manuals for textbook problems are usually available online, and students can easily default to copying from the solution manual. To address the solution manual dilemma and promote better problem-solving ability, this study utilizes novel homework problems that integrate a video component as an alternative to text-only textbook problems. Building upon research showing that visuals promote better learning, students reverse engineer YouTube videos to create new homework problems. Previous studies have catalogued student-written problems, called YouTube problems, in a material and energy balance course. In this study, textbook homework problems were replaced with student-written YouTube problems, and we additionally focused on examining learning attitudes after students solved YouTube problems. Data collection included responses to a validated attitudinal instrument, the Colorado Learning Attitudes about Science Survey (CLASS), administered at the beginning and end of the course. Analysis compared gains in attitudes for participants in the treatment groups. The mean overall attitude of participants who received the YouTube intervention improved by a normalized gain of 0.15 with a small effect size (Hedges' g = 0.35). Improvement was most prominent in attitudes toward personal application and real-world connections, with a normalized gain of 0.49 and a small effect size (Hedges' g = 0.38). 
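For reference, the two statistics reported in that abstract can be computed as follows. This is a generic sketch of a Hake-style normalized gain and Hedges' g with the usual small-sample correction; the example pre/post scores are made up and do not come from the study.

```python
# Sketch: normalized gain and Hedges' g, written out for clarity.
# Inputs are illustrative pre/post CLASS-style scores on a 0-100 scale.
import numpy as np

def normalized_gain(pre, post, max_score=100.0):
    """Hake-style normalized gain: fraction of the available room for growth realized."""
    pre, post = np.asarray(pre, float), np.asarray(post, float)
    return np.mean((post - pre) / (max_score - pre))

def hedges_g(group_a, group_b):
    """Standardized mean difference with the small-sample (Hedges) correction."""
    a, b = np.asarray(group_a, float), np.asarray(group_b, float)
    n_a, n_b = len(a), len(b)
    pooled_sd = np.sqrt(((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1))
                        / (n_a + n_b - 2))
    d = (a.mean() - b.mean()) / pooled_sd
    correction = 1.0 - 3.0 / (4.0 * (n_a + n_b) - 9.0)  # small-sample bias correction
    return d * correction

pre = np.array([62.0, 70.0, 55.0, 68.0])    # made-up pre-course scores
post = np.array([68.0, 75.0, 60.0, 74.0])   # made-up post-course scores
print(f"normalized gain: {normalized_gain(pre, post):.2f}")
print(f"Hedges' g (post vs. pre): {hedges_g(post, pre):.2f}")
```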
  3. State-of-the-art knowledge tracing approaches mostly model student knowledge using performance on assessed learning resource types, such as quizzes, assignments, and exercises, and ignore non-assessed learning resources. However, many student activities are non-assessed, such as watching video lectures, participating in a discussion forum, and reading a section of a textbook, all of which potentially contribute to students' knowledge growth. In this paper, we propose DMKT, the first deep-learning-based knowledge tracing model that explicitly models students' knowledge transitions over both assessed and non-assessed learning activities. With DMKT we can discover the underlying latent concepts of each non-assessed and assessed learning material and better predict student performance on future assessed learning resources. We compare our proposed method with various state-of-the-art knowledge tracing methods on four real-world datasets and show its effectiveness in predicting student performance, representing student knowledge, and discovering the underlying domain model. 
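To make the general idea concrete, here is a minimal DKT-style recurrent sketch in PyTorch that folds non-assessed activities into the same interaction sequence as assessed items. This is not the authors' DMKT architecture; the interaction encoding, dimensions, and random data are illustrative assumptions.

```python
# Sketch: a recurrent knowledge tracing model over a mixed sequence of
# assessed and non-assessed learning activities, predicting correctness on
# assessed materials. Not the authors' DMKT; everything here is illustrative.
import torch
import torch.nn as nn

class MixedActivityKT(nn.Module):
    def __init__(self, n_materials, emb_dim=32, hidden_dim=64):
        super().__init__()
        # One interaction id per (material, outcome) pair:
        # assessed items use outcome 0/1; non-assessed activities use outcome 2.
        self.interaction_emb = nn.Embedding(n_materials * 3, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, n_materials)  # per-material correctness logits

    def forward(self, material_ids, outcomes):
        # material_ids, outcomes: (batch, seq_len) integer tensors
        interaction_ids = material_ids * 3 + outcomes
        hidden, _ = self.rnn(self.interaction_emb(interaction_ids))
        return torch.sigmoid(self.out(hidden))  # (batch, seq_len, n_materials)

# Illustrative usage with random data.
n_materials, batch, seq_len = 50, 4, 20
model = MixedActivityKT(n_materials)
materials = torch.randint(0, n_materials, (batch, seq_len))
outcomes = torch.randint(0, 3, (batch, seq_len))  # 0/1 = wrong/right, 2 = non-assessed
pred = model(materials, outcomes)
print(pred.shape)  # torch.Size([4, 20, 50])
```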
  4. As students read textbooks, they often highlight the material they deem to be most important. We analyze students' highlights to predict their subsequent performance on quiz questions. Past research in this area has encoded highlights in terms of where the highlights appear in the stream of text, a positional representation. In this work, we construct a semantic representation based on a state-of-the-art deep-learning sentence embedding technique (SBERT) that captures the content-based similarity between quiz questions and highlighted (as well as non-highlighted) sentences in the text. We construct regression models that include latent variables for student skill level and question difficulty and augment the models with highlighting features. We find that highlighting features reliably boost model performance. We conduct experiments that validate models on held-out questions, students, and student-question pairs, and find strong generalization for the latter two but not for held-out questions. Surprisingly, highlighting features improve models for questions at all levels of Bloom's taxonomy, from straightforward recall questions to inferential synthesis/evaluation/creation questions. 
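Below is a minimal sketch of the kind of SBERT-based similarity feature described in that abstract, using the sentence-transformers library. The checkpoint name, example sentences, and max-similarity feature definition are illustrative assumptions rather than the paper's exact setup.

```python
# Sketch: content-based similarity between a quiz question and the sentences
# a student did or did not highlight. Checkpoint and sentences are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed SBERT checkpoint

question = "Which organelle is responsible for ATP production?"
highlighted = ["The mitochondrion generates most of the cell's ATP.",
               "Cellular respiration occurs in the mitochondria."]
not_highlighted = ["The cell membrane is composed of a lipid bilayer.",
                   "Ribosomes assemble proteins from amino acids."]

q_emb = model.encode(question, convert_to_tensor=True)
hi_emb = model.encode(highlighted, convert_to_tensor=True)
no_emb = model.encode(not_highlighted, convert_to_tensor=True)

# Candidate highlighting features: maximum cosine similarity between the
# question and the highlighted vs. non-highlighted sentences.
feat_highlighted = util.cos_sim(q_emb, hi_emb).max().item()
feat_not_highlighted = util.cos_sim(q_emb, no_emb).max().item()
print(f"max similarity, highlighted: {feat_highlighted:.3f}")
print(f"max similarity, not highlighted: {feat_not_highlighted:.3f}")
```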