This content will become publicly available on July 1, 2022

Title: Using semantics of textbook highlights to predict student comprehension and knowledge retention
As students read textbooks, they often highlight the material they deem to be most important. We analyze students’ highlights to predict their subsequent performance on quiz questions. Past research in this area has encoded highlights in terms of where the highlights appear in the stream of text—a positional representation. In this work, we construct a semantic representation based on a state-of-the-art deep-learning sentence embedding technique (SBERT) that captures the content-based similarity between quiz questions and highlighted (as well as non-highlighted) sentences in the text. We construct regression models that include latent variables for student skill level and question difficulty and augment the models with highlighting features. We find that highlighting features reliably boost model performance. We conduct experiments that validate models on held-out questions, students, and student-questions and find strong generalization for the latter two but not for held-out questions. Surprisingly, highlighting features improve models for questions at all levels of the Bloom taxonomy, from straightforward recall questions to inferential synthesis/evaluation/creation questions.
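The modeling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The sentence vectors below are hand-written placeholders standing in for SBERT embeddings. Taking the maximum similarity over highlighted sentences and using a logistic link are assumptions made for illustration, and the names `highlight_feature` and `predict_correct` are hypothetical.

```python
import numpy as np

def cosine_similarities(query, matrix):
    """Cosine similarity between one query vector and each row of a matrix."""
    query = query / np.linalg.norm(query)
    matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix @ query

def highlight_feature(question_vec, sentence_vecs, highlighted):
    """Similarity of the question to the closest highlighted sentence.

    Assumption: the maximum is used as the aggregate; returns 0.0 when
    the student highlighted nothing.
    """
    if not highlighted.any():
        return 0.0
    sims = cosine_similarities(question_vec, sentence_vecs)
    return float(sims[highlighted].max())

def predict_correct(skill, difficulty, weight, feature):
    """Assumed logistic form: P(correct) = sigmoid(skill - difficulty + weight * feature)."""
    logit = skill - difficulty + weight * feature
    return 1.0 / (1.0 + np.exp(-logit))

# Toy data: three text sentences and one quiz question in a 2-D embedding space.
# Real SBERT embeddings would have several hundred dimensions.
sentence_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
question_vec = np.array([1.0, 0.0])
highlighted = np.array([False, True, True])  # student highlighted sentences 2 and 3

feature = highlight_feature(question_vec, sentence_vecs, highlighted)
p_baseline = predict_correct(skill=0.5, difficulty=0.2, weight=0.0, feature=feature)
p_augmented = predict_correct(skill=0.5, difficulty=0.2, weight=1.5, feature=feature)
print(feature, p_baseline, p_augmented)
```

With `weight=0.0` the model reduces to the skill-and-difficulty baseline. A positive weight raises the predicted probability when a highlighted sentence is semantically close to the question. In a fitted model, the skill, difficulty, and weight values would be estimated from student responses rather than set by hand.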
Editors:
Sosnovsky, S.; Brusilovsky, P.; Baraniuk, R. G.; Lan, A. S.
Award ID(s):
1631428
Publication Date:
NSF-PAR ID:
10299595
Journal Name:
Proceedings of the Third International Workshop on Intelligent Textbooks (iTextbooks)
Page Range or eLocation-ID:
108-120
Sponsoring Org:
National Science Foundation
More Like this
  2. When engaging with a textbook, students are inclined to highlight key content. Although students believe that highlighting and subsequent review of the highlights will further their educational goals, the psychological literature provides no evidence of benefits. Nonetheless, a student’s choice of text for highlighting may serve as a window into their mental state: their level of comprehension, grasp of the key ideas, reading goals, etc. We explore this hypothesis via an experiment in which 198 participants read sections from a college-level biology text, briefly reviewed the text, and then took a quiz on the material. During initial reading, participants were able to highlight words, phrases, and sentences, and these highlights were displayed along with the complete text during the subsequent review. Consistent with past research, the amount of highlighted material is unrelated to quiz performance. However, our main goal is to examine highlighting as a data source for inferring student understanding. We explored multiple representations of the highlighting patterns and tested Bayesian linear regression and neural network models, but we found little or no relationship between a student’s highlights and quiz performance. Our long-term goal is to design digital textbooks that serve not only as conduits of information into the mind of the reader, but also allow us to draw inferences about the reader at a point where interventions may increase the effectiveness of the material.
  3. We investigate whether student comprehension and knowledge retention can be predicted from textbook annotations, specifically the material that students choose to highlight. Using a digital open-access textbook platform, OpenStax, students enrolled in Biology, Physics, and Sociology courses read sections of their introductory text as part of required coursework, optionally highlighted the text to flag key material, and then took brief quizzes at the end of each section. We find that when students choose to highlight, the specific pattern of highlights can explain about 13% of the variance in observed quiz scores. We explore many different representations of the pattern of highlights and discover that a low-dimensional logistic principal component based vector is most effective as input to a ridge regression model. Considering the many sources of uncontrolled variability affecting student performance, we are encouraged by the strong signal that highlights provide as to a student’s knowledge state.
  4. 1. Description of the objectives and motivation for the contribution to ECE education. The demand for wireless data transmission capacity is increasing rapidly and this growth is expected to continue due to the ongoing prevalence of cellular phones and new and emerging bandwidth-intensive applications that encompass high-definition video, unmanned aerial systems (UAS), intelligent transportation systems (ITS) including autonomous vehicles, and others. Meanwhile, vital military and public safety applications also depend on access to the radio frequency spectrum. To meet these demands, the US federal government is beginning to move from the proven but inefficient model of exclusive frequency assignments to a more efficient, shared-spectrum approach in some bands of the radio frequency spectrum. A STEM workforce that understands the radio frequency spectrum and applications that use the spectrum is needed to further increase spectrum efficiency and cost-effectiveness of wireless systems over the next several decades to meet anticipated and unanticipated increases in wireless data capacity. 2. Relevant background including literature search examples if appropriate. Cisco Systems’ annual survey indicates continued strong growth in demand for wireless data capacity. Meanwhile, undergraduate electrical and computer engineering courses in communication systems, electromagnetics, and networks tend to emphasize mathematical and theoretical fundamentals and higher-layer protocols, with less focus on fundamental concepts that are more specific to radio frequency wireless systems, including the physical and media access control layers of wireless communication systems and networks. An efficient way is needed to introduce basic RF system and spectrum concepts to undergraduate engineering students in courses such as those mentioned above who are unable to take, or had not planned to take, a full course in radio frequency / microwave engineering or wireless systems and networks.
We have developed a series of interactive online modules that introduce concepts fundamental to wireless communications, the radio frequency spectrum, and spectrum sharing, and seek to present these concepts in context. The modules include interactive, JavaScript-based simulation exercises intended to reinforce the concepts that are presented in the modules through narrated slide presentations, text, and external links. Additional modules in development will introduce advanced undergraduate and graduate students and STEM professionals to configuration and programming of adaptive frequency-agile radios and spectrum management systems that can operate efficiently in congested radio frequency environments. Simulation exercises developed for the advanced modules allow both manual and automatic control of simulated radio links in timed, game-like simulations, and some exercises will enable students to select from among multiple pre-coded controller strategies and optionally edit the code before running the timed simulation. Additionally, we have developed infrastructure for running remote laboratory experiments that can also be embedded within the online modules, including a web-based user interface, an experiment management framework, and software-defined radio (SDR) application software that runs in a wireless testbed initially developed for research. Although these experiments rely on limited hardware resources and introduce additional logistical considerations, they provide additional realism that may further challenge and motivate students. 3. Description of any assessment methods used to evaluate the effectiveness of the contribution. Each set of modules is preceded and followed by a survey. Each individual module is preceded by a quiz and followed by another quiz, with pre- and post-quiz questions drawn from the same pool. The pre-surveys allow students to opt in or out of having their survey and quiz results used anonymously in research. 4. Statement of results.
The initial modules have been and are being used by three groups of students: (1) students in an undergraduate Introduction to Communication Systems course; (2) an interdisciplinary group of engineering students, including computer science students, who are participating in a related undergraduate research project; and (3) students in a graduate-level communications course that includes both electrical and computer engineers. Analysis of results from the first group of students showed statistically significant increases from pre-quiz to post-quiz for each of four modules on fundamental wireless communication concepts. Results for the other students have not yet been analyzed, but also appear to show substantial pre-quiz to post-quiz increases in mean scores.
  5. This study examined the difficulty introduced by spaced retrieval practice in Calculus I for undergraduate engineering students. Spaced retrieval practice is an instructional technique in which students engage in multiple recall exercises on the same topic with intermittent temporal delays in between. Spacing out retrieval practice increases the difficulty of the exercises, reducing student performance on them. However, empirical research indicates that spaced retrieval practice is associated with improvements in students’ long-term memory for the retrieved information. The short-term costs and long-term benefits of spaced retrieval practice are an example of desirable difficulty, in which more difficult exercises during the early stages of learning result in longer-lasting memory [1]. With support from the National Science Foundation (NSF), we sought to address: Does spacing decrease performance on retrieval practice exercises in an engineering mathematics course? Results showed that student performance was significantly lower for questions in the spaced condition than questions in the massed condition, indicating that we successfully increased the difficulty of the questions by spacing them out over time. Future work will assess final quiz performance to determine whether spacing improved long-term course performance, i.e., whether the difficulty imposed by spacing was desirable.