

Title: Leveraging natural language processing to support automated assessment and feedback for student open responses in mathematics
Abstract

Background

Teachers often rely on open‐ended questions to assess students' conceptual understanding of assigned content. In mathematics in particular, teachers use these questions to gain insight into the processes and strategies students adopt in solving problems, beyond what is possible with closed‐ended problem types. While these problems are valuable to teachers, the variation in student responses makes it difficult and time‐consuming to evaluate them and provide directed feedback. It is well established that feedback, both as a numeric score and, more importantly, as teacher‐authored comments, can guide students on how to improve, leading to increased learning. For this reason, teachers need better support not only for assessing students' work but also for providing meaningful and directed feedback.

Objectives

In this paper, we seek to develop, evaluate, and examine machine learning models that support automated open response assessment and feedback.

Methods

We build upon prior research in the automatic assessment of student responses to open‐ended problems and introduce a novel approach that leverages student log data combined with machine learning and natural language processing methods. Using sentence‐level semantic representations of student responses to open‐ended questions, we propose a collaborative filtering‐based approach both to predict student scores and to recommend appropriate feedback messages for teachers to send to their students.
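
The abstract does not describe the model in detail, so the following is only a minimal sketch of the general idea: a new response is matched to the most similar previously graded response, and that response's score and teacher comment are suggested. It assumes sentence-level embeddings are already computed (toy vectors stand in for them here) and uses Canberra distance, chosen only because a related abstract on this page names a prior method "SBERT-Canberra"; whether the paper uses it this way is an assumption. All names (`ScoredResponse`, `canberra`, `suggest`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class ScoredResponse:
    """A previously graded response (hypothetical structure for illustration)."""
    embedding: Sequence[float]  # sentence-level vector, assumed precomputed
    score: int                  # teacher-assigned score
    feedback: str               # teacher-authored comment


def canberra(u: Sequence[float], v: Sequence[float]) -> float:
    """Canberra distance: sum of |u_i - v_i| / (|u_i| + |v_i|), skipping 0/0 terms."""
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def suggest(new_embedding: Sequence[float],
            history: List[ScoredResponse]) -> Tuple[int, str]:
    """Return the score and feedback of the closest previously graded response."""
    if not history:
        raise ValueError("history must contain at least one graded response")
    nearest = min(history, key=lambda r: canberra(new_embedding, r.embedding))
    return nearest.score, nearest.feedback


# Toy data: two graded responses and one new, ungraded response.
history = [
    ScoredResponse([0.9, 0.1, 0.0], 4, "Clear explanation of each step."),
    ScoredResponse([0.1, 0.8, 0.3], 1, "Explain how you got your answer."),
]
score, feedback = suggest([0.8, 0.2, 0.1], history)
```

In this toy example the new response lies closer to the first graded response, so its score and comment are the ones suggested. A real system would obtain the embeddings from a sentence encoder and would likely restrict the history to responses to the same problem; both choices are outside what the abstract states.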

Results and Conclusion

We find that our method outperforms previously published benchmarks across three different metrics for the task of predicting student performance. Through an error analysis, we identify several areas where future work may be able to improve upon our approach.

 
NSF-PAR ID:
10415601
Author(s) / Creator(s):
 ;  ;  ;  ;  
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Journal of Computer Assisted Learning
Volume:
39
Issue:
3
ISSN:
0266-4909
Page Range / eLocation ID:
p. 823-840
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Background: Teachers often rely on open‐ended questions to assess students' conceptual understanding of assigned content. In mathematics in particular, teachers use these questions to gain insight into the processes and strategies students adopt in solving problems, beyond what is possible with closed‐ended problem types. While these problems are valuable to teachers, the variation in student responses makes it difficult and time‐consuming to evaluate them and provide directed feedback. It is well established that feedback, both as a numeric score and, more importantly, as teacher‐authored comments, can guide students on how to improve, leading to increased learning. For this reason, teachers need better support not only for assessing students' work but also for providing meaningful and directed feedback. Objectives: In this paper, we seek to develop, evaluate, and examine machine learning models that support automated open response assessment and feedback. Methods: We build upon prior research in the automatic assessment of student responses to open‐ended problems and introduce a novel approach that leverages student log data combined with machine learning and natural language processing methods. Using sentence‐level semantic representations of student responses to open‐ended questions, we propose a collaborative filtering‐based approach both to predict student scores and to recommend appropriate feedback messages for teachers to send to their students. Results and Conclusion: We find that our method outperforms previously published benchmarks across three different metrics for the task of predicting student performance. Through an error analysis, we identify several areas where future work may be able to improve upon our approach.
  2. Advancements in online learning platforms have changed education in many ways, transforming learning experiences and instructional practices. The development of natural language processing and machine learning methods has helped to understand and process student language, assess students' learning state, and build automated supports for teachers. As a result, there is a growing body of research on automated methods to assess students’ work in both mathematical and nonmathematical domains. These automated methods address two categories of questions: closed-ended (with a limited set of correct answers) and open-ended (often subjective, with multiple correct answers). Open-ended questions are mostly used by teachers to learn about their students’ understanding of a particular concept. Manually assessing and providing feedback on responses to open-ended questions is often arduous and time-consuming for teachers. For this reason, several works have sought to understand student responses to these questions in order to automate assessment and provide constructive feedback to students. In this research, we seek to improve one such prior method for assessment and feedback suggestion for students' open-ended work in mathematics. To do so, we present an error analysis of the prior auto-scoring method “SBERT-Canberra”, explore factors that contribute to its error, and propose solutions that improve the method by addressing these factors. We further intend to expand this approach by improving the feedback suggestions that teachers can give on their students’ open-ended work.
  3. Teachers often rely on a range of open-ended problems to assess students' understanding of mathematical concepts. Beyond the traditional forms of student open-ended work, most commonly textual short-answer or essay responses, mathematics also makes frequent use of figures, tables, number lines, graphs, and pictographs. While recent developments in natural language processing and machine learning have led to automated methods for scoring student open-ended work, these methods have largely been limited to textual answers. Several computer-based learning systems allow students to take pictures of hand-written work and include such images in their answers to open-ended questions. However, there are few-to-no existing solutions that support the auto-scoring of students' hand-written or drawn answers. In this work, we build upon an existing method for auto-scoring textual student answers and explore the use of OpenAI/CLIP, a deep learning embedding method designed to represent both images and text, as well as Optical Character Recognition (OCR), to improve model performance. We evaluate our method on a dataset of student open responses that contains both text- and image-based responses, and find a reduction of model error in the presence of images when controlling for other answer-level features.
  4. Teachers often rely on a range of open-ended problems to assess students’ understanding of mathematical concepts. Beyond the traditional forms of student open-ended work, most commonly textual short-answer or essay responses, mathematics also makes frequent use of figures, tables, number lines, graphs, and pictographs. While recent developments in natural language processing and machine learning have led to automated methods for scoring student open-ended work, these methods have largely been limited to textual answers. Several computer-based learning systems allow students to take pictures of hand-written work and include such images in their answers to open-ended questions. However, there are few-to-no existing solutions that support the auto-scoring of students’ hand-written or drawn answers. In this work, we build upon an existing method for auto-scoring textual student answers and explore the use of OpenAI/CLIP, a deep learning embedding method designed to represent both images and text, as well as Optical Character Recognition (OCR), to improve model performance. We evaluate our method on a dataset of student open responses that contains both text- and image-based responses, and find a reduction of model error in the presence of images when controlling for other answer-level features.
  5. Teachers often rely on a range of open-ended problems to assess students’ understanding of mathematical concepts. Beyond the traditional forms of student open-ended work, most commonly textual short-answer or essay responses, mathematics also makes frequent use of figures, tables, number lines, graphs, and pictographs. While recent developments in natural language processing and machine learning have led to automated methods for scoring student open-ended work, these methods have largely been limited to textual answers. Several computer-based learning systems allow students to take pictures of hand-written work and include such images in their answers to open-ended questions. However, there are few-to-no existing solutions that support the auto-scoring of students’ hand-written or drawn answers. In this work, we build upon an existing method for auto-scoring textual student answers and explore the use of OpenAI/CLIP, a deep learning embedding method designed to represent both images and text, as well as Optical Character Recognition (OCR), to improve model performance. We evaluate our method on a dataset of student open responses that contains both text- and image-based responses, and find a reduction of model error in the presence of images when controlling for other answer-level features.