Automated Assessment in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses

Baral, Sami; Worden, Eamon; Lim, Wen-Chiang; Luo, Zhuang; Santorelli, Christopher; Gurung, Ashish

doi:10.5281/zenodo.12729932

Citation Details

Automated Assessment in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses

The effectiveness of feedback in enhancing learning outcomes is well documented within Educational Data Mining (EDM). Various prior research have explored methodologies to enhance the effectiveness of feedback to students in various ways. Recent developments in Large Language Models (LLMs) have extended their utility in enhancing automated feedback systems. This study aims to explore the potential of LLMs in facilitating automated feedback in math education in the form of numeric assessment scores. We examine the effectiveness of LLMs in evaluating student responses and scoring the responses by comparing 3 different models: Llama, SBERT-Canberra, and GPT4 model. The evaluation requires the model to provide a quantitative score on the student's responses to open-ended math problems. We employ Mistral, a version of Llama catered to math, and fine-tune this model for evaluating student responses by leveraging a dataset of student responses and teacher-provided scores for middle-school math problems. A similar approach was taken for training the SBERT-Canberra model, while the GPT4 model used a zero-shot learning approach. We evaluate and compare the models' performance in scoring accuracy. This study aims to further the ongoing development of automated assessment and feedback systems and outline potential future directions for leveraging generative LLMs in building automated feedback systems. more »

Award ID(s):: 1931523

PAR ID:: 10541415

Author(s) / Creator(s):: Baral, Sami; Worden, Eamon; Lim, Wen-Chiang; Luo, Zhuang; Santorelli, Christopher; Gurung, Ashish

Editor(s):: Benjamin, Paaßen; Carrie, Demmans Epp

Publisher / Repository:: International Educational Data Mining Society

Date Published:: 2024-01-01

Format(s):: Medium: X

Right(s):: Creative Commons Attribution 4.0 International

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper:
https://doi.org/10.5281/zenodo.12729932

More Like this