Title: Hao Fayin: Developing Automated Audio Assessment Tools for a Chinese Language Course
We present and evaluate a machine-learning-based system that automatically grades audio recordings of students speaking a foreign language. Automated systems that aid the assessment of student performance hold great promise in augmenting the teacher’s ability to provide meaningful feedback and instruction to students. Teachers spend a significant amount of time grading student work, and these tools can substantially reduce that time, which could instead be used to give personalized attention to each student. Significant prior research has focused on the grading of closed-form problems, open-ended essays, and textual content. However, little research has focused on audio content, which is much more prevalent in language-study education. In this paper, we explore the development of automated assessment tools for audio responses in a college-level Chinese language-learning course. We analyze several challenges faced while working with data of this type, as well as the generation and extraction of features for building machine learning models to aid in the assessment of student language learning.
Award ID(s): 1822830
PAR ID: 10108555
Journal Name: The Twelfth International Conference on Educational Data Mining
Page Range / eLocation ID: 663-666
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  2. Assessment in the context of foreign language learning can be difficult and time-consuming for instructors. Unlike other domains, language learning often requires teachers to assess each student’s ability to speak the language, which makes the process even more time-consuming in large classrooms, particularly common in post-secondary settings. Language instructors often assess students through assignments requiring recorded audio, and the lack of tools to support such teachers makes providing individual feedback even more challenging. In this work, we explore the development of tools to automatically assess audio responses within a college-level Chinese language-learning course. We build a model designed to grade student audio assignments, with the aim of incorporating it into tools that help both teachers and students in real classrooms. Building upon our prior work, which explored features extracted from audio, this work explores additional features derived from tone and speech recognition models to help assess students on two outcomes commonly observed in language learning classes: fluency and accuracy of speech. In addition to the exploration of features, this work explores the application of Siamese deep learning models to this assessment task. We find that models utilizing tonal features exhibit higher predictive performance for student fluency, while text-based features derived from speech recognition models exhibit higher predictive performance for student accuracy of speech.
  3. Martin, Fred; Norouzi, Narges; Rosenthal, Stephanie (Eds.)
    This paper examines the use of LLMs to support the grading and explanation of short-answer formative assessments in K-12 science topics. While significant work has been done on programmatically scoring well-structured student assessments in math and computer science, many of these approaches produce a numerical score and stop short of providing teachers and students with explanations for the assigned scores. In this paper, we investigate few-shot, in-context learning with chain-of-thought reasoning and active learning using GPT-4 for automated assessment of students’ answers in a middle school Earth Science curriculum. Our findings from this human-in-the-loop approach demonstrate success in scoring formative assessment responses and in providing meaningful explanations for the assigned score. We then perform a systematic analysis of the advantages and limitations of our approach. This research provides insight into how we can use human-in-the-loop methods for the continual improvement of automated grading for open-ended science assessments.
  4. Background

    Teachers often rely on open‐ended questions to assess students' conceptual understanding of assigned content. Particularly in the context of mathematics, teachers use these types of questions to gain insight into the processes and strategies adopted by students in solving mathematical problems, beyond what is possible through closed‐ended problem types. While these types of problems are valuable to teachers, the variation in student responses makes them difficult and time‐consuming to evaluate and to answer with directed feedback. It is well established that feedback, both as a numeric score and, more importantly, as teacher‐authored comments, can guide students on how to improve, leading to increased learning. For this reason, teachers need better support not only for assessing students' work but also for providing meaningful and directed feedback to students.

    Objectives

    In this paper, we seek to develop, evaluate, and examine machine learning models that support automated open response assessment and feedback.

    Methods

    We build upon prior research in the automatic assessment of student responses to open‐ended problems and introduce a novel approach that combines student log data with machine learning and natural language processing methods. Utilizing sentence‐level semantic representations of student responses to open‐ended questions, we propose a collaborative filtering‐based approach both to predict student scores and to recommend appropriate feedback messages for teachers to send to their students.

    Results and Conclusion

    We find that our method outperforms previously published benchmarks across three different metrics for the task of predicting student performance. Through an error analysis, we identify several areas where future work may be able to improve upon our approach.
  5. Advancements in online learning platforms have revolutionized education in multiple ways, transforming learning experiences and instructional practices. The development of natural language processing and machine learning methods has helped to understand and process student language, comprehend students’ learning state, and build automated supports for teachers. With this, there has been a growing body of research on automated methods to assess students’ work in both mathematical and nonmathematical domains. These automated methods address questions of two categories: closed-ended (with a limited set of correct answers) and open-ended (often subjective, with multiple correct answers). Open-ended questions are mostly used by teachers to learn about their students’ understanding of a particular concept. Manually assessing and providing feedback on these open-ended questions is often arduous and time-consuming for teachers. For this reason, several works have sought to understand student responses to these open-ended questions in order to automate assessment and provide constructive feedback to students. In this research, we seek to improve one such prior method for assessment and feedback suggestions for students’ open-ended work in mathematics. To this end, we present an error analysis of the prior auto-scoring method “SBERT-Canberra”, explore various factors that contribute to its error, and propose solutions that improve the method by addressing these error factors. We further intend to expand this approach by improving the feedback suggestions that teachers can give on their students’ open-ended work.
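The last abstract above names a prior auto-scoring method, "SBERT-Canberra", without describing it. As a rough illustration only, the sketch below assumes the name refers to scoring a new response by finding the most similar already-graded response under the Canberra distance between sentence embeddings. The function names, the toy three-dimensional vectors, and the scores are all hypothetical; in practice the vectors would come from a sentence-embedding model, which is not shown here.

```python
from typing import List, Sequence, Tuple


def canberra(x: Sequence[float], y: Sequence[float]) -> float:
    """Canberra distance: sum of |x_i - y_i| / (|x_i| + |y_i|).

    Terms where both coordinates are zero are skipped (treated as 0).
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def predict_score(
    new_vec: Sequence[float],
    graded: List[Tuple[Sequence[float], int]],
) -> int:
    """Return the score of the graded response nearest to new_vec.

    Assumption: a 1-nearest-neighbour rule; the original method may differ.
    """
    if not graded:
        raise ValueError("need at least one graded response")
    _, score = min(graded, key=lambda pair: canberra(new_vec, pair[0]))
    return score


# Hypothetical embeddings of already-graded responses, paired with scores.
graded_responses = [
    ([0.9, 0.1, 0.0], 4),
    ([0.1, 0.8, 0.3], 1),
    ([0.5, 0.5, 0.5], 2),
]

# Hypothetical embedding of a new, ungraded response.
new_response = [0.8, 0.2, 0.0]
predicted = predict_score(new_response, graded_responses)
print(predicted)
```

Because each Canberra term is normalized by the magnitude of its coordinates, small-valued dimensions contribute as much as large-valued ones, which is one plausible reason to prefer it over Euclidean distance for embedding comparison.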