Leveraging ASR and LLMs for Automated Scoring and Feedback in Children's Spoken Language Assessments
This paper explores the use of automatic speech recognition (ASR) and large language models (LLMs) for automated scoring and feedback generation in spoken language assessment. We design a three-stage pipeline that (1) optimizes ASR hypotheses from student speech, (2) performs task-based scoring using LLMs, and (3) generates natural language feedback justifying each score. We evaluate this pipeline using audio responses from 3rd-8th grade students in the Atlanta, Georgia area, recorded as part of the Test of Narrative Language. Our results show that LLMs can reliably replicate expert annotations while providing interpretable feedback. We further analyze model performance across demographic factors, including dialect and reading proficiency, to assess equity. Our findings demonstrate the promise of ASR and LLMs for robust, explainable, and fair assessment of children's spoken narratives.
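The three stages above can be illustrated with a minimal sketch. The components below are placeholder stubs, not the paper's actual models or prompts: hypothesis selection by confidence, a length-based scoring stand-in for the LLM scorer, and templated feedback are all illustrative assumptions.

```python
# Hypothetical sketch of the three-stage pipeline described in the abstract.
# Each stage is a stub: real systems would use an ASR model, an LLM scorer,
# and an LLM feedback generator in place of these placeholders.
from dataclasses import dataclass

@dataclass
class AssessmentResult:
    transcript: str
    score: int
    feedback: str

def optimize_asr_hypotheses(hypotheses):
    """Stage 1: select the best ASR hypothesis.

    Stub: pick the highest-confidence hypothesis. The paper's actual
    optimization strategy is not specified in this excerpt.
    """
    return max(hypotheses, key=lambda h: h["confidence"])["text"]

def score_with_llm(transcript):
    """Stage 2: task-based scoring.

    Stub: a crude length-based score on a 0-5 scale, standing in for
    an LLM prompted with the assessment rubric.
    """
    return min(5, len(transcript.split()) // 10)

def generate_feedback(transcript, score):
    """Stage 3: natural-language feedback justifying the score.

    Stub: a fixed template, standing in for LLM-generated feedback.
    """
    n_words = len(transcript.split())
    return f"Score {score}: the response contains {n_words} words."

def assess(hypotheses):
    """Run all three stages on a list of ASR hypotheses."""
    transcript = optimize_asr_hypotheses(hypotheses)
    score = score_with_llm(transcript)
    return AssessmentResult(transcript, score, generate_feedback(transcript, score))
```

A usage example: `assess([{"text": "...", "confidence": 0.9}, ...])` returns an `AssessmentResult` bundling the chosen transcript, its score, and the justifying feedback, mirroring the pipeline's output structure.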