Qualitative Coding with GPT-4: Where it Works Better

Liu, Xiner; Zambrano, Andres Felipe; Baker, Ryan S; Barany, Amanda; Ocumpaugh, Jaclyn; Zhang, Jiayi; Pankiewicz, Maciej; Nasiar, Nidhi; Wei, Zhanlan

doi:10.18608/jla.2025.8575

Citation Details

This content will become publicly available on March 27, 2026

Qualitative Coding with GPT-4: Where it Works Better

This study explores the potential of the large language model GPT-4 as an automated tool for qualitative data analysis by educational researchers, exploring which techniques are most successful for different types of constructs. Specifically, we assess three different prompt engineering strategies — Zero-shot, Few-shot, and Few-shot with contextual information — as well as the use of embeddings. We do so in the context of qualitatively coding three distinct educational datasets: Algebra I semi-personalized tutoring session transcripts, student observations in a game-based learning environment, and debugging behaviours in an introductory programming course. We evaluated the performance of each approach based on its inter-rater agreement with human coders and explored how different methods vary in effectiveness depending on a construct’s degree of clarity, concreteness, objectivity, granularity, and specificity. Our findings suggest that while GPT-4 can code a broad range of constructs, no single method consistently outperforms the others, and the selection of a particular method should be tailored to the specific properties of the construct and context being analyzed. We also found that GPT-4 has the most difficulty with the same constructs than human coders find more difficult to reach inter-rater reliability on. more »

Award ID(s):: 2301172

PAR ID:: 10600281

Author(s) / Creator(s):: Liu, Xiner; Zambrano, Andres Felipe; Baker, Ryan S; Barany, Amanda; Ocumpaugh, Jaclyn; Zhang, Jiayi; Pankiewicz, Maciej; Nasiar, Nidhi; Wei, Zhanlan

Publisher / Repository:: Journal of Learning Analytics

Date Published:: 2025-03-27

Journal Name:: Journal of Learning Analytics

Volume:: 12

Issue:: 1

ISSN:: 1929-7750

Page Range / eLocation ID:: 169 to 185

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
This content will become publicly available on March 27, 2026
Journal Article:
https://doi.org/10.18608/jla.2025.8575

More Like this