Stumbling Our Way Through Finding a Better Prompt: Using GPT-4 to Analyze Engineering Faculty's Mental Models of Assessment
In this full research paper, we discuss the benefits and challenges of using GPT-4 to perform qualitative analysis that identifies faculty's mental models of assessment. Assessments play an important role in engineering education: they are used to evaluate student learning, measure progress, and identify areas for improvement. However, how faculty members approach assessment can vary based on several factors, including their own mental models of assessment. To understand the variation in these mental models, we conducted interviews with faculty members in various engineering disciplines at universities across the United States, collecting data from 28 participants at 18 different universities. The interviews consisted of questions designed to elicit the components of mental models (state, form, function, and purpose) of the assessment of students in their classrooms. For this paper, we used natural language processing with GPT-4 as our language model: we gave GPT-4 instructional prompts asking it to extract the entities and entity relationships in participant statements from interview excerpts, and we then created graphical representations with Graphviz to characterize and compare individuals' mental models of assessment. We compared GPT-4's results on a small portion of our data to entities and relationships extracted manually by one of our researchers. Both methods identified overlapping entity relationships, but each also discovered entities and relationships not identified by the other. GPT-4 tended to identify more basic relationships, while manual analysis identified more nuanced relationships. Our results do not currently support using GPT-4 to automatically generate graphical representations of faculty's mental models of assessment; however, a human-in-the-loop process could help offset GPT-4's limitations. We also discuss plans for future work to improve on GPT-4's current performance.
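As an illustration of the extract-then-graph pipeline the abstract describes, the sketch below prompts a GPT-4 model for (source, relation, target) triples and renders them with Graphviz. This is a minimal sketch, not the authors' actual materials: the prompt wording, the JSON schema, and the model settings are illustrative assumptions, and it presumes the `openai` (v1) and `graphviz` Python packages plus a local Graphviz installation.

```python
# Minimal sketch of the extract-then-graph pipeline described above. The
# prompt wording, JSON schema, and model settings are illustrative
# assumptions, not the authors' actual prompts.
import json

import graphviz
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def build_prompt(excerpt: str) -> str:
    return (
        "Extract the entities and the relationships between them from the "
        "interview excerpt below. Respond only with JSON of the form "
        '{"relationships": [{"source": "...", "relation": "...", "target": "..."}]}.'
        "\n\nExcerpt:\n" + excerpt
    )


def extract_relationships(excerpt: str) -> list[dict]:
    """Ask GPT-4 for (source, relation, target) triples found in an excerpt."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": build_prompt(excerpt)}],
        temperature=0,
    )
    # Assumes the model returned valid JSON; real use should validate.
    return json.loads(response.choices[0].message.content)["relationships"]


def render_mental_model(triples: list[dict], name: str) -> None:
    """Render extracted triples as a directed graph (one node per entity)."""
    dot = graphviz.Digraph(name)
    for t in triples:
        dot.edge(t["source"], t["target"], label=t["relation"])
    dot.render(name, format="png", cleanup=True)


triples = extract_relationships(
    "I use weekly quizzes to see whether students are keeping up."
)
render_mental_model(triples, "participant_01")
```

A human-in-the-loop variant, as the abstract suggests, would pause between the two steps so a researcher can correct or prune the extracted triples before rendering.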
- Award ID(s):
- 2113631
- PAR ID:
- 10520470
- Publisher / Repository:
- American Society for Engineering Education Annual Conference and Exposition
- Date Published:
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
This full research paper documents assessment definitions from engineering faculty members, mainly from Research 1 universities. Assessments are essential components of the engineering learning environment, yet how engineering faculty make decisions about assessments in their classrooms is a relatively understudied topic in engineering education research. Exploring how engineering faculty think about and implement assessments through the mental model framework can help address this research gap. The research documented in this paper analyzes data from an informational questionnaire, part of a larger study of how participants define assessments, using methods inspired by mixed-methods strategies. These strategies include descriptive statistics on demographic findings, along with Natural Language Processing (NLP) and coding of the open-ended question asking participants to define assessment, which yielded cluster themes that characterize the definitions (a small illustrative sketch of such clustering follows this list). Findings show that while many participants defined assessments in relation to measuring student learning, other substantial aspects include benchmarking, assessing student ability and competence, and formal evaluation for quality. These findings serve as foundational knowledge toward deeper exploration and understanding of engineering faculty's mental models of assessment, which can begin to address the aforementioned research gap on faculty assessment decisions in classrooms.
- 
This Work-in-Progress paper studies the mental models of engineering faculty regarding assessment, focusing on their use of metaphors. Assessments are crucial components of courses because they serve various purposes in the learning and teaching process, such as gauging student learning, evaluating instructors and course design, and documenting learning for accountability. Thus, when it comes to faculty development on teaching, assessments should consistently be considered while discussing pedagogical improvements. To contribute to faculty development research, our study illuminates several metaphors engineering faculty use to discuss assessment concepts and knowledge. This paper helps to answer the research question: which metaphors do faculty use when talking about assessment in their classrooms? Through interviews grounded in mental model theory, six metaphors emerged: (1) cooking, (2) playing golf, (3) driving a car, (4) coaching football, (5) blood tests, and (6) generically playing a sport or an instrument. Two important takeaways stemmed from the analysis. First, these metaphors were experiences commonly portrayed in the culture in which the study took place; this is important for someone working in faculty development to note, as the metaphors may create communication challenges. Second, the mental model approach showed potential for eliciting the ways engineering faculty describe and discuss assessments, offering opportunities for future research and practice in faculty development. The lightning talk will present further details on the findings.
- 
Calzolari, Nicoletta; Kan, Min-Yen; Hoste, Veronique; Lenci, Alessandro; Sakti, Sakriani; Xue, Nianwen (Eds.) In this work, we revisit the problem of semi-supervised named entity recognition (NER), focusing on extremely light supervision consisting of a lexicon containing only 10 examples per class. We introduce ELLEN, a simple, fully modular, neuro-symbolic method that blends fine-tuned language models with linguistic rules. These rules include insights such as "One Sense Per Discourse", using a masked language model as an unsupervised NER, leveraging part-of-speech tags to identify and eliminate unlabeled entities that are false negatives, and other intuitions about classifier confidence scores in local and global context (the second sketch after this list illustrates two of these rules). ELLEN achieves very strong performance on the CoNLL-2003 dataset when using the minimal supervision from the lexicon above. It also outperforms most existing (and considerably more complex) semi-supervised NER methods under the supervision settings commonly used in the literature (i.e., 5% of the training data). Further, we evaluate our CoNLL-2003 model in a zero-shot scenario on WNUT-17, where we find that it outperforms GPT-3.5 and achieves performance comparable to GPT-4. In a zero-shot setting, ELLEN also achieves over 75% of the performance of a strong, fully supervised model trained on gold data. Our code is available at: https://github.com/hriaz17/ELLEN
- 
Because of the large number of entities in biomedical knowledge bases, only a small fraction of entities have corresponding labelled training data. This necessitates entity linking models that can link mentions of unseen entities using learned representations of entities. Previous approaches link each mention independently, ignoring the relationships between entity mentions within and across documents. These relations can be very useful for linking mentions in biomedical text, where linking decisions are often difficult because mentions take a generic or a highly specialized form. In this paper, we introduce a model in which linking decisions can be made not merely by linking to a knowledge base entity but also by grouping multiple mentions together via clustering and jointly making linking predictions (the third sketch after this list gives the flavor of this idea). In experiments on the largest publicly available biomedical dataset, we improve the best independent-prediction accuracy for entity linking by 3.0 points, and our clustering-based inference model further improves entity linking by 2.3 points.
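The first sketch below illustrates the kind of NLP clustering mentioned in the questionnaire paper above: grouping free-text assessment definitions into themes. It is a minimal sketch under assumed tooling (scikit-learn, a TF-IDF plus k-means pipeline, and an illustrative cluster count), not the authors' exact method, and the sample definitions are invented for illustration.

```python
# Minimal sketch: cluster open-ended assessment definitions into themes.
# The pipeline and cluster count are illustrative assumptions.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

definitions = [
    "Assessment is how I measure student learning against course outcomes.",
    "A benchmark to compare student performance across sections.",
    "A formal evaluation of the quality of student work.",
    # ... one free-text definition per participant
]

# Represent each definition as a TF-IDF vector, dropping common stopwords.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(definitions)

# Group similar definitions; the paper reports themes such as measuring
# learning, benchmarking, and formal evaluation for quality.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Inspect the most heavily weighted terms per cluster to name the themes.
terms = vectorizer.get_feature_names_out()
for i, center in enumerate(kmeans.cluster_centers_):
    top = center.argsort()[-5:][::-1]
    print(f"cluster {i}:", ", ".join(terms[j] for j in top))
```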
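The second sketch illustrates two of ELLEN's rule-based ingredients: seeding labels from a tiny per-class lexicon and propagating them with a "One Sense Per Discourse" rule. The lexicon contents and the simplified token-level propagation are illustrative assumptions; the full neuro-symbolic method lives in the linked repository.

```python
# Minimal sketch of two ELLEN-style ingredients; lexicon contents and the
# token-level simplification are illustrative, not the paper's exact rules.
LEXICON = {  # roughly 10 examples per class in the actual setting
    "PER": {"clinton", "dole"},
    "LOC": {"germany", "france"},
    "ORG": {"reuters", "nato"},
}


def seed_labels(tokens: list[str]) -> list[str]:
    """Label any token that appears in the seed lexicon; others stay 'O'."""
    labels = ["O"] * len(tokens)
    for i, tok in enumerate(tokens):
        for cls, names in LEXICON.items():
            if tok.lower() in names:
                labels[i] = cls
    return labels


def one_sense_per_discourse(tokens: list[str], labels: list[str]) -> list[str]:
    """Within one document, copy a token's label onto its unlabeled repeats.

    This matters most when labels come from a confidence-thresholded model,
    so some occurrences of a word are labeled while others are not.
    """
    seen = {t.lower(): l for t, l in zip(tokens, labels) if l != "O"}
    return [seen.get(t.lower(), l) for t, l in zip(tokens, labels)]


tokens = "Clinton met Dole , then flew to France .".split()
print(one_sense_per_discourse(tokens, seed_labels(tokens)))
```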
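The third sketch gives the flavor of clustering-based joint entity linking: mentions are embedded, clustered, and linked as a group, so difficult mentions can inherit the link chosen for their clearer neighbors. The embeddings, clustering threshold, and dot-product scoring are illustrative stand-ins for the paper's learned components.

```python
# Minimal sketch of clustering-based joint entity linking; the vectors,
# threshold, and scoring are illustrative stand-ins for learned components.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def link_jointly(mention_vecs: np.ndarray,
                 entity_vecs: np.ndarray,
                 distance_threshold: float = 0.5) -> list[int]:
    """Return one KB entity index per mention, decided per cluster."""
    # Group mentions that likely refer to the same entity.
    clusters = fcluster(
        linkage(mention_vecs, method="average"),
        t=distance_threshold, criterion="distance",
    )
    links = [0] * len(mention_vecs)
    for c in set(clusters):
        members = np.where(clusters == c)[0]
        # Link the whole cluster to the entity that best matches its centroid,
        # so ambiguous mentions inherit the decision of clearer neighbors.
        centroid = mention_vecs[members].mean(axis=0)
        best = int(np.argmax(entity_vecs @ centroid))
        for m in members:
            links[m] = best
    return links


mentions = np.array([[0.90, 0.10], [0.88, 0.12], [0.10, 0.90]])
kb = np.array([[1.0, 0.0], [0.0, 1.0]])
print(link_jointly(mentions, kb))  # first two mentions share a link: [0, 0, 1]
```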