Stumbling Our Way Through Finding a Better Prompt: Using GPT-4 to Analyze Engineering Faculty's Mental Models of Assessment
In this full research paper, we discuss the benefits and challenges of using GPT-4 to perform qualitative analysis that identifies faculty's mental models of assessment. Assessments play an important role in engineering education: they are used to evaluate student learning, measure progress, and identify areas for improvement. However, how faculty members approach assessment can vary based on several factors, including their own mental models of assessment. To understand the variation in these mental models, we conducted interviews with faculty members in various engineering disciplines at universities across the United States, collecting data from 28 participants at 18 different universities. The interviews consisted of questions designed to elicit the components of mental models (state, form, function, and purpose) of the assessment of students in their classrooms. For this paper, we used natural language processing with GPT-4 as our language model: we gave GPT-4 instructional prompts asking it to extract the entities and entity relationships in participant statements from interview excerpts, and we then created graphical representations with Graphviz to characterize and compare individuals' mental models of assessment. We compared GPT-4's results on a small portion of our data to entities and relationships extracted manually by one of our researchers. Both methods identified overlapping entity relationships, but each also discovered entities and relationships not identified by the other. GPT-4 tended to identify more basic relationships, while manual analysis identified more nuanced relationships. Our results do not currently support using GPT-4 to automatically generate graphical representations of faculty's mental models of assessment; however, a human-in-the-loop process could help offset GPT-4's limitations. We also discuss plans for future work to improve on GPT-4's current performance.
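As an illustration of the extract-then-graph pipeline the abstract describes, the sketch below prompts a GPT-4 model for (source, relation, target) triples and renders them with Graphviz. This is a minimal sketch, not the authors' actual materials: the prompt wording, the JSON schema, and the model settings are illustrative assumptions, and it presumes the `openai` (v1) and `graphviz` Python packages plus a local Graphviz installation.

```python
# Minimal sketch of the extract-then-graph pipeline described above. The
# prompt wording, JSON schema, and model settings are illustrative
# assumptions, not the authors' actual prompts.
import json

import graphviz
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def build_prompt(excerpt: str) -> str:
    return (
        "Extract the entities and the relationships between them from the "
        "interview excerpt below. Respond only with JSON of the form "
        '{"relationships": [{"source": "...", "relation": "...", "target": "..."}]}.'
        "\n\nExcerpt:\n" + excerpt
    )


def extract_relationships(excerpt: str) -> list[dict]:
    """Ask GPT-4 for (source, relation, target) triples found in an excerpt."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": build_prompt(excerpt)}],
        temperature=0,
    )
    # Assumes the model returned valid JSON; real use should validate.
    return json.loads(response.choices[0].message.content)["relationships"]


def render_mental_model(triples: list[dict], name: str) -> None:
    """Render extracted triples as a directed graph (one node per entity)."""
    dot = graphviz.Digraph(name)
    for t in triples:
        dot.edge(t["source"], t["target"], label=t["relation"])
    dot.render(name, format="png", cleanup=True)


triples = extract_relationships(
    "I use weekly quizzes to see whether students are keeping up."
)
render_mental_model(triples, "participant_01")
```

A human-in-the-loop variant, as the abstract suggests, would pause between the two steps so a researcher can correct or prune the extracted triples before rendering.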
- Award ID(s):
- 2113631
- PAR ID:
- 10520470
- Publisher / Repository:
- American Society for Engineering Education Annual Conference and Exposition
- Date Published:
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
This full research paper documents assessment definitions from engineering faculty members, mainly from Research 1 universities. Assessments are essential components of the engineering learning environment, yet how engineering faculty make decisions about assessments in their classrooms is a relatively understudied topic in engineering education research. Exploring how engineering faculty think about and implement assessments through the mental model framework can help address this research gap. The research documented in this paper analyzes data from an informational questionnaire, part of a larger study of how participants define assessments, using methods inspired by mixed-methods strategies. These strategies include descriptive statistics on demographic findings, along with Natural Language Processing (NLP) and coding of the open-ended question asking participants to define assessment, which yielded cluster themes that characterize the definitions (a small illustrative sketch of such clustering follows this list). Findings show that while many participants defined assessments in relation to measuring student learning, other substantial aspects include benchmarking, assessing student ability and competence, and formal evaluation for quality. These findings serve as foundational knowledge toward deeper exploration and understanding of engineering faculty's mental models of assessment, which can begin to address the aforementioned research gap on faculty assessment decisions in classrooms.
- 
This Work-in-Progress paper studies the mental models of engineering faculty regarding assessment, focusing on their use of metaphors. Assessments are crucial components of courses because they serve various purposes in the learning and teaching process, such as gauging student learning, evaluating instructors and course design, and documenting learning for accountability. Thus, when it comes to faculty development on teaching, assessments should consistently be considered while discussing pedagogical improvements. To contribute to faculty development research, our study illuminates several metaphors engineering faculty use to discuss assessment concepts and knowledge. This paper helps to answer the research question: which metaphors do faculty use when talking about assessment in their classrooms? Through interviews grounded in mental model theory, six metaphors emerged: (1) cooking, (2) playing golf, (3) driving a car, (4) coaching football, (5) blood tests, and (6) generically playing a sport or an instrument. Two important takeaways stemmed from the analysis. First, these metaphors were experiences commonly portrayed in the culture in which the study took place; this is important for someone working in faculty development to note, as the metaphors may create communication challenges. Second, the mental model approach showed potential for eliciting the ways engineering faculty describe and discuss assessments, offering opportunities for future research and practice in faculty development. The lightning talk will present further details on the findings.
- 
Calzolari, Nicoletta; Kan, Min-Yen; Hoste, Veronique; Lenci, Alessandro; Sakti, Sakriani; Xue, Nianwen (Eds.) In this work, we revisit the problem of semi-supervised named entity recognition (NER), focusing on extremely light supervision consisting of a lexicon containing only 10 examples per class. We introduce ELLEN, a simple, fully modular, neuro-symbolic method that blends fine-tuned language models with linguistic rules. These rules include insights such as "One Sense Per Discourse", using a masked language model as an unsupervised NER, leveraging part-of-speech tags to identify and eliminate unlabeled entities that are false negatives, and other intuitions about classifier confidence scores in local and global context (the second sketch after this list illustrates two of these rules). ELLEN achieves very strong performance on the CoNLL-2003 dataset when using the minimal supervision from the lexicon above. It also outperforms most existing (and considerably more complex) semi-supervised NER methods under the supervision settings commonly used in the literature (i.e., 5% of the training data). Further, we evaluate our CoNLL-2003 model in a zero-shot scenario on WNUT-17, where we find that it outperforms GPT-3.5 and achieves performance comparable to GPT-4. In a zero-shot setting, ELLEN also achieves over 75% of the performance of a strong, fully supervised model trained on gold data. Our code is available at: https://github.com/hriaz17/ELLEN
- 
Because of the large number of entities in biomedical knowledge bases, only a small fraction of entities have corresponding labelled training data. This necessitates entity linking models that can link mentions of unseen entities using learned representations of entities. Previous approaches link each mention independently, ignoring the relationships between entity mentions within and across documents. These relations can be very useful for linking mentions in biomedical text, where linking decisions are often difficult because mentions take a generic or a highly specialized form. In this paper, we introduce a model in which linking decisions can be made not merely by linking to a knowledge base entity but also by grouping multiple mentions together via clustering and jointly making linking predictions (the third sketch after this list gives the flavor of this idea). In experiments on the largest publicly available biomedical dataset, we improve the best independent-prediction accuracy for entity linking by 3.0 points, and our clustering-based inference model further improves entity linking by 2.3 points.
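The first sketch below illustrates the kind of NLP clustering mentioned in the questionnaire paper above: grouping free-text assessment definitions into themes. It is a minimal sketch under assumed tooling (scikit-learn, a TF-IDF plus k-means pipeline, and an illustrative cluster count), not the authors' exact method, and the sample definitions are invented for illustration.

```python
# Minimal sketch: cluster open-ended assessment definitions into themes.
# The pipeline and cluster count are illustrative assumptions.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

definitions = [
    "Assessment is how I measure student learning against course outcomes.",
    "A benchmark to compare student performance across sections.",
    "A formal evaluation of the quality of student work.",
    # ... one free-text definition per participant
]

# Represent each definition as a TF-IDF vector, dropping common stopwords.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(definitions)

# Group similar definitions; the paper reports themes such as measuring
# learning, benchmarking, and formal evaluation for quality.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Inspect the most heavily weighted terms per cluster to name the themes.
terms = vectorizer.get_feature_names_out()
for i, center in enumerate(kmeans.cluster_centers_):
    top = center.argsort()[-5:][::-1]
    print(f"cluster {i}:", ", ".join(terms[j] for j in top))
```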
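The second sketch illustrates two of ELLEN's rule-based ingredients: seeding labels from a tiny per-class lexicon and propagating them with a "One Sense Per Discourse" rule. The lexicon contents and the simplified token-level propagation are illustrative assumptions; the full neuro-symbolic method lives in the linked repository.

```python
# Minimal sketch of two ELLEN-style ingredients; lexicon contents and the
# token-level simplification are illustrative, not the paper's exact rules.
LEXICON = {  # roughly 10 examples per class in the actual setting
    "PER": {"clinton", "dole"},
    "LOC": {"germany", "france"},
    "ORG": {"reuters", "nato"},
}


def seed_labels(tokens: list[str]) -> list[str]:
    """Label any token that appears in the seed lexicon; others stay 'O'."""
    labels = ["O"] * len(tokens)
    for i, tok in enumerate(tokens):
        for cls, names in LEXICON.items():
            if tok.lower() in names:
                labels[i] = cls
    return labels


def one_sense_per_discourse(tokens: list[str], labels: list[str]) -> list[str]:
    """Within one document, copy a token's label onto its unlabeled repeats.

    This matters most when labels come from a confidence-thresholded model,
    so some occurrences of a word are labeled while others are not.
    """
    seen = {t.lower(): l for t, l in zip(tokens, labels) if l != "O"}
    return [seen.get(t.lower(), l) for t, l in zip(tokens, labels)]


tokens = "Clinton met Dole , then flew to France .".split()
print(one_sense_per_discourse(tokens, seed_labels(tokens)))
```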
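The third sketch gives the flavor of clustering-based joint entity linking: mentions are embedded, clustered, and linked as a group, so difficult mentions can inherit the link chosen for their clearer neighbors. The embeddings, clustering threshold, and dot-product scoring are illustrative stand-ins for the paper's learned components.

```python
# Minimal sketch of clustering-based joint entity linking; the vectors,
# threshold, and scoring are illustrative stand-ins for learned components.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def link_jointly(mention_vecs: np.ndarray,
                 entity_vecs: np.ndarray,
                 distance_threshold: float = 0.5) -> list[int]:
    """Return one KB entity index per mention, decided per cluster."""
    # Group mentions that likely refer to the same entity.
    clusters = fcluster(
        linkage(mention_vecs, method="average"),
        t=distance_threshold, criterion="distance",
    )
    links = [0] * len(mention_vecs)
    for c in set(clusters):
        members = np.where(clusters == c)[0]
        # Link the whole cluster to the entity that best matches its centroid,
        # so ambiguous mentions inherit the decision of clearer neighbors.
        centroid = mention_vecs[members].mean(axis=0)
        best = int(np.argmax(entity_vecs @ centroid))
        for m in members:
            links[m] = best
    return links


mentions = np.array([[0.90, 0.10], [0.88, 0.12], [0.10, 0.90]])
kb = np.array([[1.0, 0.0], [0.0, 1.0]])
print(link_jointly(mentions, kb))  # first two mentions share a link: [0, 0, 1]
```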