Title: Stumbling Our Way Through Finding a Better Prompt: Using GPT-4 to Analyze Engineering Faculty’s Mental Models of Assessment
Abstract: In this full research paper, we discuss the benefits and challenges of using GPT-4 to perform qualitative analysis to identify faculty's mental models of assessment. Assessments play an important role in engineering education: they are used to evaluate student learning, measure progress, and identify areas for improvement. However, how faculty members approach assessment can vary based on several factors, including their own mental models of assessment. To understand the variation in these mental models, we conducted interviews with faculty members in various engineering disciplines at universities across the United States; data were collected from 28 participants at 18 different universities. The interviews consisted of questions designed to elicit information related to the components of mental models (state, form, function, and purpose) of assessment of students in their classrooms. For this paper, we analyzed the interviews to identify entities and entity relationships in participant statements using natural language processing, with GPT-4 as our language model, and created graphical representations to characterize and compare individuals' mental models of assessment using GraphViz. We prompted GPT-4 with instructional prompts to extract entities and their relationships from interview excerpts, then compared its output on a small portion of our data to the entities and relationships extracted manually by one of our researchers. We found that the two methods identified overlapping entity relationships, but each also discovered entities and relationships that the other missed: GPT-4 tended to identify more basic relationships, while manual analysis identified more nuanced ones. Our results do not currently support using GPT-4 to automatically generate graphical representations of faculty's mental models of assessment; however, a human-in-the-loop process could help offset GPT-4's limitations. We also discuss plans for future work to improve on GPT-4's current performance.
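The abstract describes prompting GPT-4 to extract entity-relationship pairs from interview excerpts and rendering the result as a graph with GraphViz. The authors' actual prompts and pipeline are not reproduced in this record, so the following is only a minimal sketch of that kind of workflow, assuming the OpenAI Python SDK (chat completions), the graphviz Python package, and a pipe-delimited output format chosen purely for illustration; the excerpt and prompt wording are invented.

    # Minimal sketch, not the authors' published pipeline: ask GPT-4 for
    # "entity | relationship | entity" triples, then draw them with graphviz.
    # Assumes OPENAI_API_KEY is set and the openai and graphviz packages
    # (plus the Graphviz binaries) are installed; excerpt and prompt are invented.
    from openai import OpenAI
    import graphviz

    client = OpenAI()

    excerpt = ("I use weekly quizzes mostly to check whether students are keeping up, "
               "and the final exam to document what they actually learned.")

    prompt = (
        "Extract the entities and the relationships between them from the interview "
        "excerpt below. Return one triple per line in the form:\n"
        "entity 1 | relationship | entity 2\n\n"
        f"Excerpt: {excerpt}"
    )

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )

    graph = graphviz.Digraph("mental_model")
    for line in response.choices[0].message.content.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:  # skip anything that is not a clean triple
            head, relation, tail = parts
            graph.edge(head, tail, label=relation)

    print(graph.source)  # DOT text; graph.render("mental_model", format="png") writes an image

Requesting a rigid line-per-triple format keeps parsing trivial and leaves room to place a human reviewer between the model output and the rendered graph, in line with the human-in-the-loop process the abstract suggests.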
Award ID(s):
2113631
PAR ID:
10525802
Publisher / Repository:
American Society for Engineering Education Annual Conference and Exposition
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This full research paper documents assessment definitions from engineering faculty members, mainly at Research 1 universities. Assessments are essential components of the engineering learning environment, yet how engineering faculty make decisions about assessments in their classrooms is a relatively understudied topic in engineering education research. Exploring how engineering faculty think about and implement assessments through the mental model framework can help address this research gap. The research documented in this paper analyzes data from an informational questionnaire, part of a larger study, to understand how participants define assessment, using methods inspired by mixed-methods strategies: descriptive statistics on demographic findings, and Natural Language Processing (NLP) and coding of the open-ended question asking participants to define assessment, which yielded cluster themes characterizing the definitions. Findings show that while many participants defined assessment in relation to measuring student learning, other substantial aspects include benchmarking, assessing student ability and competence, and formal evaluation for quality. These findings serve as foundational knowledge toward deeper exploration and understanding of engineering faculty's assessment mental models, and begin to address the aforementioned research gap on faculty assessment decisions in classrooms.
  2. This Work-in-Progress paper studies engineering faculty's mental models of assessment, focusing on their use of metaphors. Assessments are crucial components of courses, serving various purposes in the learning and teaching process, such as gauging student learning, evaluating instructors and course design, and documenting learning for accountability. Thus, assessments should be a consistent consideration in faculty development on teaching whenever pedagogical improvements are discussed. To contribute to faculty development research, our study illuminates several metaphors engineering faculty use to discuss assessment concepts and knowledge. This paper helps answer the research question: Which metaphors do faculty use when talking about assessment in their classrooms? Through interviews grounded in mental model theory, six metaphors emerged: (1) cooking, (2) playing golf, (3) driving a car, (4) coaching football, (5) blood tests, and (6) generically playing a sport or an instrument. Two important takeaways stemmed from the analysis. First, these metaphors draw on experiences commonly portrayed in the culture in which the study took place, which matters for those working in faculty development because such metaphors may create communication challenges. Second, the mental model approach showed potential for eliciting the ways engineering faculty describe and discuss assessments, offering opportunities for future research and practice in faculty development. The lightning talk will present further details on the findings.
  3. Industry-funded research poses a threat to the validity of scientific inference on carcinogenic hazards. Scientists require tools to better identify and characterize industry-sponsored research across bodies of evidence to reduce the possible influence of industry bias in evidence synthesis reviews. We applied a novel large language model (LLM)-based tool named InfluenceMapper to demonstrate and evaluate its performance in identifying relationships to industry in research on the carcinogenicity of benzene, cobalt, and aspartame. Methods: All epidemiological, animal cancer, and mechanistic studies included in systematic reviews on the carcinogenicity of the three agents by the IARC Monographs programme. Selected agents were recently evaluated by the Monographs and are of commercial interest to major industries. InfluenceMapper extracted disclosed entities in study publications and classified up to 40 possible disclosed relationship types between each entity and the study and between each entity and author. A human classified entities as 'industry or industry-funded' and assessed relationships with industry for potential conflicts of interest. Positive predictive values described the extent of true positive relationships identified by InfluenceMapper compared to human assessment. Results: Analyses included 2,046 studies for all three agents. We identified 320 disclosed industry or industry-funded entities from InfluenceMapper output that were involved in 770 distinct study-entity and author-entity relationships. For each agent, between 4 and 8% of studies disclosed funding by industry and 1–4% of studies had at least one author who disclosed receiving industry funding directly. Industry trade associations for all three agents funded 22 studies published in 16 journals over a 37-year span. Aside from funding, the most prevalent disclosed relationships with industry were receiving data, holding employment, paid consulting, and providing expert testimony. Positive predictive values were excellent (≥ 98%) for study-entity relationships but declined for relationships with individual authors. Conclusions: LLM-based tools can significantly expedite and bolster the detection of disclosed conflicts of interest from industry-sponsored research in cancer prevention. Possible use cases include facilitating the assessment of bias from industry studies in evidence synthesis reviews and alerting scientists to the influence of industry on scientific inference. Persistent challenges in ascertaining conflicts of interest underscore the urgent need for standardized, transparent, and enforceable disclosures in biomedical journals.
  4. Calzolari, Nicoletta; Kan, Min-Yen; Hoste, Veronique; Lenci, Alessandro; Sakti, Sakriani; Xue, Nianwen (Ed.)
    In this work, we revisit the problem of semi-supervised named entity recognition (NER) focusing on extremely light supervision, consisting of a lexicon containing only 10 examples per class. We introduce ELLEN, a simple, fully modular, neuro-symbolic method that blends fine-tuned language models with linguistic rules. These rules include insights such as "One Sense Per Discourse", using a Masked Language Model as an unsupervised NER, leveraging part-of-speech tags to identify and eliminate unlabeled entities as false negatives, and other intuitions about classifier confidence scores in local and global context. ELLEN achieves very strong performance on the CoNLL-2003 dataset when using the minimal supervision from the lexicon above. It also outperforms most existing (and considerably more complex) semi-supervised NER methods under the same supervision settings commonly used in the literature (i.e., 5% of the training data). Further, we evaluate our CoNLL-2003 model in a zero-shot scenario on WNUT-17 where we find that it outperforms GPT-3.5 and achieves comparable performance to GPT-4. In a zero-shot setting, ELLEN also achieves over 75% of the performance of a strong, fully supervised model trained on gold data. Our code is available at: https://github.com/hriaz17/ELLEN
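The last item above mentions linguistic rules such as "One Sense Per Discourse". As a rough illustration only, and not ELLEN's implementation (mentions are simplified here to hypothetical (text, label) pairs), the heuristic can be read as: once a mention string receives an entity type somewhere in a document, propagate that type to its unlabeled occurrences in the same document.

    # Rough illustration of a "One Sense Per Discourse" heuristic for NER;
    # not the ELLEN implementation. Mentions are simplified to (text, label) pairs.
    from collections import Counter

    def one_sense_per_discourse(doc_mentions):
        """doc_mentions: list of (mention_text, label_or_None) for a single document."""
        votes = {}
        for text, label in doc_mentions:
            if label is not None:
                votes.setdefault(text, Counter())[label] += 1
        # Majority label per mention string within this document.
        resolved = {text: counts.most_common(1)[0][0] for text, counts in votes.items()}
        # Propagate that label to unlabeled occurrences of the same string.
        return [(text, label if label is not None else resolved.get(text))
                for text, label in doc_mentions]

    doc = [("Columbia", "ORG"), ("Columbia", None), ("Smith", None)]
    print(one_sense_per_discourse(doc))
    # [('Columbia', 'ORG'), ('Columbia', 'ORG'), ('Smith', None)]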