Search for: All records

Creators/Authors contains: "Wan, Tong"


  1. This study examines the feasibility and potential advantages of using large language models, in particular GPT-4o, to perform partial-credit grading of large numbers of student written responses to introductory-level physics problems. Students were instructed to write down verbal explanations of their reasoning process when solving one conceptual and two numerical calculation problems on two exams. The explanations were then graded according to a three-item rubric, with each item graded as binary (1 or 0). We first demonstrate that machine grading using GPT-4o with no examples or reference answers can reliably agree with human graders in 70%–80% of all cases, which is equal to or higher than the rate at which two human graders agree with each other. Two methods are essential for achieving this level of accuracy: (i) adding explanation language to each rubric item that targets the errors of initial machine grading, and (ii) running the grading process five times and taking the most frequent outcome. Next, we show that the variation in outcomes across five machine grading attempts can serve as a grading confidence index. The index allows a human expert to identify 40% of all potentially incorrect gradings by reviewing just the 10%–15% of responses with the highest variation. Finally, we show that it is straightforward to use GPT-4o to write a clear and detailed explanation of the partial-credit grading outcome. Those explanations can be used as feedback for students, allowing them to understand their grades and raise objections when necessary. Almost all feedback messages generated were rated three or above on a five-point scale by two instructors who had taught the course multiple times. The entire grading and feedback-generation process costs roughly $5 per 100 student answers, which shows immense promise for automating a labor-intensive grading process through a combination of machine grading with human input and supervision.
Published by the American Physical Society, 2025
    Free, publicly-accessible full text available March 1, 2026
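The majority-vote grading and confidence index described in the first abstract (run the grading five times, take the most frequent rubric outcome, and flag responses where the runs disagree) can be sketched as follows. This is an illustrative sketch, not code from the paper: the function name `majority_grade` and the representation of each run as a tuple of binary rubric scores are assumptions.

```python
from collections import Counter

def majority_grade(attempts):
    """Take the most frequent rubric outcome across repeated gradings.

    `attempts` is a list of rubric-score tuples, e.g. (1, 0, 1), one per
    machine-grading run (the study uses five runs per response).
    """
    counts = Counter(attempts)
    outcome, freq = counts.most_common(1)[0]
    # Confidence index: fraction of runs agreeing with the majority outcome.
    # Low agreement flags a response for human review.
    confidence = freq / len(attempts)
    return outcome, confidence

# Example: five hypothetical grading runs for one student response
runs = [(1, 0, 1), (1, 0, 1), (1, 1, 1), (1, 0, 1), (1, 0, 0)]
grade, conf = majority_grade(runs)
# grade == (1, 0, 1), conf == 0.6
```

Sorting responses by ascending `conf` would then let a human reviewer concentrate on the 10%–15% with the highest variation, as the abstract describes.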
  2. Dyan Jones, Qing X. (Ed.)
    Despite its positive effects on student learning outcomes and engagement, active learning has been shown to potentially increase student anxiety due to a fear of negative evaluation. A pedagogical strategy proposed to mitigate this issue is error framing, in which instructors encourage students to perceive errors as a natural part of the learning process. Previous work on this project investigated how graduate teaching assistants (GTAs) operationalized error framing during their training in a mixed-reality simulator but did not investigate their use of it in their classrooms. This analysis characterizes the error framing statements made by GTAs during a set of classroom observations. We find that GTAs who employ error framing effectively avoid statements that might decrease student comfort and instead tend toward implicit, indirect strategies.
  3. Abstract
     Background: In college science laboratory and discussion sections, student-centered active learning strategies have been implemented to improve student learning outcomes and experiences. Research has shown that active learning activities can increase student anxiety if students fear that they could be negatively evaluated by their peers. Error framing (i.e., framing errors as natural and beneficial to learning) is proposed in the literature as a pedagogical tool to reduce student anxiety. However, little research empirically explores how an instructor can operationalize error framing or how error framing is perceived by undergraduate students. To bridge this gap in the literature, we conducted a two-stage study that involved science graduate teaching assistants (GTAs) and undergraduate students. In stage one, we introduced cold calling (i.e., calling on non-volunteering students) and error framing to 12 chemistry and 11 physics GTAs. Cold calling can increase student participation but may increase student anxiety; error framing has the potential to mitigate that anxiety when paired with cold calling. GTAs were then tasked to rehearse cold calling paired with error framing in a mixed-reality classroom simulator. We identified GTA statements that aligned with the definition of error framing. In stage two, we selected a few example GTA error framing statements and interviewed 13 undergraduate students about their perceptions of those statements.
     Results: In the simulator, all the GTAs rehearsed cold calling multiple times, while only a few GTAs made error framing statements. A thematic analysis of GTAs’ error framing statements identified ways of error indication (i.e., explicit and implicit) and framing (i.e., natural, beneficial, and positive acknowledgement). Undergraduate student interviews revealed specific framings and tones that are perceived as increasing or decreasing student comfort in participating in classroom discourse. Both undergraduate students and some GTAs expressed negative opinions toward responses that explicitly indicate student mistakes. Undergraduate students’ perspectives also suggest that error framing should be implemented differently depending on whether errors have already occurred.
     Conclusion: Error framing is challenging for science GTAs to implement. GTAs’ operationalizations of error framing in the simulator and undergraduate students’ perceptions contribute to defining and operationalizing error framing for instructional practice. To increase undergraduate student comfort in science classroom discourse, GTAs can use implicit error indication. In response to students’ incorrect answers, GTAs can positively frame students’ specific ideas rather than discussing broadly how errors are natural or beneficial.
  4.