

Search results: Creators/Authors contains "Zhai, Xiaoming"


  1. Free, publicly-accessible full text available June 2, 2025
  2. Abstract

    Argumentation, a key scientific practice presented in the Framework for K-12 Science Education, requires students to construct and critique arguments, but timely evaluation of arguments in large-scale classrooms is challenging. Recent work has shown the potential of automated scoring systems for open-response assessments, leveraging machine learning (ML) and artificial intelligence (AI) to aid the scoring of written arguments in complex assessments. Moreover, research has emphasized that the features (i.e., complexity, diversity, and structure) of the assessment construct are critical to ML scoring accuracy, yet how the assessment construct may be associated with machine scoring accuracy remains unknown. This study investigated how the features associated with the assessment construct of a scientific argumentation assessment item affected machine scoring performance. Specifically, we conceptualized the construct in three dimensions: complexity, diversity, and structure. We employed human experts to code characteristics of the assessment tasks and score middle school student responses to 17 argumentation tasks aligned to three levels of a validated learning progression of scientific argumentation. We randomly selected 361 responses to use as training sets to build machine-learning scoring models for each item. The scoring models yielded a range of agreements with human consensus scores, measured by Cohen's kappa (mean = 0.60; range 0.38–0.89), indicating good to almost perfect performance. We found that higher levels of Complexity and Diversity of the assessment task were associated with decreased model performance; similarly, the relationship between levels of Structure and model performance showed a somewhat negative linear trend. These findings highlight the importance of considering these construct characteristics when developing ML models for scoring assessments, particularly for higher-complexity items and multidimensional assessments.

     
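The kappa values reported in the abstract above quantify agreement between machine-assigned and human consensus scores. As a minimal illustration only (not the study's scoring pipeline), the sketch below computes Cohen's kappa for a pair of hypothetical score vectors using scikit-learn.

```python
# Minimal sketch: Cohen's kappa between human consensus scores and
# machine-predicted scores. The score arrays are illustrative stand-ins.
from sklearn.metrics import cohen_kappa_score

# Hypothetical scores for ten responses on a 0-2 learning-progression scale.
human_scores = [0, 1, 2, 1, 0, 2, 1, 1, 2, 0]
machine_scores = [0, 1, 2, 2, 0, 2, 1, 0, 2, 0]

kappa = cohen_kappa_score(human_scores, machine_scores)
print(f"Cohen's kappa: {kappa:.2f}")

# Common rule-of-thumb interpretation (Landis & Koch):
#   0.41-0.60 moderate, 0.61-0.80 substantial, 0.81-1.00 almost perfect.
```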
  3. Abstract

    Most existing diagnostic models are developed to detect whether students have mastered a set of skills of interest, but few have focused on identifying what scientific misconceptions students possess. This article developed a general dual‐purpose model for simultaneously estimating students' overall ability and the presence and absence of misconceptions. The expectation‐maximization algorithm was developed to estimate the model parameters. A simulation study was conducted to evaluate to what extent the parameters can be accurately recovered under varied conditions. A set of real data in science education was also analyzed to examine the viability of the proposed model in practice.

     
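The abstract above describes an expectation-maximization (EM) routine for estimating the dual-purpose model's parameters but does not reproduce it. As a rough, generic illustration of how an EM cycle recovers latent-class parameters from simulated binary responses, here is a sketch of a plain two-class latent class model; it is not the authors' model, and all data-generating values are invented.

```python
# Illustrative EM for a simple two-class latent class model with binary items.
# This is NOT the paper's dual-purpose diagnostic model, only a generic
# demonstration of the E-step / M-step cycle on simulated data.
import numpy as np

rng = np.random.default_rng(0)

# Simulate data: 500 students, 8 binary items, two latent classes.
N, J = 500, 8
true_pi = 0.4                                  # proportion of students in class 1
true_p = np.array([[0.8] * J, [0.3] * J])      # P(correct response | class)
c = (rng.random(N) < true_pi).astype(int)      # latent class membership
X = (rng.random((N, J)) < true_p[c]).astype(float)

# EM initialization
pi = np.array([0.5, 0.5])
p = rng.uniform(0.3, 0.7, size=(2, J))

for _ in range(200):
    # E-step: posterior probability of each class for each student.
    log_lik = X @ np.log(p).T + (1 - X) @ np.log(1 - p).T   # shape (N, 2)
    log_post = log_lik + np.log(pi)
    log_post -= log_post.max(axis=1, keepdims=True)
    r = np.exp(log_post)
    r /= r.sum(axis=1, keepdims=True)

    # M-step: update class proportions and item-response probabilities.
    pi = r.mean(axis=0)
    p = np.clip((r.T @ X) / r.sum(axis=0)[:, None], 1e-6, 1 - 1e-6)

print("estimated class proportions:", np.round(pi, 2))
print("estimated P(correct) by class:", np.round(p, 2))
```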
  4. Abstract

    In response to Li, Reigh, He, and Miller's commentary, "Can we and should we use artificial intelligence for formative assessment in science," we argue that artificial intelligence (AI) is already being widely employed in formative assessment across various educational contexts. While agreeing with Li et al.'s call for further studies on equity issues related to AI, we emphasize the need for science educators to adapt to the AI revolution that has outpaced the research community. We challenge the somewhat restrictive view of formative assessment presented by Li et al., highlighting the significant contributions of AI in providing formative feedback to students, assisting teachers in assessment practices, and aiding in instructional decisions. We contend that AI-generated scores should not be equated with the entirety of formative assessment practice; no single assessment tool can capture all aspects of student thinking and backgrounds. We address concerns raised by Li et al. regarding AI bias and emphasize the importance of empirical testing and evidence-based arguments when making claims of bias. We assert that AI-based formative assessment does not necessarily lead to inequity and can, in fact, contribute to more equitable educational experiences. Furthermore, we discuss how AI can facilitate the diversification of representational modalities in assessment practices and highlight the potential benefits of AI in saving teachers' time and providing them with valuable assessment information. We call for a shift in perspective, from viewing AI as a problem to be solved to recognizing its potential as a collaborative tool in education. We emphasize the need for future research to focus on the effective integration of AI in classrooms, teacher education, and the development of AI systems that can adapt to diverse teaching and learning contexts. We conclude by underlining the importance of addressing AI bias, understanding its implications, and developing guidelines for best practices in AI-based formative assessment.

     
  5. Involving students in scientific modeling practice is one of the most effective approaches to achieving the next generation science education learning goals. Given the complexity and multirepresentational features of scientific models, scoring student-developed models is time- and cost-intensive, remaining one of the most challenging assessment practices for science education. More importantly, teachers who rely on timely feedback to plan and adjust instruction are reluctant to use modeling tasks because they cannot provide timely feedback to learners. This study utilized machine learning (ML), an advanced artificial intelligence (AI) technique, to develop an approach to automatically score student-drawn models and their written descriptions of those models. We developed six modeling assessment tasks for middle school students that integrate disciplinary core ideas and crosscutting concepts with the modeling practice. For each task, we asked students to draw a model and write a description of that model, which gave students with diverse backgrounds an opportunity to represent their understanding in multiple ways. We then collected student responses to the six tasks and had human experts score a subset of those responses. We used the human-scored student responses to develop ML algorithmic models (AMs) and to train the computer. Validation using new data suggests that the machine-assigned scores achieved robust agreement with human consensus scores. Qualitative analysis of student-drawn models further revealed five characteristics that might impact machine scoring accuracy: alternative expression, confusing label, inconsistent size, inconsistent position, and redundant information. We argue that these five characteristics should be considered when developing machine-scorable modeling tasks.
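The abstract above does not name the algorithms behind its scoring models, so the sketch below only illustrates one common way such a text scorer for written model descriptions could be built: TF-IDF features with logistic regression, trained on hypothetical human-scored responses. The data and pipeline choices are assumptions, not the study's.

```python
# Hypothetical sketch: training a text scorer for students' written model
# descriptions. The algorithm (TF-IDF + logistic regression) and all data
# are illustrative stand-ins, not the study's actual scoring models.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A small hypothetical training set of human-scored written descriptions
# (score levels 0-2 on a rubric).
train_texts = [
    "the particles move faster when heated and spread farther apart",
    "energy transfers to the molecules, which increases their motion",
    "the water disappears because it gets hot",
    "the liquid turns into gas near the surface",
    "it just goes away",
    "nothing happens to the particles",
]
train_scores = [2, 2, 1, 1, 0, 0]

scorer = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # word and bigram features
    LogisticRegression(max_iter=1000),     # multi-class classifier
)
scorer.fit(train_texts, train_scores)

# Score new, unscored responses; in practice, agreement with human scores on a
# held-out validation set would be checked (e.g., with Cohen's kappa).
new_texts = ["heating makes the molecules move faster and spread out"]
print(scorer.predict(new_texts))
```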
  6. Abstract

    Argumentation is fundamental to science education, both as a prominent feature of scientific reasoning and as an effective mode of learning, a perspective reflected in contemporary frameworks and standards. The successful implementation of argumentation in school science, however, requires a paradigm shift in science assessment from the measurement of knowledge and understanding to the measurement of performance and knowledge in use. Performance tasks requiring argumentation must capture the many ways students can construct and evaluate arguments in science, yet such tasks are both expensive and resource-intensive to score. In this study we explore how machine learning text classification techniques can be applied to develop efficient, valid, and accurate constructed-response measures of students' competency with written scientific argumentation that are aligned with a validated argumentation learning progression. Data come from 933 middle school students in the San Francisco Bay Area and are based on three sets of argumentation items in three different science contexts. The findings demonstrate that we have been able to develop computer scoring models that can achieve substantial to almost perfect agreement between human-assigned and computer-predicted scores. Model performance was slightly weaker for harder items targeting higher levels of the learning progression, largely due to the linguistic complexity of these responses and the sparsity of higher-level responses in the training data set. Comparing the efficacy of different scoring approaches revealed that breaking down students' arguments into multiple components (e.g., the presence of an accurate claim or providing sufficient evidence), developing computer models for each component, and combining scores from these analytic components into a holistic score produced better results than holistic scoring approaches. However, this analytical approach was found to be differentially biased when scoring responses from English learner (EL) students as compared to responses from non-EL students on some items. Differences in severity between human and computer scores for EL students across these approaches are explored, and potential sources of bias in automated scoring are discussed.

     
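The comparison above favors scoring analytic components separately and then combining them into a holistic level. The sketch below shows what such a combination step could look like; the component names and the mapping rule are hypothetical, since the abstract does not specify the actual rubric.

```python
# Hypothetical sketch of combining analytic component scores into a holistic
# argumentation level. The components and the mapping rule are illustrative;
# the study's actual rubric is not reproduced here.
from typing import Dict

def holistic_score(components: Dict[str, int]) -> int:
    """Combine binary analytic component scores into a 0-3 holistic level.

    Components (all hypothetical): accurate_claim, sufficient_evidence,
    reasoning_links_evidence_to_claim.
    """
    if not components.get("accurate_claim", 0):
        return 0
    level = 1
    if components.get("sufficient_evidence", 0):
        level = 2
        if components.get("reasoning_links_evidence_to_claim", 0):
            level = 3
    return level

# In the analytic approach, each component score would come from its own
# trained classifier before being combined.
example = {
    "accurate_claim": 1,
    "sufficient_evidence": 1,
    "reasoning_links_evidence_to_claim": 0,
}
print(holistic_score(example))  # -> 2
```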
  7. Abstract

    This study develops a framework to conceptualize the use and evolution of machine learning (ML) in science assessment. We systematically reviewed 47 studies that applied ML in science assessment and classified them into five categories: (a) constructed response, (b) essay, (c) simulation, (d) educational game, and (e) inter-discipline. We compared the ML-based and conventional science assessments and extracted 12 critical characteristics to map three variables in a three-dimensional framework: construct, functionality, and automaticity. The 12 characteristics, used to construct a profile of ML-based science assessment for each article, were further analyzed by a two-step cluster analysis. The clusters identified for each variable were summarized into four levels to illustrate the evolution of each. We further conducted cluster analysis to identify four classes of assessment across the three variables. Based on the analysis, we conclude that ML has transformed, but not yet redefined, conventional science assessment practice in terms of fundamental purpose, the nature of the science assessment, and the relevant assessment challenges. Along with the three-dimensional framework, we propose five anticipated trends for incorporating ML in science assessment practice for future studies: addressing developmental cognition, changing the process of educational decision making, personalized science learning, borrowing 'good' to advance 'good', and integrating knowledge from other disciplines into science assessment.

     
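The review above profiles each study on 12 characteristics and then groups the profiles with a two-step cluster analysis. As a loose approximation only, the sketch below clusters a hypothetical binary profile matrix with k-means; neither the data nor the algorithm choice reflects the actual analysis.

```python
# Hypothetical sketch: clustering studies by their assessment-characteristic
# profiles. KMeans is a simple stand-in for the two-step cluster analysis
# described in the abstract; the profile matrix is invented.
import numpy as np
from sklearn.cluster import KMeans

# Rows = reviewed studies, columns = binary characteristic indicators
# (e.g., "scores constructed responses", "uses simulation logs", ...).
rng = np.random.default_rng(1)
profiles = rng.integers(0, 2, size=(47, 12))   # 47 studies x 12 characteristics

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(profiles)
print("cluster sizes:", np.bincount(kmeans.labels_))
```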