Title: Employing automatic analysis tools aligned to learning progressions to assess knowledge application and support learning in STEM
Abstract: We discuss transforming STEM education using three aspects: learning progressions (LPs), constructed-response performance assessments, and artificial intelligence (AI). Using LPs to inform instruction, curriculum, and assessment design helps foster students’ ability to apply content and practices to explain phenomena, which reflects deeper science understanding. To measure progress along these LPs, performance assessments that combine elements of disciplinary ideas, crosscutting concepts, and practices are needed. However, these tasks are time-consuming and expensive to score and to provide feedback for. AI makes it possible to validate the LPs and evaluate performance assessments for many students quickly and efficiently. The evaluation provides a report describing student progress along the LP and the supports needed to attain a higher LP level. We suggest using unsupervised and semi-supervised machine learning (ML) and generative AI (GAI) at early LP validation stages to identify relevant proficiency patterns and start building an LP. We further suggest employing supervised ML and GAI to develop targeted LP-aligned performance assessments for more accurate performance diagnosis at advanced LP validation stages. Finally, we discuss employing AI to design automatic feedback systems that provide personalized feedback to students and help teachers implement LP-based learning. We discuss the challenges of realizing these tasks and propose future research avenues.
Award ID(s):
2200757
PAR ID:
10554278
Publisher / Repository:
Springer Science + Business Media
Journal Name:
International Journal of STEM Education
Volume:
11
Issue:
1
ISSN:
2196-7822
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Involving students in scientific modeling practice is one of the most effective approaches to achieving the next generation science education learning goals. Given the complexity and multirepresentational features of scientific models, scoring student-developed models is time- and cost-intensive, and it remains one of the most challenging assessment practices in science education. More importantly, teachers who rely on timely feedback to plan and adjust instruction are reluctant to use modeling tasks because they cannot provide timely feedback to learners. This study utilized machine learning (ML), the most advanced artificial intelligence (AI), to develop an approach to automatically score student-drawn models and their written descriptions of those models. We developed six modeling assessment tasks for middle school students that integrate disciplinary core ideas and crosscutting concepts with the modeling practice. For each task, we asked students to draw a model and write a description of that model, which gave students with diverse backgrounds an opportunity to represent their understanding in multiple ways. We then collected student responses to the six tasks and had human experts score a subset of those responses. We used the human-scored student responses to develop ML algorithmic models (AMs) and to train the computer. Validation using new data suggests that the machine-assigned scores achieved robust agreement with human consensus scores. Qualitative analysis of student-drawn models further revealed five characteristics that might impact machine scoring accuracy: alternative expression, confusing label, inconsistent size, inconsistent position, and redundant information. We argue that these five characteristics should be considered when developing machine-scorable modeling tasks.
  2. Argumentation, a key scientific practice presented in the Framework for K-12 Science Education, requires students to construct and critique arguments, but timely evaluation of arguments in large-scale classrooms is challenging. Recent work has shown the potential of automated scoring systems for open response assessments, leveraging machine learning (ML) and artificial intelligence (AI) to aid the scoring of written arguments in complex assessments. Moreover, research has emphasized that the features (i.e., complexity, diversity, and structure) of the assessment construct are critical to ML scoring accuracy, yet how the assessment construct may be associated with machine scoring accuracy remains unknown. This study investigated how the features associated with the assessment construct of a scientific argumentation assessment item affected machine scoring performance. Specifically, we conceptualized the construct in three dimensions: complexity, diversity, and structure. We employed human experts to code characteristics of the assessment tasks and score middle school student responses to 17 argumentation tasks aligned to three levels of a validated learning progression of scientific argumentation. We randomly selected 361 responses to use as training sets to build machine-learning scoring models for each item. The scoring models yielded a range of agreements with human consensus scores, measured by Cohen’s kappa (mean = 0.60; range 0.38 − 0.89), indicating good to almost perfect performance. We found that higher levels of Complexity and Diversity of the assessment task were associated with decreased model performance; similarly, the relationship between levels of Structure and model performance showed a somewhat negative linear trend. These findings highlight the importance of considering these construct characteristics when developing ML models for scoring assessments, particularly for higher complexity items and multidimensional assessments.
  3. To address the increasing demand for AI literacy, we introduced a novel active learning approach that leverages both teaching assistants (TAs) and generative AI to provide feedback during in-class exercises. This method was evaluated through two studies in separate Computer Science courses, focusing on the roles and impacts of TAs in this learning environment, as well as their collaboration with ChatGPT in enhancing student feedback. The studies revealed that TAs were effective in accurately determining students’ progress and struggles, particularly in areas such as “backtracking”, where students faced significant challenges. This intervention’s success was evident from high student engagement and satisfaction levels, as reported in an end-of-semester survey. Further findings highlighted that while TAs provided detailed technical assessments and identified conceptual gaps effectively, ChatGPT excelled in presenting clarifying examples and offering motivational support. Despite some TAs’ resistance to fully embracing the feedback guidelines (specifically, their reluctance to provide encouragement), the collaborative feedback process between TAs and ChatGPT improved the quality of feedback in several aspects, including technical accuracy and clarity in explaining conceptual issues. These results suggest that integrating human and artificial intelligence in educational settings can significantly enhance traditional teaching methods, creating a more dynamic and responsive learning environment. Future research will aim to improve both the quality and efficiency of feedback, capitalizing on the unique strengths of both humans and AI to further advance educational practices in the field of computing.
  4. In response to Li, Reigh, He, and Miller's commentary, Can we and should we use artificial intelligence for formative assessment in science, we argue that artificial intelligence (AI) is already being widely employed in formative assessment across various educational contexts. While agreeing with Li et al.'s call for further studies on equity issues related to AI, we emphasize the need for science educators to adapt to the AI revolution that has outpaced the research community. We challenge the somewhat restrictive view of formative assessment presented by Li et al., highlighting the significant contributions of AI in providing formative feedback to students, assisting teachers in assessment practices, and aiding in instructional decisions. We contend that AI‐generated scores should not be equated with the entirety of formative assessment practice; no single assessment tool can capture all aspects of student thinking and backgrounds. We address concerns raised by Li et al. regarding AI bias and emphasize the importance of empirical testing and evidence‐based arguments in referring to bias. We assert that AI‐based formative assessment does not necessarily lead to inequity and can, in fact, contribute to more equitable educational experiences. Furthermore, we discuss how AI can facilitate the diversification of representational modalities in assessment practices and highlight the potential benefits of AI in saving teachers’ time and providing them with valuable assessment information. We call for a shift in perspective, from viewing AI as a problem to be solved to recognizing its potential as a collaborative tool in education. We emphasize the need for future research to focus on the effective integration of AI in classrooms, teacher education, and the development of AI systems that can adapt to diverse teaching and learning contexts. We conclude by underlining the importance of addressing AI bias, understanding its implications, and developing guidelines for best practices in AI‐based formative assessment.
  5. Artificial Intelligence (AI) enhanced systems are widely adopted in post-secondary education; however, tools and activities have only recently become accessible for teaching AI and machine learning (ML) concepts to K-12 students. Research on K-12 AI education has largely covered student attitudes toward AI careers, AI ethics, and student use of various existing AI agents such as voice assistants, and most of it has focused on high school and middle school. There is no consensus on which AI and ML concepts are grade-appropriate for elementary-aged students or how elementary students explore and make sense of AI and ML tools. AI is a rapidly evolving technology, and as future decision-makers, children will need to be AI literate [1]. In this paper, we present elementary students’ sense-making of simple machine-learning concepts. Through this project, we hope to generate a new model for introducing AI concepts to elementary students within school curricula and to provide tangible, trainable representations of ML for students to explore in the physical world. In our first year, our focus has been on simpler machine learning algorithms. Our desire is to empower students not only to use AI tools but also to understand how they operate. We believe that appropriate activities can help late elementary-aged students develop foundational AI knowledge, namely (1) how a robot senses the world, and (2) how a robot represents data for making decisions. Educational robotics programs have been repeatedly shown to result in positive learning impacts and increased interest [2]. In this pilot study, we leveraged the LEGO® Education SPIKE™ Prime for introducing ML concepts to upper elementary students. Through pilot testing in three one-week summer programs, we iteratively developed a limited display interface for supervised learning using the nearest neighbor algorithm. We collected videos to perform a qualitative evaluation. Based on an analysis of student behavior and the process of students trained in robotics, we found that some students showed interest in exploring pre-trained ML models and training new models while building personally relevant robotic creations and developing solutions to engineering tasks. While students were interested in using the ML tools for complex tasks, they seemed to prefer block programming or manual motor controls where they felt it was practical.