Title: Potential pitfalls of false positives
Automated writing evaluation (AWE) systems automatically assess students' writing and provide feedback on it. Despite the learning benefits, students may not effectively interpret and utilize AI-generated feedback, thereby failing to maximize their learning outcomes. A closely related issue is the accuracy of these systems, which students may not realize are imperfect. Our study investigates whether students differentially addressed false positive and false negative errors in AI-generated feedback on their science essays. We found that students addressed nearly all the false negative feedback; however, they addressed less than one-fourth of the false positive feedback. The odds of addressing false positive feedback were 99% lower than the odds of addressing false negative feedback, representing significant missed opportunities for revision and learning. We discuss the implications of these findings in the context of students' learning.
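The reported effect can be read as an odds ratio. As a minimal sketch with hypothetical counts chosen to match the abstract's description (the study's actual counts are not reported here), "99% lower odds" arises as follows:

```python
# Illustrative odds-ratio calculation with hypothetical counts; the
# specific numbers below are assumptions, not the study's data.
fn_addressed, fn_ignored = 95, 5    # nearly all false negatives addressed
fp_addressed, fp_ignored = 20, 80   # under one-fourth of false positives addressed

odds_fn = fn_addressed / fn_ignored   # odds of addressing a false negative
odds_fp = fp_addressed / fp_ignored   # odds of addressing a false positive
odds_ratio = odds_fp / odds_fn        # ratio of the two odds
reduction = 1 - odds_ratio            # fraction by which the odds are lower
```

With these hypothetical counts, the odds ratio is roughly 0.013, i.e. the odds of addressing a false positive are about 99% lower than for a false negative.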
Award ID(s):
2010483
PAR ID:
10515210
Publisher / Repository:
International Conference on Artificial Intelligence in Education 2024
Date Published:
Journal Name:
International Conference on Artificial Intelligence in Education 2024 Proceedings
Subject(s) / Keyword(s):
AI Accuracy, Automated Feedback, Science Writing.
Format(s):
Medium: X
Location:
Recife, Brazil
Sponsoring Org:
National Science Foundation
More Like this
  1. Providing rich, constructive feedback to students is essential for supporting and enhancing their learning. Recent advancements in Generative Artificial Intelligence (AI), particularly with large language models (LLMs), present new opportunities to deliver scalable, repeatable, and instant feedback, effectively making abundant a resource that has historically been scarce and costly. From a technical perspective, this approach is now feasible due to breakthroughs in AI and Natural Language Processing (NLP). While the potential educational benefits are compelling, implementing these technologies also introduces a host of ethical considerations that must be thoughtfully addressed. One of the core advantages of AI systems is their ability to automate routine and mundane tasks, potentially freeing up human educators for more nuanced work. However, the ease of automation risks a “tyranny of the majority”, where the diverse needs of minority or unique learners are overlooked, as they may be harder to systematize and less straightforward to accommodate. Ensuring inclusivity and equity in AI-generated feedback, therefore, becomes a critical aspect of responsible AI implementation in education. The process of developing machine learning models that produce valuable, personalized, and authentic feedback also requires significant input from human domain experts. Decisions around whose expertise is incorporated, how it is captured, and when it is applied have profound implications for the relevance and quality of the resulting feedback. Additionally, the maintenance and continuous refinement of these models are necessary to adapt feedback to evolving contextual, theoretical, and student-related factors. Without ongoing adaptation, feedback risks becoming obsolete or mismatched with the current needs of diverse student populations. 
Addressing these challenges is essential not only for ethical integrity but also for building the operational trust needed to integrate AI-driven systems as valuable tools in contemporary education. Thoughtful planning and deliberate choices are needed to ensure that these solutions truly benefit all students, allowing AI to support an inclusive and dynamic learning environment. 
  2. Students are often tasked with activities that require them to learn skills tangential to the learning outcomes of a course, such as learning new software. The issue is that instructors may not have the time or the expertise to help students with such tangential learning. In this paper, we explore how AI-generated feedback can provide assistance. Specifically, we study this technology in the context of a constructionist curriculum where students learn about experimental research through the creation of a gamified experiment. The AI-generated feedback provides a formative assessment of the narrative design of student-designed gamified experiments, which is important for creating an engaging experience. We find that students critically engaged with the feedback, but that responses varied among students. We discuss the implications for AI-generated feedback systems for tangential learning.
  3. This exploratory study focuses on the use of ChatGPT, a generative artificial intelligence (GAI) tool, by undergraduate engineering students in lab report writing in the major. Literature addressing the impact of ChatGPT and AI on student writing suggests that such technologies can both support and limit students' composing and learning processes. Acknowledging the history of writing with technologies and writing as technology, the development of GAI warrants attention to pedagogical and ethical implications in writing-intensive engineering classes. This pilot study investigates how the use of ChatGPT impacts students' lab writing outcomes in terms of rhetorical knowledge, critical thinking and composing, knowledge of conventions, and writing processes. A group of undergraduate volunteers (n = 7) used ChatGPT to revise their original engineering lab reports, which had been written without ChatGPT. A comparative study was conducted between the original lab report samples and the revisions by directly assessing students' lab reports in gateway engineering lab courses. A focus group was conducted to learn about students' experiences and perspectives on ChatGPT in the context of engineering lab report writing. Using ChatGPT in the revision process could improve engineering students' lab report quality by enhancing their understanding of the lab report genre. At the same time, the use of ChatGPT also led students to include false claims, incorrect lab procedures, or overly broad statements, which are not valued in the engineering lab report genre.
  4. Hoadley, C; Wang, XC (Ed.)
    Helping students learn how to write is essential. However, students have few opportunities to develop this skill, since giving timely feedback is difficult for teachers. AI applications can provide quick feedback on students' writing, but ensuring accurate assessment can be challenging, since students' writing quality varies. We examined the impact of students' writing quality on the error rate of our natural language processing (NLP) system when assessing scientific content in initial and revised design essays. We also explored whether aspects of writing quality were linked to the number of NLP errors. Despite finding that students' revised essays differed significantly from their initial essays in a few ways, our NLP system's accuracy was similar across the two. Further, our multiple regression analyses showed that, overall, students' writing quality did not impact our NLP system's accuracy. This is promising in terms of ensuring that students with different writing skills get similarly accurate feedback.
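    The regression analysis described above can be sketched as an ordinary least-squares fit of NLP error counts on writing-quality measures. The predictors and data below are entirely hypothetical stand-ins for the study's actual measures, chosen only to illustrate the shape of such an analysis:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50
# hypothetical writing-quality predictors (stand-ins for the study's measures)
essay_length = rng.normal(300, 60, n)      # words per essay
spelling_err = rng.normal(0.02, 0.008, n)  # spelling errors per word
# hypothetical outcome: NLP assessment errors, generated independently of
# writing quality, mirroring the paper's null result
nlp_errors = rng.poisson(2, n).astype(float)

# ordinary least squares: nlp_errors ~ intercept + essay_length + spelling_err
X = np.column_stack([np.ones(n), essay_length, spelling_err])
coef, residuals, rank, _ = np.linalg.lstsq(X, nlp_errors, rcond=None)
```

    Under the paper's finding, the slope coefficients on the quality predictors would be statistically indistinguishable from zero; a full analysis would also report standard errors and p-values, which this sketch omits.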
  5. Brankov, Jovan G; Anastasio, Mark A (Ed.)
    Artificial intelligence (AI) tools are designed to improve the efficacy and efficiency of data analysis and interpretation by the human decision maker. However, we know little about the optimal ways to present AI output to providers. This study used radiology image interpretation with AI-based decision support to explore the impact of different forms of AI output on reader performance. Readers included 5 experienced radiologists and 3 radiology residents reporting on a series of COVID chest x-ray images. Four different forms of AI output (a one-word summary of the diagnosis: normal, mild, moderate, or severe; a probability graph; a heatmap; and a heatmap plus probability graph), plus a no-AI-feedback condition, were evaluated. Results reveal that most decisions regarding the presence or absence of COVID without AI were correct and overall remained unchanged across all types of AI outputs. Fewer than 1% of decisions that changed as a function of seeing the AI output were negative (true positive to false negative, or true negative to false positive) regarding the presence or absence of COVID, and about 1% were positive (false negative to true positive, or false positive to true negative). More complex output formats (e.g., a heatmap plus a probability graph) tended to increase reading time and the number of gaze transitions between the clinical image and the AI outputs, as revealed through eye tracking. The key to the success of AI tools in medical imaging will be to incorporate the human into the overall process to optimize and synergize the human-computer dyad, since at least for the foreseeable future, the human is and will be the ultimate decision maker. Our results demonstrate that the form of the AI output is important, as it can impact clinical decision making and efficiency.