-
Prompting LLMs for complex tasks (e.g., building a trip advisor chatbot) requires humans to clearly articulate customized requirements (e.g., “start the response with a tl;dr”). However, existing prompt engineering instruction often lacks focused training on requirement articulation and instead tends to emphasize increasingly automatable strategies (e.g., tricks like adding role-plays and “think step-by-step”). To address this gap, we introduce Requirement-Oriented Prompt Engineering (ROPE), a paradigm that focuses human attention on generating clear, complete requirements during prompting. We implement ROPE through an assessment and training suite that provides deliberate practice with LLM-generated feedback. In a randomized controlled experiment with 30 novices, ROPE significantly outperforms conventional prompt engineering training (20% vs. 1% gains), a gap that automatic prompt optimization cannot close. Furthermore, we demonstrate a direct correlation between the quality of input requirements and LLM outputs. Our work paves the way to empower more end-users to build complex LLM applications.
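
To make the contrast concrete, here is a minimal, hypothetical sketch of requirement-oriented prompting for the trip advisor example: requirements are written as explicit, checkable statements, so an LLM output can be scored against them. The prompt text and the `check_requirements` helper are illustrative assumptions, not artifacts from the paper.

```python
# Hypothetical sketch of the ROPE idea: requirements as explicit, checkable
# statements rather than prompt tricks. Names and checks are illustrative.

REQUIREMENT_PROMPT = """You are a trip advisor chatbot. Requirements:
1. Start every response with a "tl;dr:" summary line.
2. Recommend at most three destinations per reply.
3. Ask one clarifying question if the user's budget is unspecified.
"""

# Each articulated requirement doubles as a check on the model's output.
REQUIREMENTS = {
    "starts with tl;dr": lambda r: r.lstrip().lower().startswith("tl;dr"),
    "at most three suggestions": lambda r: r.count("\n- ") <= 3,
}

def check_requirements(response: str) -> dict:
    """Score an LLM response against each articulated requirement."""
    return {name: check(response) for name, check in REQUIREMENTS.items()}

if __name__ == "__main__":
    sample = "tl;dr: Kyoto in autumn.\n- Kyoto\n- Nara"
    print(check_requirements(sample))  # {'starts with tl;dr': True, ...}
```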
-
Parsons problems are drag-and-drop computer programming puzzles that require learners to place code blocks in the correct order and, sometimes, indentation. Introductory computer programming instructors use them to teach novice programmers how to code while optimizing problem-solving efficiency and cognitive load. While there is research on the design of Parsons problems for programmers without disabilities and for programmers with visual or motor impairments, research regarding their accessibility for programmers with cognitive disabilities is scant. To identify the accessibility barriers and benefits of Parsons problems for neurodiverse programmers, an exploratory multiple-case study was conducted. Participants were asked to read eight chapters of an interactive eBook on Python and to solve Parsons problems. Within-case analyses of 15 retrospective think-aloud interviews with five novice programmers with disabilities led to four recommendations for improving the cognitive accessibility of Parsons problems. For example, programmers with seizure disorders may experience seizures when solving programming problems that require numeric calculations. Hence, creating a range of Parsons problems that do not require mental arithmetic could lower cognitive load and improve the learning experience for programmers with seizure disorders and for those who struggle with mental calculations. Given its qualitative and exploratory approach, this study does not offer conclusive, broadly generalizable results. Yet it reveals detailed and promising avenues for exploration in computing education research that might elude many quantitative techniques.
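
As an illustration, the sketch below (not taken from the study) renders a tiny Parsons problem in Python: the lines of a correct solution are shuffled, and a solved attempt must match both the order and the indentation of the original. The puzzle deliberately avoids mental arithmetic, in line with the recommendation above; the solution text and helper names are assumptions for illustration.

```python
import random

# Illustrative Parsons problem: shuffle the solution lines, then check
# whether a learner's reordering restores both order and indentation.

SOLUTION = [
    "def greet_all(names):",
    "    for name in names:",
    "        print('Hello, ' + name)",
]

def make_puzzle(lines, seed=1):
    """Return the solution's code blocks in a scrambled order."""
    blocks = lines[:]
    random.Random(seed).shuffle(blocks)
    return blocks

def is_solved(attempt):
    """An attempt is correct when both order and indentation match."""
    return attempt == SOLUTION

if __name__ == "__main__":
    for block in make_puzzle(SOLUTION):
        print(block)
    print("solved:", is_solved(SOLUTION))
```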
-
Large Language Models (LLMs) now excel at generative skills and can create content at remarkable speed. However, they are imperfect and still make a variety of mistakes. In a computer science education context, as these models are widely recognized as “AI pair programmers,” it becomes increasingly important to train students to evaluate and debug LLM-generated code. In this work, we introduce HypoCompass, a novel system that facilitates deliberate practice on debugging, in which human novices play the role of Teaching Assistants and help LLM-powered teachable agents debug code. We enable effective task delegation between students and LLMs in this learning-by-teaching environment: students focus on hypothesizing the cause of code errors, while adjacent skills like code completion are offloaded to LLM agents. Our evaluations demonstrate that HypoCompass generates high-quality training materials (e.g., bugs and fixes) four times more efficiently than human counterparts, and significantly improves students' debugging performance by 12% from pre- to post-test.
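
A sketch of what one HypoCompass-style training item might look like follows. The abstract only states that the system generates bugs and fixes, so the data shape, field names, and the example bug here are assumptions for illustration.

```python
# Hypothetical HypoCompass-style training item: a buggy function, the
# student's hypothesis about the cause, and the fix. The student's job is
# the hypothesis; writing the corrected code is offloaded to the LLM.

TRAINING_ITEM = {
    "buggy_code": (
        "def mean(xs):\n"
        "    return sum(xs) / len(xs) + 1\n"  # bug: stray '+ 1'
    ),
    "failing_test": "assert mean([2, 4]) == 3",
    # The human novice hypothesizes why the test fails.
    "hypothesis": "The function adds 1 after dividing, inflating the average.",
    # The corrected code, as an LLM agent might produce it.
    "fix": "def mean(xs):\n    return sum(xs) / len(xs)\n",
}

if __name__ == "__main__":
    for key, value in TRAINING_ITEM.items():
        print(f"--- {key} ---\n{value}")
```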