Generative AI models, specifically large language models (LLMs), have made strides towards the long-standing goal of text-to-code generation. This progress has invited numerous studies of user interaction. However, less is known about the struggles and strategies of non-experts, for whom each step of the text-to-code problem presents challenges: describing their intent in natural language, evaluating the correctness of generated code, and editing prompts when the generated code is incorrect. This paper presents a large-scale controlled study of how 120 beginning coders across three academic institutions approach writing and editing prompts. A novel experimental design allows us to target specific steps in the text-to-code process and reveals that beginners struggle with writing and editing prompts, even for problems at their skill level and when correctness is automatically determined. Our mixed-methods evaluation provides insight into student processes and perceptions, with key implications for non-expert Code LLM use within and outside of education.
Non-Expert Programmers in the Generative AI Future
Generative AI is rapidly transforming the practice of programming. At the same time, our understanding of who writes programs, for what purposes, and how they program has been evolving. By facilitating natural-language-to-code interactions, large language models for code have the potential to open up programming work to a broader range of workers. While existing work finds productivity benefits for expert programmers, interactions with non-experts are less well-studied. In this paper, we consider the future of programming for non-experts through a controlled study of 67 non-programmers. Our study reveals multiple barriers to effective use of large language models of code for non-experts, including several aspects of technical communication. Comparing our results to a prior study of beginning programmers illuminates the ways in which a traditional introductory programming class does and does not equip students to effectively work with generative AI. Drawing on our empirical findings, we lay out a vision for how to empower non-expert programmers to leverage generative AI for a more equitable future of programming.
- PAR ID: 10525385
- Publisher / Repository: ACM
- ISBN: 9798400710179
- Page Range / eLocation ID: 1 to 19
- Location: Newcastle upon Tyne, United Kingdom
- Sponsoring Org: National Science Foundation
More Like this
-
Expertise in programming traditionally assumes a binary novice-expert divide. Learning resources typically target programmers who are learning programming for the first time, or expert programmers in that language. An underrepresented yet important group of programmers are those who are experienced in one programming language but want to author code in a different one. For this scenario, we postulate that an effective form of feedback is presented as a transfer from concepts in the first language to the second. Current programming environments do not support this form of feedback. In this study, we apply the theory of learning transfer to teach a language that programmers are less familiar with, such as R, in terms of a programming language they already know, such as Python. We investigate learning transfer using a new tool called Transfer Tutor that presents explanations for R code in terms of the equivalent Python code. Our study found that participants leveraged learning transfer as a cognitive strategy, even when unprompted. Participants found Transfer Tutor useful across a number of affordances, such as stepping through and highlighting facts that may have been missed or misunderstood. However, participants were reluctant to accept facts without code execution, or sometimes had difficulty reading explanations that were verbose or complex. These results provide guidance for future designs and research directions that can support learning transfer when learning new programming languages.
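To make the idea of transfer feedback concrete, the minimal sketch below illustrates (in Python, with R appearing only in strings and comments) the kind of "transfer fact" such a tool might surface. The data and function names here are hypothetical illustrations, not Transfer Tutor's actual data or API.

```python
# Hypothetical sketch of "transfer facts" that explain R constructs in terms of
# familiar Python equivalents; illustrative only, not Transfer Tutor itself.

TRANSFER_FACTS = [
    {
        "r": "xs <- c(1, 2, 3)",
        "python": "xs = [1, 2, 3]",
        "note": "R's c() builds a vector, much like a Python list; <- assigns, like =.",
    },
    {
        "r": "for (x in xs) print(x)",
        "python": "for x in xs: print(x)",
        "note": "Both iterate directly over elements; R uses parentheses and braces, Python a colon and indentation.",
    },
]

def explain(r_snippet: str) -> str:
    """Explain a known R snippet in terms of its Python equivalent."""
    for fact in TRANSFER_FACTS:
        if fact["r"] == r_snippet:
            return f"{fact['r']}  ~  {fact['python']}\n  {fact['note']}"
    return "No transfer fact recorded for this snippet."

print(explain("xs <- c(1, 2, 3)"))
```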
-
Code LLMs have the potential to make it easier for non-experts to understand and write code. However, current Code LLM benchmarks rely on a single expert-written prompt per problem, making it hard to generalize their success to non-expert users. In this paper, we present a new natural-language-to-code benchmark of prompts written by a key population of non-experts: beginning programmers. StudentEval contains 1,749 prompts written by 80 students who have completed only one introductory Python course. StudentEval contains numerous non-expert prompts describing the same problem, enabling exploration of key factors in prompt success. We use StudentEval to evaluate 12 Code LLMs and find that StudentEval is a better discriminator of model performance than existing benchmarks. Our analysis of student prompting strategies reveals that nondeterministic LLM sampling can mislead students about the quality of their descriptions, a finding with key implications for Code LLMs in education.
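For readers unfamiliar with how prompt success is scored in such benchmarks, the following minimal sketch (hypothetical, not StudentEval's actual harness) shows how a function generated from a student prompt might be checked automatically against instructor-written test cases.

```python
# Minimal, hypothetical sketch of automatic correctness checking for generated
# code; a real benchmark harness would sandbox execution and handle timeouts.

def passes_tests(generated_code, func_name, tests):
    """Run generated_code and check func_name against (args, expected) pairs."""
    namespace = {}
    try:
        exec(generated_code, namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in tests)
    except Exception:
        return False

# Example completion a Code LLM might return for the prompt
# "return the larger of two numbers" (illustrative only).
completion = "def larger(a, b):\n    return a if a > b else b\n"
tests = [((1, 2), 2), ((5, 3), 5), ((-1, -4), -1)]
print(passes_tests(completion, "larger", tests))  # True
```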
-
Programming tools are increasingly integral to research and analysis in myriad domains, including specialized areas with no formal relation to computer science. Embedded domain-specific languages (eDSLs) have the potential to serve these programmers while placing relatively light implementation burdens on language designers. However, barriers to eDSL use reduce their practical value and adoption. In this paper, we aim to deepen our understanding of how programmers use eDSLs and identify user needs to inform future eDSL designs. We performed a contextual inquiry (9 participants) with domain experts using Mimi, an eDSL for climate change economics modeling. A thematic analysis identified five key themes, including: the interaction between the eDSL and the host language has significant and sometimes unexpected impacts on eDSL user experience, and users preferentially engage with domain-specific communities and code templates rather than host language resources. The needs uncovered in our study offer design considerations for future eDSLs and suggest directions for future DSL usability research.
-
Powered by recent advances in code-generating models, AI assistants like GitHub Copilot promise to change the face of programming forever. But what is this new face of programming? We present the first grounded theory analysis of how programmers interact with Copilot, based on observing 20 participants (with a range of prior experience using the assistant) as they solve diverse programming tasks across four languages. Our main finding is that interactions with programming assistants are bimodal: in acceleration mode, the programmer knows what to do next and uses Copilot to get there faster; in exploration mode, the programmer is unsure how to proceed and uses Copilot to explore their options. Based on our theory, we provide recommendations for improving the usability of future AI programming assistants.