Understanding Programming Students' Help-Seeking Preferences in the Era of Generative AI

Large Language Model (LLM) conversational agents are increasingly used in programming education, yet we still lack insight into how novices engage with them for conceptual learning compared with human tutoring. This mixed-methods study compared the learning outcomes and interaction strategies of novices using ChatGPT or human tutors. A controlled lab study with 20 students enrolled in introductory programming courses revealed that students employ markedly different interaction strategies with AI versus human tutors: ChatGPT users relied on brief, zero-shot prompts and received lengthy, context-rich responses but showed minimal prompt refinement, while those working with human tutors provided more contextual information and received targeted explanations. Although students distrusted ChatGPT's accuracy, they paradoxically preferred it for basic conceptual questions due to reduced social anxiety. We offer empirically grounded recommendations for developing AI literacy in computer science education and for designing learning-focused conversational agents that balance trust-building with the social safety that facilitates uninhibited inquiry.

Novice programming students frequently engage in help-seeking to find information and learn about programming concepts. Among the available resources, generative AI (GenAI) chatbots appear resourceful, widely accessible, and less intimidating than human tutors, and programming instructors are actively integrating these tools into classrooms. However, our understanding of how novice programming students trust GenAI chatbots, and of the factors influencing their usage, remains limited. To address this gap, we investigated the learning resource selection process of 20 novice programming students tasked with studying a programming topic. We split our participants into two groups: one using ChatGPT (n=10) and the other using a human tutor via Discord (n=10). We found that participants held strong positive perceptions of ChatGPT's speed and convenience but were wary of its inconsistent accuracy, making them reluctant to rely on it for learning entirely new topics. Accordingly, they generally preferred more trustworthy resources (e.g., instructors, tutors) for learning, reserving ChatGPT for low-stakes situations or more introductory and common topics. We conclude by offering guidance to instructors on integrating LLM-based chatbots into their curricula, emphasizing verification and situational use, and to developers on designing chatbots that better address novices' trust and reliability concerns.
- PAR ID: 10649530
- Publisher / Repository: ACM
- Page Range / eLocation ID: 15 to 21
- Sponsoring Org: National Science Foundation
More Like this
-
As generative artificial intelligence (GenAI) tools such as ChatGPT become more capable and accessible, their use in educational settings is likely to grow. However, the academic community lacks a comprehensive understanding of the perceptions and attitudes of students and instructors toward these new tools. In the Fall 2023 semester, we surveyed 982 students and 76 faculty at a large public university in the United States, focusing on topics such as perceived ease of use, ethical concerns, the impact of GenAI on learning, and differences in responses by role, gender, and discipline. We found that students and faculty did not differ significantly in their attitudes toward GenAI in higher education, except regarding ease of use, hedonic motivation, habit, and interest in exploring new technologies. Students and instructors also used GenAI for coursework or teaching at similar rates, although regular use of these tools was still low across both groups. Among students, we found significant differences in attitudes between males in STEM majors and females in non-STEM majors. These findings underscore the importance of considering demographic and disciplinary diversity when developing policies and practices for integrating GenAI in educational contexts, as GenAI may influence learning outcomes differently across various groups of students. This study contributes to the broader understanding of how GenAI can be leveraged in higher education while highlighting potential areas of inequality that need to be addressed as these tools become more widely used.
-
Introduction: The emergence and widespread adoption of generative AI (GenAI) chatbots such as ChatGPT, and programming assistants such as GitHub Copilot, have radically redefined the landscape of programming education. This calls for replication of studies and reexamination of findings from pre-GenAI CS contexts to understand the impact on students. Objectives: Achievement goals are well studied in computing education and can be predictive of student interest and exam performance. The objective of this study is to compare findings from prior achievement goal studies in CS1 courses with new CS1 courses that emphasize human-GenAI collaborative coding. Methods: In a CS1 course that integrates GenAI, we use linear regression to explore the effects of achievement goals and prior experience on student interest, exam performance, and perceptions of GenAI (see the sketch after this list). Results: As with prior findings in traditional CS1 classes, mastery goals are correlated with interest in computing. Contradicting prior CS1 findings, normative goals are correlated with exam scores. Normative and mastery goals correlate with students' perceptions of learning with GenAI. Mastery goals weakly correlate with reading and testing code output from GenAI.
-
Social anxiety (SA) has become increasingly prevalent, and traditional coping strategies often face accessibility challenges. Generative AI (GenAI) chatbots, known for their knowledgeable and conversational capabilities, are emerging as alternative tools for mental well-being. With the increased integration of GenAI, it is important to examine individuals' attitudes toward, and trust in, GenAI chatbots' support for SA. Through a mixed-method approach that involved surveys (n=159) and interviews (n=17), we found that individuals with severe symptoms tended to trust and embrace GenAI chatbots more readily, valuing their non-judgmental support and perceived emotional comprehension, whereas those with milder symptoms prioritized technical reliability. We identified factors influencing trust, such as GenAI chatbots' ability to generate empathetic responses and their context-sensitive limitations, which were particularly important among individuals with SA. We also discuss design implications for GenAI chatbots that foster cognitive and emotional trust, along with practical considerations.
-
Background: Chatbots are being piloted to draft responses to patient questions, but patients' ability to distinguish between provider and chatbot responses, and patients' trust in chatbots' functions, are not well established. Objective: This study aimed to assess the feasibility of using ChatGPT (Chat Generative Pre-trained Transformer) or a similar artificial intelligence-based chatbot for patient-provider communication. Methods: A survey study was conducted in January 2023. Ten representative, nonadministrative patient-provider interactions were extracted from the electronic health record. Patients' questions were entered into ChatGPT with a request for the chatbot to respond using approximately the same word count as the human provider's response. In the survey, each patient question was followed by a provider- or ChatGPT-generated response. Participants were informed that 5 responses were provider generated and 5 were chatbot generated. Participants were asked, and financially incentivized, to correctly identify the response source. Participants were also asked about their trust in chatbots' functions in patient-provider communication, using a 5-point Likert scale. Results: A US-representative sample of 430 study participants aged 18 and older was recruited on Prolific, a crowdsourcing platform for academic studies. In all, 426 participants filled out the full survey. After removing participants who spent less than 3 minutes on the survey, 392 respondents remained. Overall, 53.3% (209/392) of respondents analyzed were women, and the average age was 47.1 (range 18-91) years. The correct classification of responses ranged from 49% (192/392) to 85.7% (336/392) across questions. On average, chatbot responses were identified correctly in 65.5% (1284/1960) of cases, and human provider responses in 65.1% (1276/1960) of cases. Patients' trust in chatbots' functions was weakly positive on average (mean Likert score 3.4 out of 5), with lower trust as the health-related complexity of the task in the questions increased. Conclusions: ChatGPT responses to patient questions were only weakly distinguishable from provider responses. Laypeople appear to trust the use of chatbots to answer lower-risk health questions. It is important to continue studying patient-chatbot interaction as chatbots move from administrative to more clinical roles in health care.
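The achievement-goals study (second item in the list above) mentions a linear-regression analysis relating goal orientations and prior experience to exam performance. As a non-authoritative illustration only, here is a minimal sketch of that kind of ordinary-least-squares analysis; the file name and column names (mastery, normative, prior_experience, exam_score) are hypothetical stand-ins for the study's actual measures, not taken from the paper.

```python
# Minimal sketch (not the paper's code): OLS regression of exam performance
# on achievement-goal scores and prior experience. All names are hypothetical.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("cs1_survey.csv")  # hypothetical survey data

# Predictors: goal-orientation scores plus prior programming experience.
X = sm.add_constant(df[["mastery", "normative", "prior_experience"]])
y = df["exam_score"]

model = sm.OLS(y, X).fit()
print(model.summary())  # coefficient signs and p-values show which goals predict exam scores
```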