Software testing is a critical skill for computing students, but learning and practicing testing can be challenging, particularly for beginners. A recent study suggests that a lightweight testing checklist containing testing strategies and tutorial information can help students write quality tests. However, students expressed a desire for more support in deciding what code and scenarios to test. Moreover, the potential costs and benefits of the testing checklist had not yet been examined in a classroom setting. To that end, we improved the checklist by integrating explicit testing strategies into it (the ETS Checklist), which provide step-by-step guidance on how to translate semantic information in assignment instructions into possible testing scenarios. In this paper, we report our experiences designing explicit strategies for unit testing and adopting the ETS Checklist as optional tool support in a CS1.5 course. Through quantitative and qualitative analysis of students' survey responses and lab assignment submissions, we discuss students' engagement with the ETS Checklists. Our results suggest that students who used the checklist intervention wrote significantly higher-quality test code, in terms of code coverage, than those who did not, especially on assignments earlier in the course. We also observed that students were often unaware of their need for help in writing high-quality tests.
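To make the idea of an explicit testing strategy concrete, the sketch below shows the kind of test suite such step-by-step guidance aims at: each clause of the assignment instructions becomes a typical case, a boundary case, and an invalid-input case. The example is purely illustrative and is not taken from the ETS Checklist; `letter_grade` is a hypothetical student-authored function, and the tests use Python's built-in `unittest` module.

```python
import unittest

def letter_grade(score):
    """Hypothetical function under test: map a 0-100 score to a letter grade."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    return "F"

class TestLetterGrade(unittest.TestCase):
    # Checklist-style scenarios derived from the written specification:

    def test_typical_value_in_each_range(self):
        self.assertEqual(letter_grade(95), "A")
        self.assertEqual(letter_grade(85), "B")
        self.assertEqual(letter_grade(75), "C")
        self.assertEqual(letter_grade(50), "F")

    def test_boundary_values_between_ranges(self):
        self.assertEqual(letter_grade(90), "A")   # "at least 90" boundary
        self.assertEqual(letter_grade(89), "B")
        self.assertEqual(letter_grade(100), "A")
        self.assertEqual(letter_grade(0), "F")

    def test_invalid_inputs_raise(self):
        with self.assertRaises(ValueError):
            letter_grade(-1)
        with self.assertRaises(ValueError):
            letter_grade(101)

if __name__ == "__main__":
    unittest.main()
```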
Check It Off: Exploring the Impact of a Checklist Intervention on the Quality of Student-authored Unit Tests
Software testing is an essential skill for computer science students. Prior work reports that students desire support in determining what code to test and which scenarios should be tested. In response, we present a lightweight testing checklist that contains both tutorial information and testing strategies to guide students in what and how to test. To assess the impact of the testing checklist, we conducted an experimental, controlled A/B study with 32 undergraduate and graduate students. The study task was to write a test suite for an existing program. Students were given either the testing checklist (the experimental group) or a tutorial on a standard coverage tool with which they were already familiar (the control group). By analyzing the combination of student-written tests and survey responses, we found that students with the checklist performed as well as or better than the coverage-tool group, suggesting a potential positive impact of the checklist (or, at minimum, a non-negative one). This is particularly noteworthy given that the coverage tool used in the control condition represents the state of the practice. These findings suggest that testing tool support does not need to be sophisticated to be effective.
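For comparison, the control condition's coverage tool gives feedback in terms of which statements and branches a test suite executes. Below is a minimal sketch of collecting that kind of report with coverage.py (one widely used coverage tool); the test-module name is a placeholder, not something used in the study.

```python
# Minimal sketch of gathering branch-coverage feedback for a student test suite.
# Assumes coverage.py is installed (pip install coverage); "test_letter_grade"
# is a placeholder module name for the student's tests.
import unittest
import coverage

cov = coverage.Coverage(branch=True)   # track branches as well as statements
cov.start()

suite = unittest.defaultTestLoader.loadTestsFromName("test_letter_grade")
unittest.TextTestRunner(verbosity=2).run(suite)

cov.stop()
cov.save()
cov.report(show_missing=True)          # per-file coverage with missed lines/branches
```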
- PAR ID: 10432813
- Date Published:
- Journal Name: Proceedings of the 27th ACM Conference on Innovation and Technology in Computer Science Education
- Volume: 1
- Page Range / eLocation ID: 276 to 282
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Although measuring held-out accuracy has been the primary approach to evaluate generalization, it often overestimates the performance of NLP models, while alternative approaches for evaluating models either focus on individual tasks or on specific behaviors. Inspired by principles of behavioral testing in software engineering, we introduce CheckList, a task-agnostic methodology for testing NLP models. CheckList includes a matrix of general linguistic capabilities and test types that facilitate comprehensive test ideation, as well as a software tool to generate a large and diverse number of test cases quickly. We illustrate the utility of CheckList with tests for three tasks, identifying critical failures in both commercial and state-of-the-art models. In a user study, a team responsible for a commercial sentiment analysis model found new and actionable bugs in an extensively tested model. In another user study, NLP practitioners with CheckList created twice as many tests, and found almost three times as many bugs, as users without it.
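As a concrete (and deliberately simplified) illustration of one CheckList test type, an invariance test perturbs inputs in ways that should not change a model's prediction. The sketch below does not use the CheckList library itself; `predict_sentiment` is a trivial stand-in for a real sentiment model.

```python
# Sketch of a CheckList-style invariance (INV) test: the predicted label should
# not change when neutral named entities are swapped in. `predict_sentiment` is
# a placeholder for the model under test, stubbed here with a keyword rule.
def predict_sentiment(text: str) -> str:
    return "positive" if "great" in text.lower() else "negative"

def invariance_test(template: str, fill_values):
    """Instantiate a template with several fill-ins and check the label is stable."""
    labels = {predict_sentiment(template.format(name=name)) for name in fill_values}
    return len(labels) == 1, labels

names = ["Alice", "Bob", "Priya", "Chen"]
ok, labels = invariance_test("The service {name} provided was great.", names)
print("INV test passed:" if ok else "INV test FAILED:", labels)
```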
Mastering the concept of distributed forces is vital for students who are pursuing a major involving engineering mechanics. Misconceptions related to distributed forces that are typically acquired in introductory physics courses should be corrected to increase student success in subsequent mechanics coursework. The goal of this study was to develop and assess a guided instructional activity using augmented reality (AR) technology to improve undergraduate engineering students' understanding of distributed forces. The AR app was accompanied by a complementary activity that guided and challenged students to model objects as beams with progressively increasing difficulty. The AR tool allowed students to (a) model a tabletop as a beam with multiple distributed forces, (b) visualize the free body diagram, and (c) compute the external support reactions. To assess the effectiveness of the activity, 43 students were allocated to control and treatment groups using an experimental nonequivalent-groups preactivity/postactivity test design. Of the 43 students, 35 participated in their respective activity. Students in the control group collaborated on traditional problem-solving, while those in the treatment group engaged in a guided activity using AR. Students' knowledge of distributed forces was measured using their scores on a 10-item test instrument. Analysis of covariance was used to analyze postactivity test scores while controlling for preactivity test scores. The treatment group demonstrated a significantly greater improvement in postactivity test scores than the control group. The measured effect size was 0.13, indicating that 13% of the total variance in the postactivity test scores can be attributed to the activity. Though the effect size was small, the results suggest that a guided AR activity can be more effective in improving student learning outcomes than traditional problem-solving.
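For readers outside mechanics, the statics idea the activity targets can be summarized with a standard textbook example (not taken from the study's materials): a uniformly distributed load is statically equivalent to a single resultant force acting at the centroid of the load, from which the support reactions of a simply supported beam follow.

```latex
% Simply supported beam of span L under a uniform load w (force per unit length).
% Resultant of the distributed load and its line of action:
R = \int_0^L w \, dx = wL, \qquad \bar{x} = \frac{L}{2}
% Equilibrium of the free body diagram (support A at x = 0, roller B at x = L):
\sum M_A = 0: \quad B_y L - wL \cdot \frac{L}{2} = 0 \;\Rightarrow\; B_y = \frac{wL}{2}
\sum F_y = 0: \quad A_y + B_y - wL = 0 \;\Rightarrow\; A_y = \frac{wL}{2}
```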
When instructors want to design programming assignments that motivate their students, a common design choice is to have students write code to make an artifact (e.g., apps, games, music, or images). The goal of this study is to understand the impacts of including artifact creation in a programming assignment on students' motivation, time on task, and cognitive load. To do so, we conducted a controlled lab study with seventy-three students from an introductory engineering course. The experimental group created a simulation they could interact with, thus having the full experience of artifact creation, while the control group wrote the exact same code but evaluated it only with test cases. We hypothesized that students who could interact with the simulation they were programming would be more motivated to complete the assignment and would report higher intrinsic motivation. However, we found no significant difference in motivation or cognitive load between the groups. Additionally, the experimental group spent more time completing the assignment than the control group. Our results suggest that artifact creation may not be necessary for motivating students in all contexts, and that artifact creation may have other effects such as increased time on task. Additionally, instructors and researchers should consider when, and in what contexts, artifact creation is beneficial and when it may not be.
The feedback provided by current testing education tools about the deficiencies in a student's test suite either mimics industry code coverage tools or lists specific instructor test cases that are missing from the student's test suite. While useful in some sense, these types of feedback are akin to revealing the solution to the problem, which can inadvertently encourage students to pursue a trial-and-error approach to testing rather than a more systematic approach that encourages learning. In addition to not teaching students why their test suite is inadequate, this type of feedback may lead students to become dependent on the feedback rather than thinking for themselves. To address this deficiency, there is an opportunity to investigate alternative feedback mechanisms that include positive reinforcement of testing concepts. We argue that using an inquiry-based learning approach is better than simply providing the answers. To facilitate this type of learning, we present Testing Tutor, a web-based assignment submission platform that supports different levels of testing pedagogy via a customizable feedback engine. We evaluated the impact of the different types of feedback through an empirical study in two sophomore-level courses. We used Testing Tutor to provide students with either traditional detailed code coverage feedback or inquiry-based conceptual feedback and compared the effects. The results show that students who received conceptual feedback had higher code coverage (by several measures), fewer redundant test cases, and higher programming grades than students who received traditional code coverage feedback.
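To illustrate the contrast that study draws, the sketch below juxtaposes coverage-style feedback (which reveals exactly what was missed) with conceptual, inquiry-based feedback (which prompts the student to reason about the untested behavior). The hint wording and the mapping from uncovered constructs to concepts are invented for illustration; this is not Testing Tutor's actual feedback engine.

```python
# Sketch contrasting two feedback styles for an inadequate test suite.
# The concept hints below are invented for illustration only.
CONCEPT_HINTS = {
    "else-branch": "Your tests never exercise the failure path. "
                   "What inputs would make the condition false?",
    "loop-zero-iterations": "Consider a scenario where the loop body never runs "
                            "(for example, an empty list).",
}

def coverage_feedback(uncovered_lines):
    """Traditional feedback: reveal exactly which lines were not executed."""
    return [f"Line {n} was not executed by your tests." for n in uncovered_lines]

def conceptual_feedback(uncovered_constructs):
    """Inquiry-based feedback: describe the missing testing concept, not the answer."""
    return [CONCEPT_HINTS.get(c, "Think about which behaviors remain untested.")
            for c in uncovered_constructs]

print(coverage_feedback([12, 13]))
print(conceptual_feedback(["else-branch", "loop-zero-iterations"]))
```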