
Title: An Experience Report on Introducing Explicit Strategies into Testing Checklists for Advanced Beginners
Software testing is a critical skill for computing students, but learning and practicing testing can be challenging, particularly for beginners. A recent study suggests that a lightweight testing checklist containing testing strategies and tutorial information can assist students in writing quality tests. However, students expressed a desire for more support in knowing how to test the code or scenario, and the potential costs and benefits of the testing checklist had not yet been examined in a classroom setting. To that end, we improved the checklist by integrating explicit testing strategies into it (the ETS Checklist), which provide step-by-step guidance on how to transfer semantic information from assignment instructions to possible testing scenarios. In this paper, we report our experiences in designing explicit strategies for unit testing and in adopting the ETS Checklist as optional tool support in a CS1.5 course. Drawing on quantitative and qualitative analysis of students' survey responses and lab assignment submissions, we discuss students' engagement with the ETS Checklists. Our results suggest that students who used the checklist intervention wrote significantly higher-quality test code, in terms of code coverage, than those who did not, especially for assignments earlier in the course. We also observed that students were often unaware of their need for help in writing high-quality tests.
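To give a flavor of what a strategy-guided unit test might look like, here is a minimal sketch in Java with JUnit 5. The VowelCounter class, the test scenarios, and the checklist wording are illustrative assumptions for this sketch, not taken from the paper:

    import static org.junit.jupiter.api.Assertions.assertEquals;
    import org.junit.jupiter.api.Test;

    // Hypothetical code under test, included so the sketch is self-contained.
    class VowelCounter {
        static int countVowels(String s) {
            int count = 0;
            for (char c : s.toLowerCase().toCharArray()) {
                if ("aeiou".indexOf(c) >= 0) count++;
            }
            return count;
        }
    }

    class VowelCounterTest {
        // Checklist-style step: test the typical scenario the instructions describe.
        @Test
        void countsVowelsInOrdinaryWord() {
            assertEquals(2, VowelCounter.countVowels("tested")); // e, e
        }

        // Checklist-style step: derive a boundary scenario (empty input) from the spec.
        @Test
        void returnsZeroForEmptyString() {
            assertEquals(0, VowelCounter.countVowels(""));
        }
    }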
Award ID(s):
2141923 1749936
NSF-PAR ID:
10432810
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education
Volume:
1
Page Range / eLocation ID:
194 to 200
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Software testing is an essential skill for computer science students. Prior work reports that students desire support in determining what code to test and which scenarios should be tested. In response, we present a lightweight testing checklist that contains both tutorial information and testing strategies to guide students in what and how to test. To assess the impact of the testing checklist, we conducted an experimental, controlled A/B study with 32 undergraduate and graduate students. The study task was writing a test suite for an existing program. Students were given either the testing checklist (the experimental group) or a tutorial on a standard coverage tool with which they were already familiar (the control group). By analyzing the combination of student-written tests and survey responses, we found that students with the checklist performed as well as or better than the coverage tool group, suggesting a potential positive impact of the checklist (or, at minimum, a non-negative one). This is particularly noteworthy given that the control condition, a familiar coverage tool, represents the state of the practice. These findings suggest that testing tool support need not be sophisticated to be effective.
  2. Flaky tests are a source of frustration and uncertainty for developers. In an educational environment, flaky tests can create doubts related to software behavior and student grades, especially when the grades depend on tests passing. NC State University's junior-level software engineering course models industrial practice through team-based development and testing of new features on a large electronic health record (EHR) system, iTrust2. Students are expected to maintain and supplement an extensive suite of UI tests using Selenium WebDriver. Team builds are run on the course's continuous integration (CI) infrastructure. Students report, and we confirm, that tests that pass on one build will inexplicably fail on the next, impacting productivity and confidence in code quality and the CI system. The goal of this work is to find and fix the sources of flaky tests in iTrust2. We analyze configurations of Selenium using different underlying web browsers and timeout strategies (waits) for both test stability and runtime performance. We also consider underlying hardware and operating systems. Our results show that HtmlUnit with Thread waits provides the lowest number of test failures and best runtime on poor-performing hardware. When given more resources (e.g., more memory and a faster CPU), Google Chrome with Angular waits is less flaky and faster than HtmlUnit, especially if the browser instance is not restarted between tests. The outcomes of this research are a more stable and substantially faster teaching application and a recommendation on how to configure Selenium for applications similar to iTrust2 that run in a CI environment. 
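    To make the two wait strategies concrete, here is a minimal sketch in Java for Selenium WebDriver 4, contrasting a fixed Thread wait with an explicit, condition-based wait. The URL, locator, and timeout are illustrative assumptions; the study's "Angular waits" additionally poll the front-end framework for readiness rather than waiting on a single element:

        import java.time.Duration;
        import org.openqa.selenium.By;
        import org.openqa.selenium.WebDriver;
        import org.openqa.selenium.chrome.ChromeDriver;
        import org.openqa.selenium.support.ui.ExpectedConditions;
        import org.openqa.selenium.support.ui.WebDriverWait;

        class WaitStrategies {
            public static void main(String[] args) throws InterruptedException {
                WebDriver driver = new ChromeDriver();
                driver.get("https://example.com/login"); // illustrative URL

                // Thread wait: sleep a fixed time and hope the page is ready.
                // Simple, but slow when the page loads fast and flaky when it loads slowly.
                Thread.sleep(2000);

                // Explicit wait: poll until a concrete condition holds, up to a cap.
                // Proceeds as soon as the element appears; fails only at the timeout.
                WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
                wait.until(ExpectedConditions.visibilityOfElementLocated(By.id("submit")));

                driver.quit();
            }
        }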
  3. When asked how they deal with unforeseen problems, novice learners often describe a process of "trial and error." That process might fairly be described as iteration, a critical step in the design process, but it falls short of the practices that engineering education needs to develop. In the face of novel and multifaceted problems, future engineers must be comfortable and competent not just trying again, but identifying failure points, troubleshooting, and running systematic tests with relevant data. To examine the abilities of novice designers to test and effectively refine ideas and prototypes, we conducted qualitative analysis of structured interviews, audio, video, and designs of 11 girls, ages 9–11, working on computational papercrafts as part of a museum-based STEAM summer camp. The projects involved the design and construction of expressive paper and cardboard sculptures with gears and linkages powered by servomotors. Over the course of one day, the girls generated designs inspired by a camp theme, then had to work with mechanics, electronics, and craft to create working versions that would be displayed as part of a public exhibit. Computational papercraft was selected because it lowers both cost and intimidation. Our design conjecture was that by making materials familiar and abundant, learners would have more relevant knowledge, could easily modify and replicate components, and would therefore be better able to recognize potential faults and more likely to engage in testing and refinement. We also supported design and troubleshooting with a customized circuit board and an online gear simulator. In the first stage of this study, we looked at what engineering practices emerged under these conditions. We asked: What opportunities for testing and refinement did computational papercrafts open up? What resources and tools do young learners employ when testing and refining designs? Analysis showed that the technical supports succeeded in fostering valued testing and refinement practices as youth pursued personal goals. Use of the simulator and customized microcontroller allowed for consideration of multiple alternatives and for "trial before error." Learners were able to conduct focused tests on subsystems of their paper machines and to make "small bets," keeping initial ideas and designs fluid. Inexpensive materials also allowed them to test and refine at late project stages without feeling that they were wasting time or materials. The analysis sheds light on young students' practices of testing and refinement and on how best to support young people as they begin learning trajectories in engineering. The approach is especially relevant within making-oriented engineering education and other settings working to broaden participation in engineering.
  4. In this paper, we explore using Parsons problems to scaffold novice programmers who are struggling while solving write-code problems. Parsons problems, in which students put mixed-up code blocks in order, can be created quickly and already serve thousands of students, while other types of programming support are expensive to develop or do not scale. We conducted two studies in which novices were given equivalent Parsons problems as optional scaffolding while solving write-code problems. We investigated when, why, and how students used the Parsons problems, as well as their perceptions of the benefits and challenges. A think-aloud observational study with 11 undergraduate students showed that students used the Parsons problem before writing a solution to get ideas about where to start; while writing a solution when they were stuck; and after writing a solution to debug errors and look for better strategies. Semi-structured interviews with the same 11 undergraduate students provided evidence that using Parsons problems to scaffold write-code problems helped students reduce the difficulty, reduce problem completion time, learn problem-solving strategies, and refine their programming knowledge. However, some students found them less useful if the Parsons solution did not match their approach or if they did not understand the solution. We then conducted a between-subjects classroom study with 81 undergraduate students to investigate the effects on learning. We found that students who received Parsons problems as scaffolding during write-code problems spent significantly less time solving those problems. However, there was no significant learning gain in either condition from pretest to posttest. We also discuss the design implications of our findings.
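    For readers unfamiliar with the format, here is a minimal invented illustration of a Parsons problem in Java; the task and blocks are assumptions for this sketch, not taken from the studies. Students receive shuffled blocks and must arrange them into a working solution:

        class ParsonsExample {
            // Shuffled blocks as presented to the student:
            //   (a) return sum;
            //   (b) int sum = 0;
            //   (c) for (int x : values) {
            //   (d)     sum += x;
            //   (e) }
            // One correct ordering is (b), (c), (d), (e), (a):
            static int sum(int[] values) {
                int sum = 0;
                for (int x : values) {
                    sum += x;
                }
                return sum;
            }
        }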
  5. The emphasis on conceptual learning and the development of adaptive instructional design are both emerging areas in science and engineering education. Instructors are writing their own conceptual questions to promote active learning during class and utilizing pools of these questions in assessments. For adaptive assessment strategies, these questions need to be rated by difficulty level (DL). Historically, DL has been determined from the performance of a suitable number of students. The research study reported here investigates whether instructors can save time by predicting the DL of newly written conceptual questions without the need for student data. In this paper, we report on the development of one component in an adaptive learning module for materials science, specifically on the topic of crystallography. The summative assessment element consists of five DL scales and 15 conceptual questions. This adaptive assessment directs students based on their previous performance and the DL of the questions. Our five expert participants are faculty members who have taught the introductory Materials Science course multiple times. They predicted how many students would answer each question correctly in a two-step process: first, individually and without an answer key; second, with the opportunity to revise their predictions after being given an answer key in a group discussion. We compared expert predictions with actual student performance using results from over 400 students spanning multiple courses and terms. We found no clear correlation between expert predictions of the DL and the DL measured from student performance. Some evidence shows that the discussion in the second step brought expert predictions closer to student performance. We suggest that, in determining the DL of conceptual questions, relying on predictions by experts who have taught the course is not a valid route. The findings in this paper can be applied to assessments in in-person, hybrid, and online settings and to subject matter beyond materials science.
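    The abstract does not state how the measured DL was computed. Under the common classical-test-theory convention, where an item's difficulty index is the proportion of students answering it correctly, one plausible operationalization (an assumption here, not the paper's stated method) is

        \mathrm{DL}(q) \;=\; 1 - \frac{n_{\mathrm{correct}}(q)}{N(q)},

    where N(q) is the number of students attempting question q, so a question that most students miss receives a DL near 1.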