- Award ID(s):
- 1955394
- NSF-PAR ID:
- 10309311
- Date Published:
- Journal Name:
- CHI Conference on Human Factors in Computing Systems (CHI '21)
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Deep Learning (DL) techniques are increasingly being incorporated into critical software systems, and DL software is buggy too. Recent work in software engineering (SE) has characterized these bugs, studied fix patterns, and proposed detection and localization strategies. In this work, we introduce a preventative measure. We propose design by contract for DL libraries, DL Contract for short, to document the properties of DL libraries and provide developers with a mechanism to identify bugs during development. While DL Contract builds on traditional design by contract techniques, it must address unique challenges. In particular, we need to document properties of the training process that are not visible at the functional interface of the DL libraries. To solve these problems, we have introduced mechanisms that allow developers to specify properties of the model architecture, data, and training process. We have designed and implemented DL Contract for Python-based DL libraries and used it to document the properties of Keras, a well-known DL library. We evaluate DL Contract in terms of effectiveness, runtime overhead, and usability. To evaluate its utility, we developed 15 sample contracts specifically for training problems and structural bugs. For effectiveness, DL Contract correctly detects 259 bugs in 272 real-world buggy programs drawn from four well-vetted benchmarks provided in prior work on DL bug detection and repair. We found that the runtime overhead of DL Contract is fairly minimal on these benchmarks. Lastly, to evaluate usability, we conducted a survey of twenty participants who used DL Contract to find and fix bugs. The results reveal that DL Contract can be very helpful to DL application developers when debugging their code.
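The general idea of attaching contracts to a training entry point can be illustrated with a minimal sketch in plain Python. This is not the DL Contract API, which the abstract does not show. The `requires` decorator, the `ContractViolation` exception, the dictionary-based model description, and the stand-in `fit` function are all hypothetical names introduced for illustration, and the two rules checked here are common rules of thumb rather than contracts taken from the paper. The sketch covers one data property and one architecture property; properties of the training process are not shown.

```python
from functools import wraps


class ContractViolation(Exception):
    """Raised when a precondition on a call is not satisfied."""


def requires(predicate, message):
    """Attach a precondition to a function (hypothetical helper, not the DL Contract API)."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Check the precondition before the wrapped function runs.
            if not predicate(*args, **kwargs):
                raise ContractViolation(message)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def data_is_normalized(model, features, labels):
    # Data property: every feature value lies in [0, 1].
    return all(0.0 <= value <= 1.0 for row in features for value in row)


def output_matches_loss(model, features, labels):
    # Architecture property (illustrative rule of thumb):
    # a categorical cross-entropy loss is paired with a softmax output layer.
    if model["loss"] == "categorical_crossentropy":
        return model["last_activation"] == "softmax"
    return True


@requires(data_is_normalized, "training data should be scaled to [0, 1]")
@requires(output_matches_loss, "categorical_crossentropy expects a softmax output layer")
def fit(model, features, labels):
    # Stand-in for a library training entry point; a real one would train the model.
    return "training started"


# Hypothetical model descriptions used only to exercise the contracts.
good_model = {"loss": "categorical_crossentropy", "last_activation": "softmax"}
bad_model = {"loss": "categorical_crossentropy", "last_activation": "relu"}

print(fit(good_model, [[0.1, 0.9]], [1]))

try:
    fit(bad_model, [[0.1, 0.9]], [1])
except ContractViolation as error:
    print("contract violated:", error)
```

Checking the conditions before training starts means a violation surfaces at the call site during development, rather than later as a model that trains poorly.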
-
Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code that supports symbolic, graph-based Deep Neural Network (DNN) computation. While scalable, such development tends to produce DL code that is error-prone, non-intuitive, and difficult to debug. Consequently, more natural, less error-prone imperative DL frameworks encouraging eager execution have emerged, but at the expense of run-time performance. While hybrid approaches aim for the "best of both worlds," the challenges in applying them in the real world are largely unknown. We conduct a data-driven analysis of the challenges (and resultant bugs) involved in writing reliable yet performant imperative DL code by studying 250 open-source projects, consisting of 19.7 MLOC, along with 470 and 446 manually examined code patches and bug reports, respectively. The results indicate that hybridization: (i) is prone to API misuse, (ii) can result in performance degradation, the opposite of its intention, and (iii) has limited application due to execution mode incompatibility. We put forth several recommendations, best practices, and anti-patterns for effectively hybridizing imperative DL code, potentially benefiting DL practitioners, API designers, tool developers, and educators.
-
Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code that supports symbolic, graph-based Deep Neural Network (DNN) computation. While scalable, such development tends to produce DL code that is error-prone, non-intuitive, and difficult to debug. Consequently, more natural, less error-prone imperative DL frameworks encouraging eager execution have emerged, but at the expense of run-time performance. While hybrid approaches aim for the "best of both worlds," the challenges in applying them in the real world are largely unknown. We conduct a data-driven analysis of the challenges (and resultant bugs) involved in writing reliable yet performant imperative DL code by studying 250 open-source projects, consisting of 19.7 MLOC, along with 470 and 446 manually examined code patches and bug reports, respectively. The results indicate that hybridization: (i) is prone to API misuse, (ii) can result in performance degradation, the opposite of its intention, and (iii) has limited application due to execution mode incompatibility. We put forth several recommendations, best practices, and anti-patterns for effectively hybridizing imperative DL code, potentially benefiting DL practitioners, API designers, tool developers, and educators.
Other: Support for this project was provided by PSC-CUNY Award #638010051, jointly funded by The Professional Staff Congress and The City University of New York.
-
There is an increasing need for knowledgeable K-12 computer science (CS) teachers. Teachers need to know how to debug programs and how to help their students debug them. Research has shown that debugging is difficult for novices because the process requires different skills from creating programs, and that instructing students how to debug can help them acquire these skills. To this end, we developed a CS professional development program for middle grade teachers (grades 5-8, ages 10-13) that includes lessons on debugging. The teachers completed debugging activities that involved finding bugs in Scratch programs and explaining how they would help their students debug. We qualitatively analyzed their responses and found that teachers successfully identified the problem but struggled to locate it in the code. In considering how they would help students who had such a bug, the teachers often focused on helping the student find a solution for the bug rather than on identifying the problem or its source. Finally, teachers’ ability to identify bugs and the pedagogical strategies they used to engage students in this process differed based on CS teaching experience and prior CS knowledge. This work contributes to our understanding of teachers’ debugging abilities and advances our knowledge of how to support teachers in teaching their students how to debug their programs.
-
Much attention has focused on designing tools and activities that support learners in designing fully finished and functional applications such as games, robots, or e-textiles to be shared with others. But helping students learn to debug their applications often takes on a surprisingly more instructionist stance by giving them checklists, teaching them strategies, or providing them with test programs. The idea of designing bugs for learning, or debugging by design, makes learners again agents of their own learning and, more importantly, of making and solving mistakes. In this paper, we report on our first implementation of “debugging by design” activities in a classroom of 25 high school students over a period of eight hours as part of a longer e-textiles unit. Here, students were asked to craft buggy circuits and code for their peers to solve. We introduce the design of the debugging by design unit and, drawing on observations and interviews with students and the teacher, address the following research questions: (1) What did students gain from designing and solving bugs for others? (2) How did this experience shape students’ completion of the e-textiles unit? In the discussion, we address how debugging by design contributes to students’ learning of debugging skills.