

Title: The Human Factors Impact of Programming Error Messages
The impacts of many human factors on how people program are poorly understood and present significant challenges for work on improving programmer productivity and effective techniques for teaching and learning programming. Programming error messages are one factor that is particularly problematic, with a documented history of evidence dating back over 50 years. Such messages, commonly called compiler error messages, present difficulties for programmers with diverse demographic backgrounds. It is generally agreed that these messages could be more effective for all users, making this an obvious and high-impact area to target for improving programming outcomes. This report documents the program and the outputs of Dagstuhl Seminar 22052, “The Human Factors Impact of Programming Error Messages”, which explores this problem. In total, 11 on-site participants and 17 remote participants engaged in intensive collaboration during the seminar, including discussing past and current research, identifying gaps, and developing ways to move forward collaboratively to address these challenges.
Award ID(s):
2121993
NSF-PAR ID:
10446714
Author(s) / Creator(s):
Date Published:
Journal Name:
Dagstuhl reports
Volume:
12
Issue:
1
ISSN:
2192-5283
Page Range / eLocation ID:
119-130
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. The report documents the program and outcomes of Dagstuhl Seminar 18061 "Evidence About Programmers for Programming Language Design". The seminar brought together a diverse group of researchers from the fields of computer science education, programming languages, software engineering, human-computer interaction, and data science. At the seminar, participants discussed methods for designing and evaluating programming languages that take the needs of programmers directly into account. The seminar included foundational talks to introduce the breadth of perspectives that were represented among the participants; then, groups formed to develop research agendas for several subtopics, including novice programmers, cognitive load, language features, and love of programming languages. The seminar concluded with a discussion of the current SIGPLAN artifact evaluation mechanism and the need for evidence standards in empirical studies of programming languages. 
  2. Type inference is an important part of functional programming languages and has been increasingly adopted in imperative programming. However, providing effective error messages in response to type inference failures (due to type errors in programs) continues to be a challenge. Type error messages generated by compilers and existing error debugging approaches often point to bogus error locations or lack sufficient information for removing the type error, making error debugging ineffective. Counter-factual typing (CFT) addressed this problem by generating comprehensive error messages, each of which includes a rich set of information. However, CFT has a large response time, making it too slow for interactive use. In particular, our recent study shows that programmers usually have to go through multiple iterations of updating and recompiling programs to remove a type error. Interestingly, our study also reveals that program updates are minor in each iteration during type error debugging. We exploit this fact and develop eCFT, an efficient version of CFT, which doesn't recompute all error fixes from scratch for each updated program but only recomputes error fixes that are changed in response to the update. Our key observation is that minor program changes lead to minor error suggestion changes. eCFT is based on principal typing, a typing scheme more amenable to reusing previous typing results. We have evaluated our approach and found it is about 12.4× faster than CFT in updating error fixes.
  3. Compilers primarily give feedback about problems to developers through the use of error messages. Unfortunately, developers routinely find these messages to be confusing and unhelpful. In this paper, we postulate that because error messages present poor explanations, theories of explanation, such as Toulmin's model of argument, can be applied to improve their quality. To understand how compilers should present explanations to developers, we conducted a comparative evaluation with 68 professional software developers and an empirical study of compiler error messages found in Stack Overflow questions across seven different programming languages. Our findings suggest that, given a pair of error messages, developers significantly prefer the error message that employs proper argument structure over a deficient argument structure when neither offers a resolution, but will accept a deficient argument structure if it provides a resolution to the problem. Human-authored explanations on Stack Overflow converge to one of the three argument structures: those that provide a resolution to the error, simple arguments, and extended arguments that provide additional evidence for the problem. Finally, we contribute three practical design principles to inform the design and evaluation of compiler error messages.
  4. Training deep neural networks can generate non-descriptive error messages or produce unusual output without any explicit errors at all. While experts rely on tacit knowledge to apply debugging strategies, non-experts lack the experience required to interpret model output and correct Deep Learning (DL) programs. In this work, we identify DL debugging heuristics and strategies used by experts, categorize the types of errors novices run into when writing ML code, and map them onto opportunities where tools could help. We use them to guide the design of Umlaut. Umlaut checks DL program structure and model behavior against these heuristics; provides human-readable error messages to users; and annotates erroneous model output to facilitate error correction. Umlaut links code, model output, and tutorial-driven error messages in a single interface. We evaluated Umlaut in a study with 15 participants to determine its effectiveness in helping developers find and fix errors in their DL programs. Participants using Umlaut found and fixed significantly more bugs and were able to implement fixes for more bugs compared to a baseline condition.
  5. This report documents the program and the outcomes of Dagstuhl Seminar "EU Cyber Resilience Act: Socio-Technical and Research Challenges" (24112). This timely seminar brought together experts in computer science, tech policy, and economics, as well as industry stakeholders, national agencies, and regulators to identify new research challenges posed by the EU Cyber Resilience Act (CRA), a new EU regulation that aims to set essential cybersecurity requirements for digital products to be permissible in the EU market. The seminar focused on analyzing the proposed text and standards for identifying obstacles in standardization, developer practices, user awareness, and software analysis methods for easing adoption, certification, and enforcement. Seminar participants noted the complexity of designing meaningful cybersecurity regulations and of aligning regulatory requirements with technological advancements, market trends, and vendor incentives, referencing past challenges with GDPR and COPPA adoption and compliance. The seminar also emphasized the importance of regulators, marketplaces, and both mobile and IoT platforms in eliminating malicious and deceptive actors from the market, and promoting transparent security practices from vendors and their software supply chain. The seminar showed the need for multi-disciplinary and collaborative efforts to support the CRA’s successful implementation and enhance cybersecurity across the EU. 