skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: The Human Factors Impact of Programming Error Messages
The impacts of many human factors on how people program are poorly understood and present significant challenges for work on improving programmer productivity and effective techniques for teaching and learning programming. Programming error messages are one factor that is particularly problematic, with a documented history of evidence dating back over 50 years. Such messages, commonly called compiler error messages, present difficulties for programmers with diverse demographic backgrounds. It is generally agreed that these messages could be more effective for all users, making this an obvious and high-impact area to target for improving programming outcomes. This report documents the program and the outputs of Dagstuhl Seminar 22052, “The Human Factors Impact of Programming Error Messages”, which explores this problem. In total, 11 on-site participants and 17 remote participants engaged in intensive collaboration during the seminar, including discussing past and current research, identifying gaps, and developing ways to move forward collaboratively to address these challenges.  more » « less
Award ID(s):
2121993
PAR ID:
10446714
Author(s) / Creator(s):
Date Published:
Journal Name:
Dagstuhl reports
Volume:
12
Issue:
1
ISSN:
2192-5283
Page Range / eLocation ID:
119-130
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The report documents the program and outcomes of Dagstuhl Seminar 18061 "Evidence About Programmers for Programming Language Design". The seminar brought together a diverse group of researchers from the fields of computer science education, programming languages, software engineering, human-computer interaction, and data science. At the seminar, participants discussed methods for designing and evaluating programming languages that take the needs of programmers directly into account. The seminar included foundational talks to introduce the breadth of perspectives that were represented among the participants; then, groups formed to develop research agendas for several subtopics, including novice programmers, cognitive load, language features, and love of programming languages. The seminar concluded with a discussion of the current SIGPLAN artifact evaluation mechanism and the need for evidence standards in empirical studies of programming languages. 
    more » « less
  2. Type inference is an important part of functional programming languages and has been increasingly adopted to imperative programming. However, providing effective error messages in response to type inference failures (due to type errors in programs) continues to be a challenge. Type error messages generated by compilers and existing error debugging approaches often point to bogus error locations or lack sufficient information for removing the type error, making error debugging ineffective. Counter-factual typing (CFT) addressed this problem by generating comprehensive error messages with each message includes a rich set of information. However, CFT has a large response time, making it too slow for interactive use. In particular, our recent study shows that programmers usually have to go through multiple iterations of updating and recompiling programs to remove a type error. Interestingly, our study also reveals that program updates are minor in each iteration during type error debugging. We exploit this fact and develop eCFT, an efficient version of CFT, which doesn't recompute all error fixes from scratch for each updated program but only recomputes error fixes that are changed in response to the update. Our key observation is that minor program changes lead to minor error suggestion changes. eCFT is based on principal typing, a typing scheme more amenable to reuse previous typing results. We have evaluated our approach and found it is about 12.4× faster than CFT in updating error fixes. 
    more » « less
  3. This report documents the program and the outcomes of Dagstuhl Seminar 23031 Frontiers of Information Access Experimentation for Research and Education, which brought together 38 participants from 12 countries. The seminar addressed technology-enhanced information access (information retrieval, recommender systems, natural language processing) and specifically focused on developing more responsible experimental practices leading to more valid results, both for research as well as for scientific education. The seminar featured a series of long and short talks delivered by participants, who helped in setting a common ground and in letting emerge topics of interest to be explored as the main output of the seminar. This led to the definition of five groups which investigated challenges, opportunities, and next steps in the following areas:reality check, i.e. conducting real-world studies, human-machine-collaborative relevance judgment frameworks, overcoming methodological challenges in information retrieval and recommender systems through awareness and education, results-blind reviewing, and guidance for authors. Date:15--20 January 2023. Website:https://www.dagstuhl.de/23031. 
    more » « less
  4. Compilers primarily give feedback about problems to developers through the use of error messages. Unfortunately, developers routinely find these messages to be confusing and unhelpful. In this paper, we postulate that because error messages present poor explanations, theories of explanation---such as Toulmin's model of argument---can be applied to improve their quality. To understand how compilers should present explanations to developers, we conducted a comparative evaluation with 68 professional software developers and an empirical study of compiler error messages found in Stack Overflow questions across seven different programming languages. Our findings suggest that, given a pair of error messages, developers significantly prefer the error message that employs proper argument structure over a deficient argument structure when neither offers a resolution---but will accept a deficient argument structure if it provides a resolution to the problem. Human-authored explanations on Stack Overflow converge to one of the three argument structures: those that provide a resolution to the error, simple arguments, and extended arguments that provide additional evidence for the problem. Finally, we contribute three practical design principles to inform the design and evaluation of compiler error messages. 
    more » « less
  5. Training deep neural networks can generate non-descriptive error messages or produce unusual output without any explicit errors at all. While experts rely on tacit knowledge to apply debugging strategies, non-experts lack the experience required to interpret model output and correct Deep Learning (DL) programs. In this work, we identify DL debugging heuristics and strategies used by experts, andIn this work, we categorize the types of errors novices run into when writing ML code, and map them onto opportunities where tools could help. We use them to guide the design of Umlaut. Umlaut checks DL program structure and model behavior against these heuristics; provides human-readable error messages to users; and annotates erroneous model output to facilitate error correction. Umlaut links code, model output, and tutorial-driven error messages in a single interface. We evaluated Umlaut in a study with 15 participants to determine its effectiveness in helping developers find and fix errors in their DL programs. Participants using Umlaut found and fixed significantly more bugs and were able to implement fixes for more bugs compared to a baseline condition. 
    more » « less