It has been well documented that a large portion of the cost of any software lies in the time spent by developers in understanding a program’s source code before any changes can be undertaken. One of the main contributors to software comprehension, by subsequent developers or by the authors themselves, has to do with the quality of the lexicon, (i.e., the identifiers and comments) that is used by developers to embed domain concepts and to communicate with their teammates. In fact, previous research shows that there is a positive correlation between the quality of identifiers and the quality of a software project. Results suggest that poor quality lexicon impairs program comprehension and consequently increases the effort that developers must spend to maintain the software. However, we do not yet know or have any empirical evidence, of the relationship between the quality of the lexicon and the cognitive load that developers experience when trying to understand a piece of software. Given the associated costs, there is a critical need to empirically characterize the impact of the quality of the lexicon on developers’ ability to comprehend a program. In this study, we explore the effect of poor source code lexicon and readability on developers’ cognitive load as measured by a cutting-edge and minimally invasive functional brain imaging technique called functional Near Infrared Spectroscopy (fNIRS). Additionally, while developers perform software comprehension tasks, we map cognitive load data to source code identifiers using an eye tracking device. Our results show that the presence of linguistic antipatterns in source code significantly increases the developers’ cognitive load.
more »
« less
Moving towards objective measures of program comprehension
Traditionally, program comprehension research relies heavily on indirect measures of comprehension, where subjects report on their own comprehension levels or summarize part of an artifact so that researchers can instead deduce the level of comprehension. However, there are several potential issues that can result from using these indirect measures because they are prone to participant biases and implicitly deduce comprehension based on various factors. The proposed research presents a framework to move towards more objective measures of program comprehension through the use of brain imaging and eye tracking technology. We aim to shed light on how the human brain processes comprehension tasks, specifically what aspects of the source code cause measurable increases in the cognitive load of developers in both bug localization tasks, as well as code reviews. We discuss the proposed methodology, preliminary results, and overall contributions of the work.
more »
« less
- Award ID(s):
- 1755995
- PAR ID:
- 10090362
- Date Published:
- Journal Name:
- Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)
- Page Range / eLocation ID:
- 936 to 939
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
A large portion of the cost of any software lies in the time spent by developers in understanding a program’s source code before any changes can be undertaken. Measuring program comprehension is not a trivial task. In fact, different studies use self-reported and various psycho-physiological measures as proxies. In this research, we propose a methodology using functional Near Infrared Spectroscopy (fNIRS) and eye tracking devices as an objective measure of program comprehension that allows researchers to conduct studies in environments close to real world settings, at identifier level of granularity. We validate our methodology and apply it to study the impact of lexical, structural, and readability issues on developers’ cognitive load during bug localization tasks. Our study involves 25 undergraduate and graduate students and 21 metrics. Results show that the existence of lexical inconsistencies in the source code significantly increases the cognitive load experienced by participants not only on identifiers involved in the inconsistencies but also throughout the entire code snippet. We did not find statistical evidence that structural inconsistencies increase the average cognitive load that participants experience, however, both types of inconsistencies result in lower performance in terms of time and success rate. Finally, we observe that self-reported task difficulty, cognitive load, and fixation duration do not correlate and appear to be measuring different aspects of task difficulty.more » « less
-
Program comprehension is an important, but hard to measure cognitive process. This makes it difficult to provide suitable programming languages, tools, or coding conventions to support developers in their everyday work. Here, we explore whether functional magnetic resonance imaging (fMRI) is feasible for soundly measuring program comprehension. To this end, we observed 17 participants inside an fMRI scanner while they were comprehending source code. The results show a clear, distinct activation of five brain regions, which are related to working memory, attention, and language processing, which all fit well to our understanding of program comprehension. Furthermore, we found reduced activity in the default mode network, indicating the cognitive effort necessary for program comprehension. We also observed that familiarity with Java as underlying programming language reduced cognitive effort during program comprehension. To gain confidence in the results and the method, we replicated the study with 11 new participants and largely confirmed our findings. Our results encourage us and, hopefully, others to use fMRI to observe programmers and, in the long run, answer questions, such as: How should we train programmers? Can we train someone to become an excellent programmer? How effective are new languages and tools for program comprehension?more » « less
-
Code summarization is the task of creating short, natural language descriptions of source code. It is an important part of code comprehension and a powerful method of documentation. Previous work has made progress in identifying where programmers focus in code as they write their own summaries (i.e., Writing). However, there is currently a gap in studying programmers’ attention as they read code with pre-written summaries (i.e., Reading). As a result, it is currently unknown how these two forms of code comprehension compare: Reading and Writing. Also, there is a limited understanding of programmer attention with respect to program semantics. We address these shortcomings with a human eye-tracking study (n= 27) comparing Reading and Writing. We examined programmers’ attention with respect to fine-grained program semantics, including their attention sequences (i.e., scan paths). We find distinctions in programmer attention across the comprehension tasks, similarities in reading patterns between them, and differences mediated by demographic factors. This can help guide code comprehension in both computer science education and automated code summarization. Furthermore, we mapped programmers’ gaze data onto the Abstract Syntax Tree to explore another representation of human attention. We find that visual behavior on this structure is not always consistent with that on source code.more » « less
-
After decades of research, there is still no comprehensive, validated model of program comprehension. Recently, researchers have been applying psycho-physiological measures to expand our understanding of program comprehension. In this position paper, we argue that measuring program comprehension simultaneously with functional magnetic resonance imaging (fMRI) and eye tracking is promising. However, due to the different nature of both measures in terms of response delay and temporal resolution, we need to develop suitable tools. We describe the challenges of conjoint analysis of fMRI and eye-tracking data, and we also outline possible solution strategies.more » « less
An official website of the United States government

