Platforms like Stack Overflow and GitHub's gist system promote the sharing of ideas and programming techniques via the distribution of code snippets designed to illustrate particular tasks. Python, a popular and fast-growing programming language, sees heavy use on both sites, with nearly one million questions asked on Stack Overflow and 400 thousand public gists on GitHub. Unfortunately, around 75% of the Python example code shared through these sites cannot be directly executed. When run in a clean environment, over 50% of public Python gists fail due to an import error for a missing library.
We present DockerizeMe, a technique for inferring the dependencies needed to execute a Python code snippet without import error. DockerizeMe starts with offline knowledge acquisition of the resources and dependencies for popular Python packages from the Python Package Index (PyPI). It then builds Docker specifications using a graph-based inference procedure. Our inference procedure resolves import errors in 892 out of nearly 3,000 gists from the Gistable dataset for which Gistable's baseline approach could not find and install all dependencies.
more »
« less
Gistable: Evaluating the Executability of Python Code Snippets on GitHub
Software developers create and share code online to demonstrate programming language concepts and programming tasks. Code snippets can be a useful way to explain and demonstrate a programming concept, but may not always be directly executable. A code snippet can contain parse errors, or fail to execute if the environment contains unmet dependencies. This paper presents an empirical analysis of the executable status of Python code snippets shared through the GitHub gist system, and the ability of developers familiar with software configuration to correctly configure and run them. We find that 75.6% of gists require non-trivial configuration to overcome missing dependencies, configuration files, reliance on a specific operating system, or some other environment configuration. Our study also suggests the natural assumption developers make about resource names when resolving configuration errors is correct less than half the time. We also present Gistable, a database and extensible framework built on GitHub's gist system, which provides executable code snippets to enable reproducible studies in software engineering. Gistable contains 10,259 code snippets, approximately 5,000 with a Dockerfile to configure and execute them without import error. Gistable is publicly available at https://github.com/gistable/gistable.
more »
« less
- Award ID(s):
- 1814798
- PAR ID:
- 10084518
- Date Published:
- Journal Name:
- IEEE International Conference on Software Maintenance and Evolution (ICSME)
- Page Range / eLocation ID:
- 217 to 227
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
The typical software tutorial includes step-by-step instructions for installing developer tools, editing files and code, and running commands. When these software tutorials are not executable, either due to missing instructions, ambiguous steps, or simply broken commands, their value is diminished. Non-executable tutorials impact developers in several ways, including frustrating learning experiences, and limiting usability of developer tools. To understand to what extent software tutorials are executable---and why they may fail---we conduct an empirical study on over 600 tutorials, including nearly 15,000 code blocks. We find a naive execution strategy achieves an overall executability rate of only 26%. Even a human-annotation-based execution strategy---while doubling executability---still yields no tutorial that can successfully execute all steps. We identify several common executability barriers, ranging from potentially innocuous causes, such as interactive prompts requiring human responses, to insidious errors, such as missing steps and inaccessible resources. We validate our findings with major stakeholders in technical documentation and discuss possible strategies for improving software tutorials, such as providing accessible alternatives for tutorial takers, and investing in automated tutorial testing to ensure continuous quality of software tutorials.more » « less
-
Code snippets are prevalent, but are hard to reuse because they often lack an accompanying environment configuration. Most are not actively maintained, allowing for drift between the most recent possible configuration and the code snippet as the snippet becomes out-of-date over time. Recent work has identified the problem of validating and detecting out-of-date code snippets as the most important consideration for code reuse. However, determining if a snippet is correct, but simply out-of-date, is a non-trivial task. In the best case, breaking changes are well documented, allowing developers to manually determine when a code snippet contains an out-of-date API usage. In the worst case, determining if and when a breaking change was made requires an exhaustive search through previous dependency versions. We present V2, a strategy for determining if a code snippet is out-of-date by detecting discrete instances of configuration drift, where the snippet uses an API which has since undergone a breaking change. Each instance of configuration drift is classified by a failure encountered during validation and a configuration patch, consisting of dependency version changes, which fixes the underlying fault. V2 uses feedback-directed search to explore the possible configuration space for a code snippet, reducing the number of potential environment configurations that need to be validated. When run on a corpus of public Python snippets from prior research, V2 identifies 248 instances of configuration drift.more » « less
-
Data sets have grown exponentially in size, rapidly surpassing the scale at which traditional exploratory data analysis (EDA) tools can be used effectively to analyze real-world graphs. This led to the development of Arachne, a user-friendly tool enabling interactive graph analysis at terabyte scales while using familiar Python code and utilizing a high-performance back-end powered by Chapel that can be run on nearly any *nix-like system. Various disciplines, including biological, information, and social sciences, use large-scale graphs to represent the flow of information through a cell, connections between neurons, interactions between computers, relationships between individuals, etc. To take advantage of Arachne, however, a new user has to go through a long and convoluted installation process, which often takes a week or more to complete, even with assistance from the developers. To support Arachne’s mission of being an easy-to-use exploratory graph analytics tool that increases accessibility to high performance computing (HPC) resources, a better deployment experience was needed for users and developers. In this paper, we propose a tool specially designed to greatly simplify the deployment of Arachne for users and offer the ability to rapidly and automatically test the software for compatibility with new releases of its dependencies. The highly portable nature of Arachne necessitates that this deployment tool be able to install and configure the software in diverse combinations of hardware, operating system, initial system environment, and the evolving packages and libraries in Arachne. The tool was tested in both virtual and real-world environments, where its success was evaluated by an improvement to efficiency and productivity by both users and developers. Current results show that the installation and configuration process was greatly improved, with a significant reduction in the time and effort spent by both users and developers.more » « less
-
Continuous Integration (CI) allows developers to check whether their code can build successfully and pass tests across various system environments with every commit. To use a CI platform, a developer must provide configuration files within a code repository to specify build conditions. Incorrect configuration settings lead to CI build failures, which can take hours to run, wasting valuable developer time and delaying product release dates. Debugging CI configurations is a slow and error-prone process. The only way to check the correctness of CI configurations is to push a commit and wait for the build result. We present VeriCI, the first system for localizing CI configuration errors at the code level. VeriCI runs as a static analysis tool, before the developer sends the build request to the CI server. Our key insight is that the commit history and the corresponding build histories available in CI environments can be used both for build error prediction and build error localization. We leverage the build history as a labeled dataset to automatically derive customized rules describing correct CI configurations, using supervised machine learning techniques. To more accurately identify root causes, we train a neural network that filters out constraints that are less likely to be connected to the root cause of build failure. We evaluate VeriCI on real world data from GitHub and achieve 91% accuracy of predicting a build failure and correctly identify the root cause in 75% of cases. We also conducted a between-subjects user study with 20 software developers, showing that VeriCI significantly helps users in identifying and fixing errors in CI.more » « less