Title: A Theory of Robust API Knowledge
Creating modern software inevitably requires using application programming interfaces (APIs). While software developers can sometimes use APIs by simply copying and pasting code examples, a lack of robust knowledge of how an API works can lead to defects, complicate software maintenance, and limit what someone can express with an API. Prior work has uncovered the many ways that API documentation fails to be helpful, but rarely explains precisely why. We present a theory of robust API knowledge that attempts to explain why, arguing that effective understanding and use of APIs depends on three components of knowledge: (1) the domain concepts the API models along with terminology, (2) the usage patterns of APIs along with rationale, and (3) facts about an API’s execution to support reasoning about its runtime behavior. We derive five hypotheses from this theory and present a study to test them. Our study investigated the effect of having access to these components of knowledge, finding that while learners requested these three components of knowledge when they were not available, whether the knowledge helped the learner use or understand the API depended on the tasks and likely the relevance and quality of the specific information provided. The theory and our evidence in support of its claims have implications for what content API documentation, tutorials, and instruction should contain and the importance of giving the right information at the right time, as well as what information API tools should compute, and even how APIs should be designed. Future work is necessary both to further test and refine the theory and to exploit its ideas for better instructional design.
Award ID(s): 1703304
PAR ID: 10287566
Author(s) / Creator(s): ; ;
Date Published:
Journal Name: ACM Transactions on Computing Education
Volume: 21
Issue: 1
ISSN: 1946-6226
Page Range / eLocation ID: 1 to 32
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Modern software development requires developers to find and effectively utilize new APIs and their documentation, but documentation has many well-known issues. Despite this, developers eventually overcome these issues but have no way of sharing what they learned. We investigate sharing this documentation-specific information through annotations, which have advantages over developer forums as the information is contextualized, not disruptive, and is short, thus easy to author. Developers can also author annotations to support their own comprehension. In order to support the documentation usage behaviors we found, we built the Adamite annotation tool, which provides features such as multiple anchors, annotation types, and pinning. In our user study, we found that developers are able to create annotations that are useful to themselves and are able to utilize annotations created by other developers when learning a new API, with readers of the annotations completing 67% more of the task, on average, than the baseline. 
  2. Online tutorials are a valuable source of community-created information used by numerous developers to learn new APIs and techniques. Once written, tutorials are rarely actively curated and can become dated over time. Tutorials often reference APIs that change rapidly, and deprecated classes, methods and fields can render tutorials inapplicable to newer releases of the API. Newer tutorials may not be compatible with older APIs that are still in use. In this paper, we first empirically study the tutorial versioning problem, confirming its presence in popular tutorials on the Web. We subsequently propose a technique, based on similar techniques in the literature, for automatically detecting the applicable API version ranges of tutorials, given access to the official API documentation they reference. The proposed technique identifies each API mention in a tutorial and maps the mention to the corresponding API element in the official documentation. The version of the tutorial is determined by combining the version ranges of all of the constituent API mentions. Our technique’s precision varies from 61% to 89% and recall varies from 42% to 84% based on different levels of granularity of API mentions and different problem constraints. We observe API methods are the most challenging to accurately disambiguate due to method overloading. As the API mentions in tutorials are often redundant, and each mention of a specific API element commonly occurs several times in a tutorial, the distance of the predicted version range from the true version range is low; 3.61 on average for the tutorials in our sample. 
  4. APIs are becoming the fundamental building block of modern software and their usability is crucial to programming efficiency and software quality. Yet API designers find it hard to gather and interpret user feedback on their APIs. To close the gap, we interviewed 23 API designers from 6 companies and 11 open-source projects to understand their practices and needs. The primary way of gathering user feedback is through bug reports and peer reviews, as formal usability testing is prohibitively expensive to conduct in practice. Participants expressed a strong desire to gather real-world use cases and understand users' mental models, but there was a lack of tool support for such needs. In particular, participants were curious about where users got stuck, their workarounds, common mistakes, and unanticipated corner cases. We highlight several opportunities to address those unmet needs, including developing new mechanisms that systematically elicit users' mental models, building mining frameworks that identify recurring patterns beyond shallow statistics about API usage, and exploring alternative design choices made in similar libraries. 
  5. With the rise of software-as-a-service and microservice architectures, RESTful APIs are now ubiquitous in mobile and web applications. A service can have tens or hundreds of API methods, making it a challenge for programmers to find the right combination of methods to solve their task. We present APIphany, a component-based synthesizer for programs that compose calls to RESTful APIs. The main innovation behind APIphany is the use of precise semantic types, both to specify user intent and to direct the search. APIphany contributes three novel mechanisms to overcome challenges in adapting component-based synthesis to the REST domain: (1) a type inference algorithm for augmenting REST specifications with semantic types; (2) an efficient synthesis technique for “wrangling” semi-structured data, which is commonly required in working with RESTful APIs; and (3) a new form of simulated execution to avoid executing API calls during synthesis. We evaluate APIphany on three real-world APIs and 32 tasks extracted from GitHub repositories and StackOverflow. In our experiments, APIphany found correct solutions to 29 tasks, with 23 of them reported among top ten synthesis results. 