This content will become publicly available on April 12, 2025
- PAR ID:
- 10543737
- Publisher / Repository:
- Association for Computing Machinery
- Date Published:
- ISBN:
- 9798400702174
- Subject(s) / Keyword(s):
- syntactic sugars, data-driven language design, subgraph mining
- Format(s):
- Medium: X
- Location:
- ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, Lisbon, Portugal
- Sponsoring Org:
- National Science Foundation
More Like this
-
With Kotlin becoming a viable language replacement for Java, there is a need for translators and data flow analysis libraries to create maintainable and readable source code. Instagram, Uber, and Gradle are only a few of the large corporations that have either switched from Java to Kotlin completely or started to use it in internal tools in order to reduce code base size. Developers have claimed that Kotlin is fun to use in comparison to Java and much of the boilerplate code is reduced. With Java being the main language for the open source organization, PhenoApps, there is a need to support both Java and Kotlin to increase the maintainability of the code. Fortunately, JetBrains has an open-source IDE plugin for translating Java to Kotlin; however, the translation has some fundamental issues which shall be discussed further in this paper. Introducing, j2k, a CLI translation tool which includes various anti-pattern detection for syntactical formatting, performance, and other Android requirements. The new tool introduced within this paper, j2kCLI allows users to directly translate strings of Java code to Kotlin, or entire directories. This facilitates the maintainability of a large open source code base.more » « less
-
null (Ed.)Polyglot programming, the use of multiple programming languages during the development process, is common practice in modern software development. This study investigates this practice through a randomized controlled trial conducted under the context of database programming. Participants in the study were given coding tasks written in Java and one of three SQL-like embedded languages. One was plain SQL in strings, one was in Java only, and the third was a hybrid embedded language that was closer to the host language. We recorded 109 valid data points. Results showed significant differences in how developers of different experience levels code using polyglot techniques. Notably, less experienced programmers wrote correct programs faster in the hybrid condition (frequent, but less severe, switches), while more experienced developers that already knew both languages performed better in traditional SQL (less frequent, but more complete, switches). The results indicate that the productivity impact of polyglot programming is complex and experience level dependent.more » « less
-
Abstract Source code is a form of human communication, albeit one where the information shared between the programmers reading and writing the code is constrained by the requirement that the code executes correctly. Programming languages are more syntactically constrained than natural languages, but they are also very expressive, allowing a great many different ways to express even very simple computations. Still, code written by developers is highly predictable, and many programming tools have taken advantage of this phenomenon, relying on language model
surprisal as a guiding mechanism. While surprisal has been validated as a measure of cognitive load in natural language, its relation to human cognitive processes in code is still poorly understood. In this paper, we explore the relationship between surprisal and programmer preference at a small granularity—do programmers prefer more predictable expressions in code? Usingmeaning‐preserving transformations , we produce equivalent alternatives to developer‐written code expressions and run a corpus study on Java and Python projects. In general, language models rate the code expressions developerschoose to write as more predictable than these transformed alternatives. Then, we perform two human subject studies asking participants to choose between two equivalent snippets of Java code with different surprisal scores (one original and transformed). We find that programmersdo prefer more predictable variants, and that stronger language models like the transformer align more often and more consistently with these preferences. -
Worked examples (solutions to typical programming problems presented as a source code in a certain language and are used to explain the topics from a programming class) are among the most popular types of learning content in programming classes. Most approaches and tools for presenting these examples to students are based on line-by-line explanations of the example code. However, instructors rarely have time to provide line-by-line explanations for a large number of examples typically used in a programming class. In this paper, we explore and assess a human-AI collaboration approach to authoring worked examples for Java programming. We introduce an authoring system for creating Java worked examples that generates a starting version of code explanations and presents it to the instructor to edit if necessary. We also present a study that assesses the quality of explanations created with this approach.more » « less
-
Comments are an integral part of software development; they are natural language descriptions associated with source code elements. Understanding explicit associations can be useful in improving code comprehensibility and maintaining the consistency between code and comments. As an initial step towards this larger goal, we address the task of associating entities in Javadoc comments with elements in Java source code. We propose an approach for automatically extracting supervised data using revision histories of open source projects and present a manually annotated evaluation dataset for this task. We develop a binary classifier and a sequence labeling model by crafting a rich feature set which encompasses various aspects of code, comments, and the relationships between them. Experiments show that our systems outperform several baselines learning from the proposed supervision.more » « less